DPDK patches and discussions
* [PATCH] usertools: rewrite pmdinfo
@ 2022-09-13 10:58 Robin Jarry
From: Robin Jarry @ 2022-09-13 10:58 UTC
  To: dev; +Cc: Robin Jarry, Olivier Matz

dpdk-pmdinfo.py does not produce any parseable output. The -r/--raw flag
merely prints multiple independent JSON lines which cannot be fed
directly to any JSON parser. Moreover, the script complexity is rather
high for such a simple task: extracting PMD_INFO_STRING from .rodata ELF
sections. Rewrite it so that it can produce valid JSON.
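
To illustrate the problem with the legacy output, here is a minimal Python
sketch. The two input lines are made-up stand-ins for what -r/--raw printed
(driver names and fields are illustrative assumptions, not real output), and
parse_whole/parse_lines are hypothetical helper names:

```python
import json

# Hypothetical stand-in for the legacy -r/--raw output: one JSON object per
# line. The driver names and fields below are illustrative only.
raw_output = (
    '{"name": "drv_a", "pci_ids": []}\n'
    '{"name": "drv_b", "pci_ids": []}\n'
)


def parse_whole(text):
    """Try to parse the whole output as a single JSON document."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # A second top-level value is rejected by a strict JSON parser.
        return None


def parse_lines(text):
    """Workaround: parse every non-empty line on its own."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


print(parse_whole(raw_output))
print([d["name"] for d in parse_lines(raw_output)])
```

A consumer of the old format had to know about the line-by-line workaround;
a single JSON array avoids that.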

Remove the PCI database parsing for PCI-ID to Vendor-Device names
conversion. This should be done by external scripts (if really needed).

Here are some examples of use with jq:

Get the complete info for a given driver:

 ~$ usertools/dpdk-pmdinfo.py build/app/dpdk-testpmd | \
   jq '.[] | select(.name == "dmadev_idxd_pci")'
 {
   "name": "dmadev_idxd_pci",
   "params": "max_queues=0",
   "kmod": "vfio-pci",
   "devices": [
     {
       "vendor_id": "8086",
       "device_id": "0b25",
       "subsystem_device_id": "ffff",
       "subsystem_system_id": "ffff"
     }
   ]
 }

Get only the required kernel modules for a given driver:

 ~$ usertools/dpdk-pmdinfo.py build/app/dpdk-testpmd | \
   jq '.[] | select(.name == "net_i40e").kmod'
 "* igb_uio | uio_pci_generic | vfio-pci"

Get only the required kernel modules for a given device:

 ~$ usertools/dpdk-pmdinfo.py build/app/dpdk-testpmd | \
   jq '.[] | select(.devices[] | .vendor_id == "15b3" and .device_id == "1013").kmod'
 "* ib_uverbs & mlx5_core & mlx5_ib"

Print the list of drivers which define multiple parameters without
space separators:

 ~$ usertools/dpdk-pmdinfo.py build/app/dpdk-testpmd | \
   jq '.[] | select(.params!=null and (.params|test("=[^ ]+="))) | {name, params}'
 ...
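
The same queries can be made without jq. The following Python sketch assumes
the JSON layout shown in the examples above; the second entry of SAMPLE and
the helper names driver_by_name/kmods_for_device are hypothetical and only
serve the illustration:

```python
import json

# Sample shaped like the output shown above. A real run would read the
# script's stdout instead, e.g. with json.load(sys.stdin).
# "no_params_drv" is a made-up entry used to exercise missing keys.
SAMPLE = json.loads(
    """
[
  {
    "name": "dmadev_idxd_pci",
    "params": "max_queues=0",
    "kmod": "vfio-pci",
    "devices": [
      {"vendor_id": "8086", "device_id": "0b25",
       "subsystem_device_id": "ffff", "subsystem_system_id": "ffff"}
    ]
  },
  {"name": "no_params_drv", "devices": []}
]
"""
)


def driver_by_name(drivers, name):
    """Return the first driver entry with the given name, or None."""
    return next((d for d in drivers if d.get("name") == name), None)


def kmods_for_device(drivers, vendor_id, device_id):
    """Return the kmod strings of all drivers matching a PCI device."""
    return [
        d.get("kmod")
        for d in drivers
        if any(
            dev["vendor_id"] == vendor_id and dev["device_id"] == device_id
            for dev in d.get("devices", [])
        )
    ]


print(driver_by_name(SAMPLE, "dmadev_idxd_pci")["kmod"])
print(kmods_for_device(SAMPLE, "8086", "0b25"))
```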

The script passes flake8, black, isort and pylint checks.

I have tested this with a matrix of Python/pyelftools versions:

                             pyelftools
               0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29
         3.6     ok   ok   ok   ok   ok   ok   ok   ok
         3.7     ok   ok   ok   ok   ok   ok   ok   ok
  Python 3.8     ok   ok   ok   ok   ok   ok   ok   ok
         3.9     ok   ok   ok   ok   ok   ok   ok   ok
         3.10  fail fail fail fail   ok   ok   ok   ok

All failures with Python 3.10 are related to the same issue:

  File "elftools/construct/lib/container.py", line 5, in <module>
    from collections import MutableMapping
  ImportError: cannot import name 'MutableMapping' from 'collections'

Python 3.10 support is only available since pyelftools 0.26.
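
For context, the abstract base classes were moved to collections.abc, and the
old aliases in collections were removed in Python 3.10. A minimal sketch of
an import that works on both sides of that change (old_import_works is a
hypothetical helper, only there to show the behaviour):

```python
import sys

try:
    # Valid location since Python 3.3, and the only one left in 3.10+.
    from collections.abc import MutableMapping
except ImportError:
    # Only reached on very old interpreters.
    from collections import MutableMapping


def old_import_works():
    """Report whether the legacy import location is still available."""
    try:
        from collections import MutableMapping as _  # noqa: F401
    except ImportError:
        return False
    return True


print(sys.version_info[:2], old_import_works())
```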

NB: The output produced by the legacy -r/--raw flag can be obtained with
the following command:

  strings build/app/dpdk-testpmd | sed -n 's/^PMD_INFO_STRING= //p'
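
If strings and sed are not at hand, roughly the same extraction can be
sketched in Python. This is an approximation under stated assumptions: it
scans a raw bytes blob (not ELF sections), treats runs of at least 4
printable ASCII bytes as strings, and legacy_raw_lines is a hypothetical
name. The sample blob is made up:

```python
import re

PREFIX = b"PMD_INFO_STRING= "
# Runs of at least 4 printable ASCII bytes (plus tab), similar in spirit to
# the default behaviour of strings(1).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e\t]{4,}")


def legacy_raw_lines(blob):
    """Yield the text following PMD_INFO_STRING= in each matching string."""
    for match in PRINTABLE_RUN.finditer(blob):
        run = match.group()
        # Like sed's ^ anchor: the string must start with the prefix.
        if run.startswith(PREFIX):
            yield run[len(PREFIX):].decode("ascii")


# Made-up binary content with one embedded info string.
blob = b'\x00\x01PMD_INFO_STRING= {"name": "drv_a"}\x00junk\xffother text\x00'
lines = list(legacy_raw_lines(blob))
print(lines)
```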

Cc: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Robin Jarry <robin@jarry.cc>
---
There have been multiple compatibility issues with this script over the
past years. Also, its style and complexity may be unsettling for Python
developers. After this patch, maintenance should be much easier.

 doc/guides/rel_notes/release_22_11.rst |   5 +
 usertools/dpdk-pmdinfo.py              | 856 ++++++++-----------------
 2 files changed, 261 insertions(+), 600 deletions(-)

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8c021cf0505e..67054f5acdc9 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -84,6 +84,11 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* The ``dpdk-pmdinfo.py`` script was rewritten to produce valid JSON only.
+  PCI-IDs parsing has been removed.
+  To get a similar output to the (now removed) ``-r/--raw`` flag, you may use the following command::
+
+     strings $dpdk_binary_or_driver | sed -n 's/^PMD_INFO_STRING= //p'
 
 ABI Changes
 -----------
diff --git a/usertools/dpdk-pmdinfo.py b/usertools/dpdk-pmdinfo.py
index 40ef5cec6cba..cc72e5ce27a2 100755
--- a/usertools/dpdk-pmdinfo.py
+++ b/usertools/dpdk-pmdinfo.py
@@ -1,626 +1,282 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2016  Neil Horman <nhorman@tuxdriver.com>
+# Copyright(c) 2022  Robin Jarry
+# pylint: disable=invalid-name
 
-# -------------------------------------------------------------------------
-#
-# Utility to dump PMD_INFO_STRING support from an object file
-#
-# -------------------------------------------------------------------------
+r"""
+Utility to dump PMD_INFO_STRING support from DPDK binaries.
+
+This script prints JSON output to be interpreted by other tools. Here are some
+examples with jq:
+
+Get the complete info for a given driver:
+
+  %(prog)s dpdk-testpmd | \
+  jq '.[] | select(.name == "cnxk_nix_inl")'
+
+Get only the required kernel modules for a given driver:
+
+  %(prog)s dpdk-testpmd | \
+  jq '.[] | select(.name == "net_i40e").kmod'
+
+Get only the required kernel modules for a given device:
+
+  %(prog)s dpdk-testpmd | \
+  jq '.[] | select(.devices[] | .vendor_id == "15b3" and .device_id == "1013").kmod'
+"""
+
+import argparse
 import json
 import os
-import platform
+import re
+import string
 import sys
-import argparse
-from elftools.common.exceptions import ELFError
-from elftools.common.py3compat import byte2int
-from elftools.elf.elffile import ELFFile
+from pathlib import Path
+from typing import Iterable, Iterator, List, Union
 
+import elftools
+from elftools.elf.elffile import ELFError, ELFFile
 
-# For running from development directory. It should take precedence over the
-# installed pyelftools.
-sys.path.insert(0, '.')
-
-raw_output = False
-pcidb = None
-
-# ===========================================
-
-class Vendor:
-    """
-    Class for vendors. This is the top level class
-    for the devices belong to a specific vendor.
-    self.devices is the device dictionary
-    subdevices are in each device.
-    """
-
-    def __init__(self, vendorStr):
-        """
-        Class initializes with the raw line from pci.ids
-        Parsing takes place inside __init__
-        """
-        self.ID = vendorStr.split()[0]
-        self.name = vendorStr.replace("%s " % self.ID, "").rstrip()
-        self.devices = {}
-
-    def addDevice(self, deviceStr):
-        """
-        Adds a device to self.devices
-        takes the raw line from pci.ids
-        """
-        s = deviceStr.strip()
-        devID = s.split()[0]
-        if devID in self.devices:
-            pass
-        else:
-            self.devices[devID] = Device(deviceStr)
-
-    def report(self):
-        print(self.ID, self.name)
-        for id, dev in self.devices.items():
-            dev.report()
-
-    def find_device(self, devid):
-        # convert to a hex string and remove 0x
-        devid = hex(devid)[2:]
-        try:
-            return self.devices[devid]
-        except:
-            return Device("%s  Unknown Device" % devid)
-
-
-class Device:
-
-    def __init__(self, deviceStr):
-        """
-        Class for each device.
-        Each vendor has its own devices dictionary.
-        """
-        s = deviceStr.strip()
-        self.ID = s.split()[0]
-        self.name = s.replace("%s  " % self.ID, "")
-        self.subdevices = {}
-
-    def report(self):
-        print("\t%s\t%s" % (self.ID, self.name))
-        for subID, subdev in self.subdevices.items():
-            subdev.report()
-
-    def addSubDevice(self, subDeviceStr):
-        """
-        Adds a subvendor, subdevice to device.
-        Uses raw line from pci.ids
-        """
-        s = subDeviceStr.strip()
-        spl = s.split()
-        subVendorID = spl[0]
-        subDeviceID = spl[1]
-        subDeviceName = s.split("  ")[-1]
-        devID = "%s:%s" % (subVendorID, subDeviceID)
-        self.subdevices[devID] = SubDevice(
-            subVendorID, subDeviceID, subDeviceName)
-
-    def find_subid(self, subven, subdev):
-        subven = hex(subven)[2:]
-        subdev = hex(subdev)[2:]
-        devid = "%s:%s" % (subven, subdev)
-
-        try:
-            return self.subdevices[devid]
-        except:
-            if (subven == "ffff" and subdev == "ffff"):
-                return SubDevice("ffff", "ffff", "(All Subdevices)")
-            return SubDevice(subven, subdev, "(Unknown Subdevice)")
-
-
-class SubDevice:
-    """
-    Class for subdevices.
-    """
-
-    def __init__(self, vendor, device, name):
-        """
-        Class initializes with vendorid, deviceid and name
-        """
-        self.vendorID = vendor
-        self.deviceID = device
-        self.name = name
-
-    def report(self):
-        print("\t\t%s\t%s\t%s" % (self.vendorID, self.deviceID, self.name))
-
-
-class PCIIds:
-    """
-    Top class for all pci.ids entries.
-    All queries will be asked to this class.
-    PCIIds.vendors["0e11"].devices["0046"].\
-    subdevices["0e11:4091"].name  =  "Smart Array 6i"
-    """
-
-    def __init__(self, filename):
-        """
-        Prepares the directories.
-        Checks local data file.
-        Tries to load from local, if not found, downloads from web
-        """
-        self.version = ""
-        self.date = ""
-        self.vendors = {}
-        self.contents = None
-        self.readLocal(filename)
-        self.parse()
-
-    def reportVendors(self):
-        """Reports the vendors
-        """
-        for vid, v in self.vendors.items():
-            print(v.ID, v.name)
-
-    def report(self, vendor=None):
-        """
-        Reports everything for all vendors or a specific vendor
-        PCIIds.report()  reports everything
-        PCIIDs.report("0e11") reports only "Compaq Computer Corporation"
-        """
-        if vendor is not None:
-            self.vendors[vendor].report()
-        else:
-            for vID, v in self.vendors.items():
-                v.report()
-
-    def find_vendor(self, vid):
-        # convert vid to a hex string and remove the 0x
-        vid = hex(vid)[2:]
-
-        try:
-            return self.vendors[vid]
-        except:
-            return Vendor("%s Unknown Vendor" % (vid))
-
-    def findDate(self, content):
-        for l in content:
-            if l.find("Date:") > -1:
-                return l.split()[-2].replace("-", "")
-        return None
-
-    def parse(self):
-        if not self.contents:
-            print("data/%s-pci.ids not found" % self.date)
-        else:
-            vendorID = ""
-            deviceID = ""
-            for l in self.contents:
-                if l[0] == "#":
-                    continue
-                elif not l.strip():
-                    continue
-                else:
-                    if l.find("\t\t") == 0:
-                        self.vendors[vendorID].devices[
-                            deviceID].addSubDevice(l)
-                    elif l.find("\t") == 0:
-                        deviceID = l.strip().split()[0]
-                        self.vendors[vendorID].addDevice(l)
-                    else:
-                        vendorID = l.split()[0]
-                        self.vendors[vendorID] = Vendor(l)
-
-    def readLocal(self, filename):
-        """
-        Reads the local file
-        """
-        with open(filename, 'r', encoding='utf-8') as f:
-            self.contents = f.readlines()
-        self.date = self.findDate(self.contents)
-
-    def loadLocal(self):
-        """
-        Loads database from local. If there is no file,
-        it creates a new one from web
-        """
-        self.date = idsfile[0].split("/")[1].split("-")[0]
-        self.readLocal()
-
-
-# =======================================
-
-def search_file(filename, search_path):
-    """ Given a search path, find file with requested name """
-    for path in search_path.split(':'):
-        candidate = os.path.join(path, filename)
-        if os.path.exists(candidate):
-            return os.path.abspath(candidate)
-    return None
-
-
-class ReadElf(object):
-    """ display_* methods are used to emit output into the output stream
-    """
-
-    def __init__(self, file, output):
-        """ file:
-                stream object with the ELF file to read
-
-            output:
-                output stream to write to
-        """
-        self.elffile = ELFFile(file)
-        self.output = output
-
-        # Lazily initialized if a debug dump is requested
-        self._dwarfinfo = None
-
-        self._versioninfo = None
-
-    def _section_from_spec(self, spec):
-        """ Retrieve a section given a "spec" (either number or name).
-            Return None if no such section exists in the file.
-        """
-        try:
-            num = int(spec)
-            if num < self.elffile.num_sections():
-                return self.elffile.get_section(num)
-            return None
-        except ValueError:
-            # Not a number. Must be a name then
-            section = self.elffile.get_section_by_name(force_unicode(spec))
-            if section is None:
-                # No match with a unicode name.
-                # Some versions of pyelftools (<= 0.23) store internal strings
-                # as bytes. Try again with the name encoded as bytes.
-                section = self.elffile.get_section_by_name(force_bytes(spec))
-            return section
-
-    def pretty_print_pmdinfo(self, pmdinfo):
-        global pcidb
-
-        for i in pmdinfo["pci_ids"]:
-            vendor = pcidb.find_vendor(i[0])
-            device = vendor.find_device(i[1])
-            subdev = device.find_subid(i[2], i[3])
-            print("%s (%s) : %s (%s) %s" %
-                  (vendor.name, vendor.ID, device.name,
-                   device.ID, subdev.name))
-
-    def parse_pmd_info_string(self, mystring):
-        global raw_output
-        global pcidb
-
-        optional_pmd_info = [
-            {'id': 'params', 'tag': 'PMD PARAMETERS'},
-            {'id': 'kmod', 'tag': 'PMD KMOD DEPENDENCIES'}
-        ]
-
-        i = mystring.index("=")
-        mystring = mystring[i + 2:]
-        pmdinfo = json.loads(mystring)
-
-        if raw_output:
-            print(json.dumps(pmdinfo))
-            return
-
-        print("PMD NAME: " + pmdinfo["name"])
-        for i in optional_pmd_info:
-            try:
-                print("%s: %s" % (i['tag'], pmdinfo[i['id']]))
-            except KeyError:
-                continue
-
-        if pmdinfo["pci_ids"]:
-            print("PMD HW SUPPORT:")
-            if pcidb is not None:
-                self.pretty_print_pmdinfo(pmdinfo)
-            else:
-                print("VENDOR\t DEVICE\t SUBVENDOR\t SUBDEVICE")
-                for i in pmdinfo["pci_ids"]:
-                    print("0x%04x\t 0x%04x\t 0x%04x\t\t 0x%04x" %
-                          (i[0], i[1], i[2], i[3]))
-
-        print("")
-
-    def display_pmd_info_strings(self, section_spec):
-        """ Display a strings dump of a section. section_spec is either a
-            section number or a name.
-        """
-        section = self._section_from_spec(section_spec)
-        if section is None:
-            return
-
-        data = section.data()
-        dataptr = 0
-
-        while dataptr < len(data):
-            while (dataptr < len(data) and
-                   not 32 <= byte2int(data[dataptr]) <= 127):
-                dataptr += 1
-
-            if dataptr >= len(data):
-                break
-
-            endptr = dataptr
-            while endptr < len(data) and byte2int(data[endptr]) != 0:
-                endptr += 1
-
-            # pyelftools may return byte-strings, force decode them
-            mystring = force_unicode(data[dataptr:endptr])
-            rc = mystring.find("PMD_INFO_STRING")
-            if rc != -1:
-                self.parse_pmd_info_string(mystring[rc:])
-
-            dataptr = endptr
-
-    def find_librte_eal(self, section):
-        for tag in section.iter_tags():
-            # pyelftools may return byte-strings, force decode them
-            if force_unicode(tag.entry.d_tag) == 'DT_NEEDED':
-                if "librte_eal" in force_unicode(tag.needed):
-                    return force_unicode(tag.needed)
-        return None
-
-    def search_for_autoload_path(self):
-        scanelf = self
-        scanfile = None
-        library = None
-
-        section = self._section_from_spec(".dynamic")
-        try:
-            eallib = self.find_librte_eal(section)
-            if eallib is not None:
-                ldlibpath = os.environ.get('LD_LIBRARY_PATH')
-                if ldlibpath is None:
-                    ldlibpath = ""
-                dtr = self.get_dt_runpath(section)
-                library = search_file(eallib,
-                                      dtr + ":" + ldlibpath +
-                                      ":/usr/lib64:/lib64:/usr/lib:/lib")
-                if library is None:
-                    return (None, None)
-                if not raw_output:
-                    print("Scanning for autoload path in %s" % library)
-                scanfile = open(library, 'rb')
-                scanelf = ReadElf(scanfile, sys.stdout)
-        except AttributeError:
-            # Not a dynamic binary
-            pass
-        except ELFError:
-            scanfile.close()
-            return (None, None)
-
-        section = scanelf._section_from_spec(".rodata")
-        if section is None:
-            if scanfile is not None:
-                scanfile.close()
-            return (None, None)
-
-        data = section.data()
-        dataptr = 0
-
-        while dataptr < len(data):
-            while (dataptr < len(data) and
-                   not 32 <= byte2int(data[dataptr]) <= 127):
-                dataptr += 1
-
-            if dataptr >= len(data):
-                break
-
-            endptr = dataptr
-            while endptr < len(data) and byte2int(data[endptr]) != 0:
-                endptr += 1
-
-            # pyelftools may return byte-strings, force decode them
-            mystring = force_unicode(data[dataptr:endptr])
-            rc = mystring.find("DPDK_PLUGIN_PATH")
-            if rc != -1:
-                rc = mystring.find("=")
-                return (mystring[rc + 1:], library)
-
-            dataptr = endptr
-        if scanfile is not None:
-            scanfile.close()
-        return (None, None)
-
-    def get_dt_runpath(self, dynsec):
-        for tag in dynsec.iter_tags():
-            # pyelftools may return byte-strings, force decode them
-            if force_unicode(tag.entry.d_tag) == 'DT_RUNPATH':
-                return force_unicode(tag.runpath)
-        return ""
-
-    def process_dt_needed_entries(self):
-        """ Look to see if there are any DT_NEEDED entries in the binary
-            And process those if there are
-        """
-        runpath = ""
-        ldlibpath = os.environ.get('LD_LIBRARY_PATH')
-        if ldlibpath is None:
-            ldlibpath = ""
-
-        dynsec = self._section_from_spec(".dynamic")
-        try:
-            runpath = self.get_dt_runpath(dynsec)
-        except AttributeError:
-            # dynsec is None, just return
-            return
-
-        for tag in dynsec.iter_tags():
-            # pyelftools may return byte-strings, force decode them
-            if force_unicode(tag.entry.d_tag) == 'DT_NEEDED':
-                if 'librte_' in force_unicode(tag.needed):
-                    library = search_file(force_unicode(tag.needed),
-                                          runpath + ":" + ldlibpath +
-                                          ":/usr/lib64:/lib64:/usr/lib:/lib")
-                    if library is not None:
-                        with open(library, 'rb') as file:
-                            try:
-                                libelf = ReadElf(file, sys.stdout)
-                            except ELFError:
-                                print("%s is no an ELF file" % library)
-                                continue
-                            libelf.process_dt_needed_entries()
-                            libelf.display_pmd_info_strings(".rodata")
-                            file.close()
-
-
-# compat: remove force_unicode & force_bytes when pyelftools<=0.23 support is
-# dropped.
-def force_unicode(s):
-    if hasattr(s, 'decode') and callable(s.decode):
-        s = s.decode('latin-1')  # same encoding used in pyelftools py3compat
-    return s
-
-
-def force_bytes(s):
-    if hasattr(s, 'encode') and callable(s.encode):
-        s = s.encode('latin-1')  # same encoding used in pyelftools py3compat
-    return s
-
-
-def scan_autoload_path(autoload_path):
-    global raw_output
-
-    if not os.path.exists(autoload_path):
-        return
 
+# ----------------------------------------------------------------------------
+def main() -> int:  # pylint: disable=missing-docstring
     try:
-        dirs = os.listdir(autoload_path)
-    except OSError:
-        # Couldn't read the directory, give up
-        return
+        args = parse_args()
+        info = parse_pmdinfo(args.elf_files, args.search_plugins)
+        json.dump(info, sys.stdout, indent=2)
+        sys.stdout.write("\n")
+    except BrokenPipeError:
+        pass
+    except KeyboardInterrupt:
+        return 1
+    except Exception as e:  # pylint: disable=broad-except
+        print(f"error: {e}", file=sys.stderr)
+        return 1
 
-    for d in dirs:
-        dpath = os.path.join(autoload_path, d)
-        if os.path.isdir(dpath):
-            scan_autoload_path(dpath)
-        if os.path.isfile(dpath):
-            try:
-                file = open(dpath, 'rb')
-                readelf = ReadElf(file, sys.stdout)
-            except ELFError:
-                # this is likely not an elf file, skip it
-                continue
-            except IOError:
-                # No permission to read the file, skip it
-                continue
-
-            if not raw_output:
-                print("Hw Support for library %s" % d)
-            readelf.display_pmd_info_strings(".rodata")
-            file.close()
+    return 0
 
 
-def scan_for_autoload_pmds(dpdk_path):
+# ----------------------------------------------------------------------------
+def parse_args() -> argparse.Namespace:
     """
-    search the specified application or path for a pmd autoload path
-    then scan said path for pmds and report hw support
+    Parse command line arguments.
     """
-    global raw_output
-
-    if not os.path.isfile(dpdk_path):
-        if not raw_output:
-            print("Must specify a file name")
-        return
-
-    file = open(dpdk_path, 'rb')
-    try:
-        readelf = ReadElf(file, sys.stdout)
-    except ElfError:
-        if not raw_output:
-            print("Unable to parse %s" % file)
-        return
-
-    (autoload_path, scannedfile) = readelf.search_for_autoload_path()
-    if not autoload_path:
-        if not raw_output:
-            print("No autoload path configured in %s" % dpdk_path)
-        return
-    if not raw_output:
-        if scannedfile is None:
-            scannedfile = dpdk_path
-        print("Found autoload path %s in %s" % (autoload_path, scannedfile))
-
-    file.close()
-    if not raw_output:
-        print("Discovered Autoload HW Support:")
-    scan_autoload_path(autoload_path)
-    return
-
-
-def main(stream=None):
-    global raw_output
-    global pcidb
-
-    pcifile_default = "./pci.ids"  # For unknown OS's assume local file
-    if platform.system() == 'Linux':
-        # hwdata is the legacy location, misc is supported going forward
-        pcifile_default = "/usr/share/misc/pci.ids"
-        if not os.path.exists(pcifile_default):
-            pcifile_default = "/usr/share/hwdata/pci.ids"
-    elif platform.system() == 'FreeBSD':
-        pcifile_default = "/usr/local/share/pciids/pci.ids"
-        if not os.path.exists(pcifile_default):
-            pcifile_default = "/usr/share/misc/pci_vendors"
-
     parser = argparse.ArgumentParser(
-        usage='usage: %(prog)s [-hrtp] [-d <pci id file>] elf_file',
-        description="Dump pmd hardware support info")
-    group = parser.add_mutually_exclusive_group()
-    group.add_argument('-r', '--raw',
-                       action='store_true', dest='raw_output',
-                       help='dump raw json strings')
-    group.add_argument("-t", "--table", dest="tblout",
-                       help="output information on hw support as a hex table",
-                       action='store_true')
-    parser.add_argument("-d", "--pcidb", dest="pcifile",
-                        help="specify a pci database to get vendor names from",
-                        default=pcifile_default, metavar="FILE")
-    parser.add_argument("-p", "--plugindir", dest="pdir",
-                        help="scan dpdk for autoload plugins",
-                        action='store_true')
-    parser.add_argument("elf_file", help="driver shared object file")
-    args = parser.parse_args()
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "-p",
+        "--search-plugins",
+        action="store_true",
+        help="""
+        In addition to ELF_FILEs and their linked dynamic libraries, also scan
+        the DPDK plugins path.
+        """,
+    )
+    parser.add_argument(
+        "elf_files",
+        metavar="ELF_FILE",
+        nargs="+",
+        type=existing_file,
+        help="""
+        DPDK application binary or dynamic library.
+        """,
+    )
+    return parser.parse_args()
 
-    if args.raw_output:
-        raw_output = True
 
-    if args.tblout:
-        args.pcifile = None
+# ----------------------------------------------------------------------------
+def parse_pmdinfo(paths: Iterable[Path], search_plugins: bool) -> List[dict]:
+    """
+    Extract DPDK PMD info JSON strings from an ELF file.
 
-    if args.pcifile:
-        pcidb = PCIIds(args.pcifile)
-        if pcidb is None:
-            print("Pci DB file not found")
-            exit(1)
+    :returns:
+        A list of DPDK drivers info dictionaries.
+    """
+    binaries = set(paths)
+    for p in paths:
+        binaries.update(get_needed_libs(p))
+    if search_plugins:
+        # cast to list to avoid errors with update while iterating
+        binaries.update(list(get_plugin_libs(binaries)))
 
-    if args.pdir:
-        exit(scan_for_autoload_pmds(args.elf_file))
+    drivers = []
 
-    ldlibpath = os.environ.get('LD_LIBRARY_PATH')
-    if ldlibpath is None:
-        ldlibpath = ""
-
-    if os.path.exists(args.elf_file):
-        myelffile = args.elf_file
-    else:
-        myelffile = search_file(args.elf_file,
-                                ldlibpath + ":/usr/lib64:/lib64:/usr/lib:/lib")
-
-    if myelffile is None:
-        print("File not found")
-        sys.exit(1)
-
-    with open(myelffile, 'rb') as file:
+    for b in binaries:
         try:
-            readelf = ReadElf(file, sys.stdout)
-            readelf.process_dt_needed_entries()
-            readelf.display_pmd_info_strings(".rodata")
-            sys.exit(0)
+            for s in get_elf_strings(b, ".rodata", "PMD_INFO_STRING="):
+                try:
+                    info = json.loads(s)
+                    # convert numerical ids to hex strings
+                    info["devices"] = []
+                    for vendor, device, subdev, subsys in info.pop("pci_ids"):
+                        info["devices"].append(
+                            {
+                                "vendor_id": f"{vendor:04x}",
+                                "device_id": f"{device:04x}",
+                                "subsystem_device_id": f"{subdev:04x}",
+                                "subsystem_system_id": f"{subsys:04x}",
+                            }
+                        )
+                    drivers.append(info)
+                except ValueError as e:
+                    print(f"warning: {b}: {e}", file=sys.stderr)
+        except FileNotFoundError as e:
+            print(f"warning: {b}: {e}", file=sys.stderr)
+        except ELFError as e:
+            print(f"warning: {b}: elf error: {e}", file=sys.stderr)
 
-        except ELFError as ex:
-            sys.stderr.write('ELF error: %s\n' % ex)
-            sys.exit(1)
+    return drivers
 
 
-# -------------------------------------------------------------------------
-if __name__ == '__main__':
-    main()
+# ----------------------------------------------------------------------------
+def get_plugin_libs(binaries: Iterable[Path]) -> Iterator[Path]:
+    """
+    Look into the provided binaries for DPDK_PLUGIN_PATH and scan the path
+    for files.
+    """
+    for b in binaries:
+        for p in get_elf_strings(b, ".rodata", "DPDK_PLUGIN_PATH="):
+            plugin_path = p.strip()
+            for root, _, files in os.walk(plugin_path):
+                for f in files:
+                    yield Path(root) / f
+            # no need to search in other binaries.
+            return
+
+
+# ----------------------------------------------------------------------------
+def existing_file(value: str) -> Path:
+    """
+    Argparse type= callback to ensure an argument points to a valid file path.
+    """
+    path = Path(value)
+    if not path.is_file():
+        raise argparse.ArgumentTypeError(f"{value}: No such file")
+    return path
+
+
+# ----------------------------------------------------------------------------
+def search_ld_library_path(name: str) -> Path:
+    """
+    Search for a file in LD_LIBRARY_PATH and the standard folders where
+    libraries are usually located.
+
+    :raises FileNotFoundError:
+    """
+    folders = []
+    if "LD_LIBRARY_PATH" in os.environ:
+        folders += os.environ["LD_LIBRARY_PATH"].split(":")
+    folders += ["/usr/lib64", "/lib64", "/usr/lib", "/lib"]
+    for d in folders:
+        filepath = Path(d) / name
+        if filepath.is_file():
+            return filepath
+    raise FileNotFoundError(name)
+
+
+# ----------------------------------------------------------------------------
+PRINTABLE_BYTES = frozenset(string.printable.encode("ascii"))
+
+
+def find_strings(buf: bytes, prefix: str) -> Iterator[str]:
+    """
+    Extract strings of printable ASCII characters from a bytes buffer.
+    """
+    view = memoryview(buf)
+    start = None
+
+    for i, b in enumerate(view):
+        if start is None and b in PRINTABLE_BYTES:
+            # mark beginning of string
+            start = i
+            continue
+
+        if start is not None:
+            if b in PRINTABLE_BYTES:
+                # string not finished
+                continue
+            if b == 0:
+                # end of string
+                s = view[start:i].tobytes().decode("ascii")
+                if s.startswith(prefix):
+                    yield s[len(prefix) :]
+            # There can be byte sequences where a non-printable character
+            # follows a printable one. Ignore that.
+            start = None
+
+
+# ----------------------------------------------------------------------------
+def elftools_version():
+    """
+    Extract pyelftools version as a tuple of integers for easy comparison.
+    """
+    version = getattr(elftools, "__version__", "")
+    match = re.match(r"^(\d+)\.(\d+).*$", str(version))
+    if not match:
+        # cannot determine version, hope for the best
+        return (0, 24)
+    return (int(match[1]), int(match[2]))
+
+
+ELFTOOLS_VERSION = elftools_version()
+
+
+def from_elftools(s: Union[bytes, str]) -> str:
+    """
+    Earlier versions of pyelftools (< 0.24) return bytes encoded with "latin-1"
+    instead of python strings.
+    """
+    if isinstance(s, bytes):
+        return s.decode("latin-1")
+    return s
+
+
+def to_elftools(s: str) -> Union[bytes, str]:
+    """
+    Earlier versions of pyelftools (< 0.24) assume that ELF section and tags
+    are bytes encoded with "latin-1" instead of python strings.
+    """
+    if ELFTOOLS_VERSION < (0, 24):
+        return s.encode("latin-1")
+    return s
+
+
+# ----------------------------------------------------------------------------
+def get_elf_strings(path: Path, section: str, prefix: str) -> Iterator[str]:
+    """
+    Extract strings from a named ELF section in a file.
+    """
+    with path.open("rb") as f:
+        elf = ELFFile(f)
+        sec = elf.get_section_by_name(to_elftools(section))
+        if not sec:
+            return
+        yield from find_strings(sec.data(), prefix)
+
+
+# ----------------------------------------------------------------------------
+def get_needed_libs(path: Path) -> Iterator[Path]:
+    """
+    Extract the dynamic library dependencies from an ELF file.
+    """
+    with path.open("rb") as f:
+        elf = ELFFile(f)
+        dyn = elf.get_section_by_name(to_elftools(".dynamic"))
+        if not dyn:
+            return
+        for tag in dyn.iter_tags(to_elftools("DT_NEEDED")):
+            needed = from_elftools(tag.needed)
+            if not needed.startswith("librte_"):
+                continue
+            try:
+                yield search_ld_library_path(needed)
+            except FileNotFoundError:
+                print(f"warning: cannot find {needed}", file=sys.stderr)
+
+
+# ----------------------------------------------------------------------------
+if __name__ == "__main__":
+    sys.exit(main())
-- 
2.37.3


Thread overview: 42+ messages
2022-09-13 10:58 [PATCH] usertools: rewrite pmdinfo Robin Jarry
2022-09-13 11:29 ` Ferruh Yigit
2022-09-13 11:49   ` Robin Jarry
2022-09-13 13:50     ` Ferruh Yigit
2022-09-13 13:59       ` Robin Jarry
2022-09-13 14:17         ` Ferruh Yigit
2022-09-13 14:17 ` Bruce Richardson
2022-09-13 19:42 ` [PATCH v2] " Robin Jarry
2022-09-13 20:54   ` Ferruh Yigit
2022-09-13 21:22     ` Robin Jarry
2022-09-14 11:46       ` Ferruh Yigit
2022-09-15  9:18         ` Robin Jarry
2022-09-20  9:08 ` [PATCH v3] " Robin Jarry
2022-09-20 10:10   ` Ferruh Yigit
2022-09-20 10:12     ` Robin Jarry
2022-09-20 10:42 ` [PATCH v4] " Robin Jarry
2022-09-20 14:08   ` Olivier Matz
2022-09-20 17:48   ` Ferruh Yigit
2022-09-20 17:50     ` Ferruh Yigit
2022-09-21  7:27       ` Thomas Monjalon
2022-09-21  8:02         ` Ferruh Yigit
2022-09-20 19:15     ` Robin Jarry
2022-09-21  7:58       ` Ferruh Yigit
2022-09-21  9:57         ` Ferruh Yigit
2022-09-22 11:58 ` [PATCH v5] " Robin Jarry
2022-09-22 12:03   ` Bruce Richardson
2022-09-22 15:12   ` Ferruh Yigit
2022-09-26 11:55   ` Olivier Matz
2022-09-26 12:52   ` Robin Jarry
2022-09-26 13:44 ` [PATCH v6] " Robin Jarry
2022-09-26 15:17   ` Bruce Richardson
2022-09-28  6:51     ` Robin Jarry
2022-09-28 10:53       ` Bruce Richardson
2022-09-28 11:12         ` Robin Jarry
2022-10-04 19:29 ` [PATCH v7] " Robin Jarry
2022-10-10 22:44   ` Thomas Monjalon
2022-10-12 15:16     ` Olivier Matz
2022-10-12 16:16       ` Thomas Monjalon
2022-10-12 16:30         ` Robin Jarry
2022-10-12 16:44           ` Thomas Monjalon
2022-10-12 16:48             ` Robin Jarry
2022-10-12 20:40               ` Thomas Monjalon
