DPDK patches and discussions
[RFC v4 5/8] build: generate symbol maps
From: David Marchand @ 2025-03-17 15:43 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue

Rather than maintaining a file in parallel with the code, symbols to be
exported can be marked with an RTE_EXPORT_*SYMBOL token.

From these markers, the build framework generates map files covering only
the symbols that are actually compiled (which makes the WINDOWS_NO_EXPORT
hack unnecessary).

The build framework directly creates a map file in the format the linker
expects (rather than converting from the GNU linker format to the MSVC one).

Empty maps are allowed again as a replacement for drivers/version.map.

The symbol check is updated to only support the new format.
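For illustration, the marker-to-map translation works roughly as follows: a minimal Python sketch with hypothetical rte_foo_* symbols and a hardcoded DPDK_25 node, much simplified from the real generator (which also handles experimental/internal nodes, versioned symbols and per-linker output formats).

```python
import re

# Minimal sketch: collect RTE_EXPORT_SYMBOL() markers from source
# lines and emit a GNU linker version script for the current ABI.
# The rte_foo_* names and the DPDK_25 node are hypothetical.
export_re = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")

source_lines = [
    "RTE_EXPORT_SYMBOL(rte_foo_create)",
    "struct rte_foo *rte_foo_create(void);",
    "RTE_EXPORT_SYMBOL(rte_foo_free)",
]

symbols = sorted(m.group(1) for ln in source_lines
                 if (m := export_re.match(ln)))
map_text = "DPDK_25 {\n\tglobal:\n"
map_text += "".join(f"\t{sym};\n" for sym in symbols)
map_text += "\n\tlocal: *;\n};\n"
print(map_text)
```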

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since RFC v3:
- polished python,
- fixed doc updates not belonging to this patch,
- renamed map files,
- changed msvc->mslinker as link mode,
- added parsing of AVX sources,

Changes since RFC v2:
- because of MSVC limitations wrt macro passed via cmdline,
  used an internal header for defining RTE_EXPORT_* macros,
- updated documentation and tooling,

---
 MAINTAINERS                                |   2 +
 buildtools/gen-version-map.py              | 106 ++++++++++
 buildtools/map-list-symbol.sh              |  10 +-
 buildtools/meson.build                     |   1 +
 config/meson.build                         |   4 +-
 config/rte_export.h                        |  16 ++
 devtools/check-symbol-change.py            |  90 +++++++++
 devtools/check-symbol-maps.sh              |  14 --
 devtools/checkpatches.sh                   |   2 +-
 doc/guides/contributing/abi_versioning.rst | 221 ++-------------------
 drivers/meson.build                        |  96 +++++----
 drivers/version.map                        |   3 -
 lib/meson.build                            |  89 ++++++---
 13 files changed, 363 insertions(+), 291 deletions(-)
 create mode 100755 buildtools/gen-version-map.py
 create mode 100644 config/rte_export.h
 create mode 100755 devtools/check-symbol-change.py
 delete mode 100644 drivers/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 82f6e2f917..63754e76e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
 F: devtools/check-forbidden-tokens.awk
 F: devtools/check-git-log.sh
 F: devtools/check-spdx-tag.sh
+F: devtools/check-symbol-change.py
 F: devtools/check-symbol-change.sh
 F: devtools/check-symbol-maps.sh
 F: devtools/checkpatches.sh
@@ -127,6 +128,7 @@ F: config/
 F: buildtools/check-symbols.sh
 F: buildtools/chkincs/
 F: buildtools/call-sphinx-build.py
+F: buildtools/gen-version-map.py
 F: buildtools/get-cpu-count.py
 F: buildtools/get-numa-count.py
 F: buildtools/list-dir-globs.py
diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
new file mode 100755
index 0000000000..3364c224c0
--- /dev/null
+++ b/buildtools/gen-version-map.py
@@ -0,0 +1,106 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2025 Red Hat, Inc.
+
+"""Generate a version map file used by GNU or MSVC linker."""
+
+import re
+import sys
+
+scriptname, link_mode, abi_version_file, output, *files = sys.argv
+
+# From rte_export.h
+export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+\.[0-9]+)\)")
+export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
+export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
+# From rte_function_versioning.h
+ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
+default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+
+with open(abi_version_file) as f:
+    abi = 'DPDK_{}'.format(re.match(r"([0-9]+)\.[0-9]", f.readline()).group(1))
+
+symbols = {}
+
+for file in files:
+    with open(file, encoding="utf-8") as f:
+        for ln in f.readlines():
+            node = None
+            symbol = None
+            comment = ''
+            if export_exp_sym_regexp.match(ln):
+                node = 'EXPERIMENTAL'
+                symbol = export_exp_sym_regexp.match(ln).group(1)
+                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
+            elif export_int_sym_regexp.match(ln):
+                node = 'INTERNAL'
+                symbol = export_int_sym_regexp.match(ln).group(1)
+            elif export_sym_regexp.match(ln):
+                node = abi
+                symbol = export_sym_regexp.match(ln).group(1)
+            elif ver_sym_regexp.match(ln):
+                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
+                symbol = ver_sym_regexp.match(ln).group(2)
+            elif ver_exp_sym_regexp.match(ln):
+                node = 'EXPERIMENTAL'
+                symbol = ver_exp_sym_regexp.match(ln).group(1)
+            elif default_sym_regexp.match(ln):
+                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
+                symbol = default_sym_regexp.match(ln).group(2)
+
+            if not symbol:
+                continue
+
+            if node not in symbols:
+                symbols[node] = {}
+            symbols[node][symbol] = comment
+
+if link_mode == 'mslinker':
+    with open(output, "w") as outfile:
+        print("EXPORTS", file=outfile)
+        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
+            if key not in symbols:
+                continue
+            for symbol in sorted(symbols[key].keys()):
+                print(f"\t{symbol}", file=outfile)
+            del symbols[key]
+else:
+    with open(output, "w") as outfile:
+        local_token = False
+        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
+            if key not in symbols:
+                continue
+            print(f"{key} {{\n\tglobal:\n", file=outfile)
+            for symbol in sorted(symbols[key].keys()):
+                if link_mode == 'mingw' and symbol.startswith('per_lcore'):
+                    prefix = '__emutls_v.'
+                else:
+                    prefix = ''
+                comment = symbols[key][symbol]
+                print(f"\t{prefix}{symbol};{comment}", file=outfile)
+            if not local_token:
+                print("\n\tlocal: *;", file=outfile)
+                local_token = True
+            print("};", file=outfile)
+            del symbols[key]
+        for key in sorted(symbols.keys()):
+            print(f"{key} {{\n\tglobal:\n", file=outfile)
+            for symbol in sorted(symbols[key].keys()):
+                if link_mode == 'mingw' and symbol.startswith('per_lcore'):
+                    prefix = '__emutls_v.'
+                else:
+                    prefix = ''
+                comment = symbols[key][symbol]
+                print(f"\t{prefix}{symbol};{comment}", file=outfile)
+            print(f"}} {abi};", file=outfile)
+            if not local_token:
+                print("\n\tlocal: *;", file=outfile)
+                local_token = True
+            del symbols[key]
+        # No exported symbol, add a catch all
+        if not local_token:
+            print(f"{abi} {{", file=outfile)
+            print("\n\tlocal: *;", file=outfile)
+            local_token = True
+            print("};", file=outfile)
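The marker regexps above can be exercised directly; a small check like the following (with a hypothetical symbol name, and the version dot escaped) shows what the experimental pattern captures.

```python
import re

# Same shape as export_exp_sym_regexp in gen-version-map.py; it
# captures the symbol name and the release that introduced it.
exp_re = re.compile(
    r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+\.[0-9]+)\)")

m = exp_re.match("RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_foo_bar, 25.03)")
symbol, since = m.group(1), m.group(2)
print(symbol, since)  # rte_foo_bar 25.03
```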
diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
index eb98451d8e..0829df4be5 100755
--- a/buildtools/map-list-symbol.sh
+++ b/buildtools/map-list-symbol.sh
@@ -62,10 +62,14 @@ for file in $@; do
 		if (current_section == "") {
 			next;
 		}
+		symbol_version = current_version
+		if (/^[^}].*[^:*]; # added in /) {
+			symbol_version = $5
+		}
 		if ("'$version'" != "") {
-			if ("'$version'" == "unset" && current_version != "") {
+			if ("'$version'" == "unset" && symbol_version != "") {
 				next;
-			} else if ("'$version'" != "unset" && "'$version'" != current_version) {
+			} else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
 				next;
 			}
 		}
@@ -73,7 +77,7 @@ for file in $@; do
 		if ("'$symbol'" == "all" || $1 == "'$symbol'") {
 			ret = 0;
 			if ("'$quiet'" == "") {
-				print "'$file' "current_section" "$1" "current_version;
+				print "'$file' "current_section" "$1" "symbol_version;
 			}
 			if ("'$symbol'" != "all") {
 				exit 0;
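The awk change above lets a symbol line carry its own version via an "# added in" comment, overriding the enclosing node's version. A sketch of that override logic in Python (hypothetical symbol, assuming the node itself carries no number, as with EXPERIMENTAL):

```python
import re

# A symbol line may carry an '# added in X.Y' comment that
# supersedes the version of the node it sits in.
added_re = re.compile(r";\s*# added in ([0-9]+\.[0-9]+)$")

node_version = ""                           # e.g. the EXPERIMENTAL node
line = "\trte_foo_bar; # added in 25.03"    # hypothetical map line

m = added_re.search(line)
symbol_version = m.group(1) if m else node_version
print(symbol_version)  # 25.03
```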
diff --git a/buildtools/meson.build b/buildtools/meson.build
index 4e2c1217a2..b745e9afa4 100644
--- a/buildtools/meson.build
+++ b/buildtools/meson.build
@@ -16,6 +16,7 @@ else
     py3 = ['meson', 'runpython']
 endif
 echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
+gen_version_map = py3 + files('gen-version-map.py')
 list_dir_globs = py3 + files('list-dir-globs.py')
 map_to_win_cmd = py3 + files('map_to_win.py')
 sphinx_wrapper = py3 + files('call-sphinx-build.py')
diff --git a/config/meson.build b/config/meson.build
index f31fef216c..f2d5401ea2 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -300,11 +300,13 @@ if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
     dpdk_extra_ldflags += '-latomic'
 endif
 
-# add -include rte_config to cflags
+# add -include some headers to cflags
 if is_ms_compiler
     add_project_arguments('/FI', 'rte_config.h', language: 'c')
+    add_project_arguments('/FI', 'rte_export.h', language: 'c')
 else
     add_project_arguments('-include', 'rte_config.h', language: 'c')
+    add_project_arguments('-include', 'rte_export.h', language: 'c')
 endif
 
 # enable extra warnings and disable any unwanted warnings
diff --git a/config/rte_export.h b/config/rte_export.h
new file mode 100644
index 0000000000..83d871fe11
--- /dev/null
+++ b/config/rte_export.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2025 Red Hat, Inc.
+ */
+
+#ifndef RTE_EXPORT_H
+#define RTE_EXPORT_H
+
+/* *Internal* macros for exporting symbols, used by the build system.
+ * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
+ * version this symbol was introduced in.
+ */
+#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
+#define RTE_EXPORT_INTERNAL_SYMBOL(a)
+#define RTE_EXPORT_SYMBOL(a)
+
+#endif /* RTE_EXPORT_H */
diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
new file mode 100755
index 0000000000..09709e4f06
--- /dev/null
+++ b/devtools/check-symbol-change.py
@@ -0,0 +1,90 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2025 Red Hat, Inc.
+
+"""Check exported symbols change in a patch."""
+
+import re
+import sys
+
+file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
+# From rte_export.h
+export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
+export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
+export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
+# TODO, handle versioned symbols from rte_function_versioning.h
+# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
+# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+
+symbols = {}
+
+for file in sys.argv[1:]:
+    with open(file, encoding="utf-8") as f:
+        for ln in f.readlines():
+            if file_header_regexp.match(ln):
+                if file_header_regexp.match(ln).group(2) == "lib":
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
+            elif file_header_regexp.match(ln).group(4) == "intel":
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
+                else:
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
+
+                if lib not in symbols:
+                    symbols[lib] = {}
+                continue
+
+            if export_exp_sym_regexp.match(ln):
+                symbol = export_exp_sym_regexp.match(ln).group(1)
+                node = 'EXPERIMENTAL'
+            elif export_int_sym_regexp.match(ln):
+                node = 'INTERNAL'
+                symbol = export_int_sym_regexp.match(ln).group(1)
+            elif export_sym_regexp.match(ln):
+                symbol = export_sym_regexp.match(ln).group(1)
+                node = 'stable'
+            else:
+                continue
+
+            if symbol not in symbols[lib]:
+                symbols[lib][symbol] = {}
+            added = ln[0] == '+'
+            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
+                print(f"{symbol} in {lib} was found in multiple ABI nodes, please check.")
+            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
+                print(f"{symbol} in {lib} was found in multiple ABI nodes, please check.")
+            if added:
+                symbols[lib][symbol]['added'] = node
+            else:
+                symbols[lib][symbol]['removed'] = node
+
+    for lib in sorted(symbols.keys()):
+        error = False
+        for symbol in sorted(symbols[lib].keys()):
+            if 'removed' not in symbols[lib][symbol]:
+                # Symbol addition
+                node = symbols[lib][symbol]['added']
+                if node == 'stable':
+                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
+                    error = True
+                else:
+                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
+                continue
+
+            if 'added' not in symbols[lib][symbol]:
+                # Symbol removal
+                node = symbols[lib][symbol]['removed']
+                if node == 'stable':
+                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")
+                    print("Please check it has gone through the deprecation process.")
+                continue
+
+            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
+                # Symbol was moved around
+                continue
+
+            # Symbol modifications
+            added = symbols[lib][symbol]['added']
+            removed = symbols[lib][symbol]['removed']
+            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")
+            print("Please check it has gone through the deprecation process.")
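The diff-header pattern at the top of this script maps a hunk header to the library it touches; a quick demonstration on an illustrative path:

```python
import re

# Same shape as file_header_regexp above: match a unified-diff file
# header and extract the (lib|drivers) component the change lives in.
hdr_re = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")

m = hdr_re.match("+++ b/lib/acl/rte_acl.c")
lib = "/".join(m.group(2, 3))
print(lib)  # lib/acl
```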
diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
index 6121f78ec6..fcd3931e5d 100755
--- a/devtools/check-symbol-maps.sh
+++ b/devtools/check-symbol-maps.sh
@@ -60,20 +60,6 @@ if [ -n "$local_miss_maps" ] ; then
     ret=1
 fi
 
-find_empty_maps ()
-{
-    for map in $@ ; do
-        [ $(buildtools/map-list-symbol.sh $map | wc -l) != '0' ] || echo $map
-    done
-}
-
-empty_maps=$(find_empty_maps $@)
-if [ -n "$empty_maps" ] ; then
-    echo "Found empty maps:"
-    echo "$empty_maps"
-    ret=1
-fi
-
 find_bad_format_maps ()
 {
     abi_version=$(cut -d'.' -f 1 ABI_VERSION)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index c9088bb403..9180c2b070 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -33,7 +33,7 @@ VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,STRLCPY,\
 PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
 SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
 LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
-NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG"
+NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL"
 options="$options $DPDK_CHECKPATCH_OPTIONS"
 
 print_usage () {
diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 21f8f8cd14..cbcbedfaf0 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -58,12 +58,12 @@ persists over multiple releases.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/acl_exports.map
  DPDK_21 {
         global:
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/eal_exports.map
  DPDK_21 {
         global:
  ...
@@ -77,7 +77,7 @@ that library.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/acl_exports.map
  DPDK_21 {
         global:
  ...
@@ -88,7 +88,7 @@ that library.
  } DPDK_21;
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/eal_exports.map
  DPDK_21 {
         global:
  ...
@@ -100,12 +100,12 @@ how this may be done.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/acl_exports.map
  DPDK_22 {
         global:
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/eal_exports.map
  DPDK_22 {
         global:
  ...
@@ -134,8 +134,7 @@ linked to the DPDK.
 
 To support backward compatibility the ``rte_function_versioning.h``
 header file provides macros to use when updating exported functions. These
-macros are used in conjunction with the ``version.map`` file for
-a given library to allow multiple versions of a symbol to exist in a shared
+macros allow multiple versions of a symbol to exist in a shared
 library so that older binaries need not be immediately recompiled.
 
 The macros are:
@@ -169,6 +168,7 @@ Assume we have a function as follows
   * Create an acl context object for apps to
   * manipulate
   */
+ RTE_EXPORT_SYMBOL(rte_acl_create)
  int
  rte_acl_create(struct rte_acl_param *param)
  {
@@ -187,6 +187,7 @@ private, is safe), but it also requires modifying the code as follows
   * Create an acl context object for apps to
   * manipulate
   */
+ RTE_EXPORT_SYMBOL(rte_acl_create)
  int
  rte_acl_create(struct rte_acl_param *param, int debug)
  {
@@ -203,78 +204,16 @@ The addition of a parameter to the function is ABI breaking as the function is
 public, and existing application may use it in its current form. However, the
 compatibility macros in DPDK allow a developer to use symbol versioning so that
 multiple functions can be mapped to the same public symbol based on when an
-application was linked to it. To see how this is done, we start with the
-requisite libraries version map file. Initially the version map file for the acl
-library looks like this
+application was linked to it.
 
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_create;
-        rte_acl_dump;
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-This file needs to be modified as follows
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_create;
-        rte_acl_dump;
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-        rte_acl_create;
-
-   } DPDK_21;
-
-The addition of the new block tells the linker that a new version node
-``DPDK_22`` is available, which contains the symbol rte_acl_create, and inherits
-the symbols from the DPDK_21 node. This list is directly translated into a
-list of exported symbols when DPDK is compiled as a shared library.
-
-Next, we need to specify in the code which function maps to the rte_acl_create
+We need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
 we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
 the function return type, the function name and its arguments.
 
 .. code-block:: c
 
+ -RTE_EXPORT_SYMBOL(rte_acl_create)
  -int
  -rte_acl_create(struct rte_acl_param *param)
  +RTE_VERSION_SYMBOL(21, int, rte_acl_create, (struct rte_acl_param *param))
@@ -314,9 +253,9 @@ The macro instructs the linker to create the new default symbol
 ``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
 (declared by the macro).
 
-And that's it, on the next shared library rebuild, there will be two versions of
-rte_acl_create, an old DPDK_21 version, used by previously built applications,
-and a new DPDK_22 version, used by future built applications.
+And that's it. On the next shared library rebuild, there will be two versions of rte_acl_create,
+an old DPDK_21 version, used by previously built applications, and a new DPDK_22 version,
+used by newly built applications.
 
 .. note::
 
@@ -366,6 +305,7 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
     * Create an acl context object for apps to
     * manipulate
     */
+   RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_acl_create)
    __rte_experimental
    int
    rte_acl_create(struct rte_acl_param *param)
@@ -373,27 +313,8 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
    ...
    }
 
-In the map file, experimental symbols are listed as part of the ``EXPERIMENTAL``
-version node.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   EXPERIMENTAL {
-        global:
-
-        rte_acl_create;
-   };
-
 When we promote the symbol to the stable ABI, we simply strip the
-``__rte_experimental`` annotation from the function and move the symbol from the
-``EXPERIMENTAL`` node, to the node of the next major ABI version as follow.
+``__rte_experimental`` annotation from the function.
 
 .. code-block:: c
 
@@ -401,31 +322,13 @@ When we promote the symbol to the stable ABI, we simply strip the
     * Create an acl context object for apps to
     * manipulate
     */
+   RTE_EXPORT_SYMBOL(rte_acl_create)
    int
    rte_acl_create(struct rte_acl_param *param)
    {
           ...
    }
 
-We then update the map file, adding the symbol ``rte_acl_create``
-to the ``DPDK_22`` version node.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-
-        rte_acl_create;
-   } DPDK_21;
-
-
 Although there are strictly no guarantees or commitments associated with
 :ref:`experimental symbols <experimental_apis>`, a maintainer may wish to offer
 an alias to experimental. The process to add an alias to experimental,
@@ -452,30 +355,6 @@ and ``DPDK_22`` version nodes.
       return rte_acl_create(param);
    }
 
-In the map file, we map the symbol to both the ``EXPERIMENTAL``
-and ``DPDK_22`` version nodes.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-
-        rte_acl_create;
-   } DPDK_21;
-
-   EXPERIMENTAL {
-        global:
-
-        rte_acl_create;
-   };
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
@@ -484,38 +363,7 @@ ________________________________
 Lets assume that you've done the above updates, and in preparation for the next
 major ABI version you decide you would like to retire the old version of the
 function. After having gone through the ABI deprecation announcement process,
-removal is easy. Start by removing the symbol from the requisite version map
-file:
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_dump;
- -      rte_acl_create
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-        rte_acl_create;
-   } DPDK_21;
-
+removal is easy.
 
 Next remove the corresponding versioned export.
 
@@ -539,36 +387,7 @@ of a major ABI version. If a version node completely specifies an API, then
 removing part of it, typically makes it incomplete. In those cases it is better
 to remove the entire node.
 
-To do this, start by modifying the version map file, such that all symbols from
-the node to be removed are merged into the next node in the map.
-
-In the case of our map above, it would transform to look as follows
-
-.. code-block:: none
-
-   DPDK_22 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_dump;
-        rte_acl_create
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
- };
-
-Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
+Any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
 updated to point to the new version node in any header files for all affected
 symbols.
 
diff --git a/drivers/meson.build b/drivers/meson.build
index c42c7764bf..b5434715f3 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -275,14 +275,14 @@ foreach subpath:subdirs
                 dependencies: static_deps,
                 c_args: cflags)
         objs += tmp_lib.extract_all_objects(recursive: true)
-        sources = custom_target(out_filename,
+        sources_pmd_info = custom_target(out_filename,
                 command: [pmdinfo, tmp_lib.full_path(), '@OUTPUT@', pmdinfogen],
                 output: out_filename,
                 depends: [tmp_lib])
 
         # now build the static driver
         static_lib = static_library(lib_name,
-                sources,
+                sources_pmd_info,
                 objects: objs,
                 include_directories: includes,
                 dependencies: static_deps,
@@ -292,48 +292,70 @@ foreach subpath:subdirs
         # now build the shared driver
         version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), drv_path)
 
-        lk_deps = []
-        lk_args = []
         if not fs.is_file(version_map)
-            version_map = '@0@/version.map'.format(meson.current_source_dir())
-            lk_deps += [version_map]
-        else
-            lk_deps += [version_map]
-            if not is_windows and developer_mode
-                # on unix systems check the output of the
-                # check-symbols.sh script, using it as a
-                # dependency of the .so build
-                lk_deps += custom_target(lib_name + '.sym_chk',
-                        command: [check_symbols, version_map, '@INPUT@'],
-                        capture: true,
-                        input: static_lib,
-                        output: lib_name + '.sym_chk')
-            endif
-        endif
-
-        if is_windows
             if is_ms_linker
-                def_file = custom_target(lib_name + '_def',
-                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                        input: version_map,
-                        output: '@0@_exports.def'.format(lib_name))
-                lk_deps += [def_file]
-
-                lk_args = ['-Wl,/def:' + def_file.full_path()]
+                link_mode = 'mslinker'
+            elif is_windows
+                link_mode = 'mingw'
             else
-                mingw_map = custom_target(lib_name + '_mingw',
-                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                        input: version_map,
-                        output: '@0@_mingw.map'.format(lib_name))
-                lk_deps += [mingw_map]
-
-                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+                link_mode = 'gnu'
+            endif
+            version_map = custom_target(lib_name + '_map',
+                    command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+                    input: sources + sources_avx2 + sources_avx512,
+                    output: '_'.join(class, name, 'exports.map'))
+            version_map_path = version_map.full_path()
+            version_map_dep = [version_map]
+            lk_deps = [version_map]
+
+            if is_ms_linker and is_ms_compiler
+                lk_args = ['/def:' + version_map.full_path()]
+            elif is_ms_linker
+                lk_args = ['-Wl,/def:' + version_map.full_path()]
+            else
+                lk_args = ['-Wl,--version-script=' + version_map.full_path()]
             endif
         else
-            lk_args = ['-Wl,--version-script=' + version_map]
+            version_map_path = version_map
+            version_map_dep = []
+            lk_deps = [version_map]
+
+            if is_windows
+                if is_ms_linker
+                    def_file = custom_target(lib_name + '_def',
+                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                            input: version_map,
+                            output: '@0@_exports.def'.format(lib_name))
+                    lk_deps += [def_file]
+
+                    lk_args = ['-Wl,/def:' + def_file.full_path()]
+                else
+                    mingw_map = custom_target(lib_name + '_mingw',
+                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                            input: version_map,
+                            output: '@0@_mingw.map'.format(lib_name))
+                    lk_deps += [mingw_map]
+
+                    lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+                endif
+            else
+                lk_args = ['-Wl,--version-script=' + version_map]
+            endif
+        endif
+
+        if not is_windows and developer_mode
+            # on unix systems check the output of the
+            # check-symbols.sh script, using it as a
+            # dependency of the .so build
+            lk_deps += custom_target(lib_name + '.sym_chk',
+                    command: [check_symbols, version_map_path, '@INPUT@'],
+                    capture: true,
+                    input: static_lib,
+                    output: lib_name + '.sym_chk',
+                    depends: version_map_dep)
         endif
 
-        shared_lib = shared_library(lib_name, sources,
+        shared_lib = shared_library(lib_name, sources_pmd_info,
                 objects: objs,
                 include_directories: includes,
                 dependencies: shared_deps,
diff --git a/drivers/version.map b/drivers/version.map
deleted file mode 100644
index 17cc97bda6..0000000000
--- a/drivers/version.map
+++ /dev/null
@@ -1,3 +0,0 @@
-DPDK_25 {
-	local: *;
-};
diff --git a/lib/meson.build b/lib/meson.build
index ce92cb5537..952ae4696f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2019 Intel Corporation
 
+fs = import('fs')
 
 # process all libraries equally, as far as possible
 # "core" libs first, then others alphabetically as far as possible
@@ -254,42 +255,58 @@ foreach l:libraries
             include_directories: includes,
             dependencies: static_deps)
 
-    if not use_function_versioning or is_windows
-        # use pre-build objects to build shared lib
-        sources = []
-        objs += static_lib.extract_all_objects(recursive: false)
-    else
-        # for compat we need to rebuild with
-        # RTE_BUILD_SHARED_LIB defined
-        cflags += '-DRTE_BUILD_SHARED_LIB'
-    endif
-
-    version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
-    lk_deps = [version_map]
-
-    if is_ms_linker
-        def_file = custom_target(libname + '_def',
-                command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                input: version_map,
-                output: '@0@_exports.def'.format(libname))
-        lk_deps += [def_file]
+    if not fs.is_file('@0@/@1@/version.map'.format(meson.current_source_dir(), l))
+        if is_ms_linker
+            link_mode = 'mslinker'
+        elif is_windows
+            link_mode = 'mingw'
+        else
+            link_mode = 'gnu'
+        endif
+        version_map = custom_target(libname + '_map',
+                command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+                input: sources,
+                output: '_'.join(name, 'exports.map'))
+        version_map_path = version_map.full_path()
+        version_map_dep = [version_map]
+        lk_deps = [version_map]
 
-        if is_ms_compiler
-            lk_args = ['/def:' + def_file.full_path()]
+        if is_ms_linker and is_ms_compiler
+            lk_args = ['/def:' + version_map.full_path()]
+        elif is_ms_linker
+            lk_args = ['-Wl,/def:' + version_map.full_path()]
         else
-            lk_args = ['-Wl,/def:' + def_file.full_path()]
+            lk_args = ['-Wl,--version-script=' + version_map.full_path()]
         endif
     else
-        if is_windows
-            mingw_map = custom_target(libname + '_mingw',
+        version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
+        version_map_path = version_map
+        version_map_dep = []
+        lk_deps = [version_map]
+        if is_ms_linker
+            def_file = custom_target(libname + '_def',
                     command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
                     input: version_map,
-                    output: '@0@_mingw.map'.format(libname))
-            lk_deps += [mingw_map]
+                    output: '@0@_exports.def'.format(libname))
+            lk_deps += [def_file]
 
-            lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+            if is_ms_compiler
+                lk_args = ['/def:' + def_file.full_path()]
+            else
+                lk_args = ['-Wl,/def:' + def_file.full_path()]
+            endif
         else
-            lk_args = ['-Wl,--version-script=' + version_map]
+            if is_windows
+                mingw_map = custom_target(libname + '_mingw',
+                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                        input: version_map,
+                        output: '@0@_mingw.map'.format(libname))
+                lk_deps += [mingw_map]
+
+                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+            else
+                lk_args = ['-Wl,--version-script=' + version_map]
+            endif
         endif
     endif
 
@@ -298,11 +315,21 @@ foreach l:libraries
         # check-symbols.sh script, using it as a
         # dependency of the .so build
         lk_deps += custom_target(name + '.sym_chk',
-                command: [check_symbols,
-                    version_map, '@INPUT@'],
+                command: [check_symbols, version_map_path, '@INPUT@'],
                 capture: true,
                 input: static_lib,
-                output: name + '.sym_chk')
+                output: name + '.sym_chk',
+                depends: version_map_dep)
+    endif
+
+    if not use_function_versioning or is_windows
+        # use pre-build objects to build shared lib
+        sources = []
+        objs += static_lib.extract_all_objects(recursive: false)
+    else
+        # for compat we need to rebuild with
+        # RTE_BUILD_SHARED_LIB defined
+        cflags += '-DRTE_BUILD_SHARED_LIB'
     endif
 
     shared_lib = shared_library(libname,
-- 
2.48.1



* [RFC v4 3/8] eal: rework function versioning macros
  2025-03-17 15:42  3% ` [RFC v4 " David Marchand
@ 2025-03-17 15:42 16%   ` David Marchand
  2025-03-17 15:43 18%   ` [RFC v4 5/8] build: generate symbol maps David Marchand
  1 sibling, 0 replies; 200+ results
From: David Marchand @ 2025-03-17 15:42 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue, Tyler Retzlaff, Jasvinder Singh

For versioning symbols:
- MSVC uses pragmas on the symbol,
- GNU linker uses special asm directives,

To accommodate both GNU linker and MSVC linker, introduce new macros for
exporting and versioning symbols that will surround the whole function.

This has the advantage of hiding all the ugly details in the macros.
Now versioning a symbol is just a call to a single macro:
- RTE_VERSION_SYMBOL (resp. RTE_VERSION_EXPERIMENTAL_SYMBOL), for
  keeping an old implementation code under a versioned function (resp.
  experimental function),
- RTE_DEFAULT_SYMBOL, for declaring the new default versioned function,
  and handling the static link special case, instead of
  BIND_DEFAULT_SYMBOL + MAP_STATIC_SYMBOL,

Update lib/net accordingly.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since RFC v3:
- fixed documentation and simplified examples,

Changes since RFC v1:
- renamed and prefixed macros,
- reindented in prevision of second patch,

---
 doc/guides/contributing/abi_versioning.rst | 189 +++++----------------
 lib/eal/include/rte_function_versioning.h  |  96 ++++-------
 lib/net/net_crc.h                          |  15 --
 lib/net/rte_net_crc.c                      |  28 +--
 4 files changed, 88 insertions(+), 240 deletions(-)

diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 7afd1c1886..21f8f8cd14 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -138,27 +138,20 @@ macros are used in conjunction with the ``version.map`` file for
 a given library to allow multiple versions of a symbol to exist in a shared
 library so that older binaries need not be immediately recompiled.
 
-The macros exported are:
+The macros are:
 
-* ``VERSION_SYMBOL(b, e, n)``: Creates a symbol version table entry binding
-  versioned symbol ``b@DPDK_n`` to the internal function ``be``.
+* ``RTE_VERSION_SYMBOL(ver, type, name, args)``: Creates a symbol version table
+  entry binding symbol ``<name>@DPDK_<ver>`` to the internal function name
+  ``<name>_v<ver>``.
 
-* ``BIND_DEFAULT_SYMBOL(b, e, n)``: Creates a symbol version entry instructing
-  the linker to bind references to symbol ``b`` to the internal symbol
-  ``be``.
+* ``RTE_DEFAULT_SYMBOL(ver, type, name, args)``: Creates a symbol version entry
+  instructing the linker to bind references to symbol ``<name>`` to the internal
+  symbol ``<name>_v<ver>``.
 
-* ``MAP_STATIC_SYMBOL(f, p)``: Declare the prototype ``f``, and map it to the
-  fully qualified function ``p``, so that if a symbol becomes versioned, it
-  can still be mapped back to the public symbol name.
-
-* ``__vsym``:  Annotation to be used in a declaration of the internal symbol
-  ``be`` to signal that it is being used as an implementation of a particular
-  version of symbol ``b``.
-
-* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
-  binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
-  The macro is used when a symbol matures to become part of the stable ABI, to
-  provide an alias to experimental until the next major ABI version.
+* ``RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args)``: Similar to
+  ``RTE_VERSION_SYMBOL`` but for experimental API symbols. The macro is used
+  when a symbol matures to become part of the stable ABI, to provide an alias
+  to experimental until the next major ABI version.
 
 .. _example_abi_macro_usage:
 
@@ -176,8 +169,8 @@ Assume we have a function as follows
   * Create an acl context object for apps to
   * manipulate
   */
- struct rte_acl_ctx *
- rte_acl_create(const struct rte_acl_param *param)
+ int
+ rte_acl_create(struct rte_acl_param *param)
  {
         ...
  }
@@ -194,8 +187,8 @@ private, is safe), but it also requires modifying the code as follows
   * Create an acl context object for apps to
   * manipulate
   */
- struct rte_acl_ctx *
- rte_acl_create(const struct rte_acl_param *param, int debug)
+ int
+ rte_acl_create(struct rte_acl_param *param, int debug)
  {
         ...
  }
@@ -277,86 +270,49 @@ list of exported symbols when DPDK is compiled as a shared library.
 
 Next, we need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
-we need to update the function so that it is uniquely named, and not in conflict
-with the public symbol name
+we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
+the function return type, the function name and its arguments.
 
 .. code-block:: c
 
- -struct rte_acl_ctx *
- -rte_acl_create(const struct rte_acl_param *param)
- +struct rte_acl_ctx * __vsym
- +rte_acl_create_v21(const struct rte_acl_param *param)
+ -int
+ -rte_acl_create(struct rte_acl_param *param)
+ +RTE_VERSION_SYMBOL(21, int, rte_acl_create, (struct rte_acl_param *param))
  {
         size_t sz;
         struct rte_acl_ctx *ctx;
         ...
-
-Note that the base name of the symbol was kept intact, as this is conducive to
-the macros used for versioning symbols and we have annotated the function as
-``__vsym``, an implementation of a versioned symbol . That is our next step,
-mapping this new symbol name to the initial symbol name at version node 21.
-Immediately after the function, we add the VERSION_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ }
 
 Remembering to also add the rte_function_versioning.h header to the requisite c
 file where these changes are being made. The macro instructs the linker to
 create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
-in older builds, but now points to the above newly named function. We have now
-mapped the original rte_acl_create symbol to the original function (but with a
-new name).
+in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
+We have now mapped the original rte_acl_create symbol to the original function
+(but with a new name).
 
 Please see the section :ref:`Enabling versioning macros
 <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
-Next, we need to create the new ``v22`` version of the symbol. We create a new
-function name, with the ``v22`` suffix, and implement it appropriately.
+
+Next, we need to create the new version of the symbol. We create a new
+function name and implement it appropriately, then wrap it with ``RTE_DEFAULT_SYMBOL``.
 
 .. code-block:: c
 
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
+   RTE_DEFAULT_SYMBOL(22, int, rte_acl_create, (struct rte_acl_param *param, int debug))
    {
-        struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
+        int ret = rte_acl_create_v21(param);
 
-        ctx->debug = debug;
+        if (debug) {
+                ...
+        }
 
-        return ctx;
+        return ret;
    }
 
-This code serves as our new API call. Its the same as our old call, but adds the
-new parameter in place. Next we need to map this function to the new default
-symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
-we add the BIND_DEFAULT_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
-
 The macro instructs the linker to create the new default symbol
-``rte_acl_create@DPDK_22``, which points to the above newly named function.
-
-We finally modify the prototype of the call in the public header file,
-such that it contains both versions of the symbol and the public API.
-
-.. code-block:: c
-
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v21(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
-
+``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
+(declared by the macro).
 
 And that's it, on the next shared library rebuild, there will be two versions of
 rte_acl_create, an old DPDK_21 version, used by previously built applications,
@@ -365,43 +321,10 @@ and a new DPDK_22 version, used by future built applications.
 .. note::
 
    **Before you leave**, please take care reviewing the sections on
-   :ref:`mapping static symbols <mapping_static_symbols>`,
    :ref:`enabling versioning macros <enabling_versioning_macros>`,
    and :ref:`ABI deprecation <abi_deprecation>`.
 
 
-.. _mapping_static_symbols:
-
-Mapping static symbols
-______________________
-
-Now we've taken what was a public symbol, and duplicated it into two uniquely
-and differently named symbols. We've then mapped each of those back to the
-public symbol ``rte_acl_create`` with different version tags. This only applies
-to dynamic linking, as static linking has no notion of versioning. That leaves
-this code in a position of no longer having a symbol simply named
-``rte_acl_create`` and a static build will fail on that missing symbol.
-
-To correct this, we can simply map a function of our choosing back to the public
-symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
-assumption is that the most recent version of the symbol is the one you want to
-map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
-defined, we add this
-
-
-.. code-block:: c
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
-   {
-        ...
-   }
-   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
-
-That tells the compiler that, when building a static library, any calls to the
-symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
-
-
 .. _enabling_versioning_macros:
 
 Enabling versioning macros
@@ -444,8 +367,8 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
     * manipulate
     */
    __rte_experimental
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   int
+   rte_acl_create(struct rte_acl_param *param)
    {
    ...
    }
@@ -478,8 +401,8 @@ When we promote the symbol to the stable ABI, we simply strip the
     * Create an acl context object for apps to
     * manipulate
     */
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   int
+   rte_acl_create(struct rte_acl_param *param)
    {
           ...
    }
@@ -519,26 +442,15 @@ and ``DPDK_22`` version nodes.
     * Create an acl context object for apps to
     * manipulate
     */
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   RTE_DEFAULT_SYMBOL(22, int, rte_acl_create, (struct rte_acl_param *param))
    {
    ...
    }
 
-   __rte_experimental
-   struct rte_acl_ctx *
-   rte_acl_create_e(const struct rte_acl_param *param)
-   {
-      return rte_acl_create(param);
-   }
-   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
-
-   struct rte_acl_ctx *
-   rte_acl_create_v22(const struct rte_acl_param *param)
+   RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_acl_create, (struct rte_acl_param *param))
    {
       return rte_acl_create(param);
    }
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
 
 In the map file, we map the symbol to both the ``EXPERIMENTAL``
 and ``DPDK_22`` version nodes.
@@ -564,13 +476,6 @@ and ``DPDK_22`` version nodes.
         rte_acl_create;
    };
 
-.. note::
-
-   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
-   when aliasing to experimental you will also need to take care of
-   :ref:`mapping static symbols <mapping_static_symbols>`.
-
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
@@ -616,10 +521,10 @@ Next remove the corresponding versioned export.
 
 .. code-block:: c
 
- -VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ -RTE_VERSION_SYMBOL(21, int, rte_acl_create, (struct rte_acl_param *param))
 
 
-Note that the internal function definition could also be removed, but its used
+Note that the internal function definition could also be removed, but it is used
 in our example by the newer version ``v22``, so we leave it in place and declare
 it as static. This is a coding style choice.
 
@@ -663,16 +568,16 @@ In the case of our map above, it would transform to look as follows
         local: *;
  };
 
-Then any uses of BIND_DEFAULT_SYMBOL that pointed to the old node should be
+Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
 updated to point to the new version node in any header files for all affected
 symbols.
 
 .. code-block:: c
 
- -BIND_DEFAULT_SYMBOL(rte_acl_create, _v21, 21);
- +BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+ -RTE_DEFAULT_SYMBOL(21, int, rte_acl_create, (struct rte_acl_param *param, int debug))
+ +RTE_DEFAULT_SYMBOL(22, int, rte_acl_create, (struct rte_acl_param *param, int debug))
 
-Lastly, any VERSION_SYMBOL macros that point to the old version nodes
+Lastly, any RTE_VERSION_SYMBOL macros that point to the old version nodes
 should be removed, taking care to preserve any code that is shared
 with the new version node.
 
diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
index eb6dd2bc17..0020ce4885 100644
--- a/lib/eal/include/rte_function_versioning.h
+++ b/lib/eal/include/rte_function_versioning.h
@@ -11,8 +11,6 @@
 #error Use of function versioning disabled, is "use_function_versioning=true" in meson.build?
 #endif
 
-#ifdef RTE_BUILD_SHARED_LIB
-
 /*
  * Provides backwards compatibility when updating exported functions.
  * When a symbol is exported from a library to provide an API, it also provides a
@@ -20,80 +18,54 @@
  * arguments, etc.  On occasion that function may need to change to accommodate
  * new functionality, behavior, etc.  When that occurs, it is desirable to
  * allow for backwards compatibility for a time with older binaries that are
- * dynamically linked to the dpdk.  To support that, the __vsym and
- * VERSION_SYMBOL macros are created.  They, in conjunction with the
- * version.map file for a given library allow for multiple versions of
- * a symbol to exist in a shared library so that older binaries need not be
- * immediately recompiled.
- *
- * Refer to the guidelines document in the docs subdirectory for details on the
- * use of these macros
+ * dynamically linked to the dpdk.
  */
 
-/*
- * Macro Parameters:
- * b - function base name
- * e - function version extension, to be concatenated with base name
- * n - function symbol version string to be applied
- * f - function prototype
- * p - full function symbol name
- */
+#ifdef RTE_BUILD_SHARED_LIB
 
 /*
- * VERSION_SYMBOL
- * Creates a symbol version table entry binding symbol <b>@DPDK_<n> to the internal
- * function name <b><e>
+ * RTE_VERSION_SYMBOL
+ * Creates a symbol version table entry binding symbol <name>@DPDK_<ver> to the internal
+ * function name <name>_v<ver>.
  */
-#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
+#define RTE_VERSION_SYMBOL(ver, type, name, args) \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver)); \
+__rte_used type name ## _v ## ver args; \
+type name ## _v ## ver args
 
 /*
- * VERSION_SYMBOL_EXPERIMENTAL
- * Creates a symbol version table entry binding the symbol <b>@EXPERIMENTAL to the internal
- * function name <b><e>. The macro is used when a symbol matures to become part of the stable ABI,
- * to provide an alias to experimental for some time.
+ * RTE_VERSION_EXPERIMENTAL_SYMBOL
+ * Similar to RTE_VERSION_SYMBOL but for experimental API symbols.
+ * This is mainly used for keeping compatibility for symbols that get promoted to the stable ABI.
  */
-#define VERSION_SYMBOL_EXPERIMENTAL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@EXPERIMENTAL")
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
+__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL"); \
+__rte_used type name ## _exp args; \
+type name ## _exp args
 
 /*
- * BIND_DEFAULT_SYMBOL
+ * RTE_DEFAULT_SYMBOL
  * Creates a symbol version entry instructing the linker to bind references to
- * symbol <b> to the internal symbol <b><e>
+ * symbol <name> to the internal symbol <name>_v<ver>.
  */
-#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
+#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver)); \
+__rte_used type name ## _v ## ver args; \
+type name ## _v ## ver args
 
-/*
- * __vsym
- * Annotation to be used in declaration of the internal symbol <b><e> to signal
- * that it is being used as an implementation of a particular version of symbol
- * <b>.
- */
-#define __vsym __rte_used
+#else /* !RTE_BUILD_SHARED_LIB */
 
-/*
- * MAP_STATIC_SYMBOL
- * If a function has been bifurcated into multiple versions, none of which
- * are defined as the exported symbol name in the map file, this macro can be
- * used to alias a specific version of the symbol to its exported name.  For
- * example, if you have 2 versions of a function foo_v1 and foo_v2, where the
- * former is mapped to foo@DPDK_1 and the latter is mapped to foo@DPDK_2 when
- * building a shared library, this macro can be used to map either foo_v1 or
- * foo_v2 to the symbol foo when building a static library, e.g.:
- * MAP_STATIC_SYMBOL(void foo(), foo_v2);
- */
-#define MAP_STATIC_SYMBOL(f, p)
+#define RTE_VERSION_SYMBOL(ver, type, name, args) \
+type name ## _v ## ver args; \
+type name ## _v ## ver args
 
-#else
-/*
- * No symbol versioning in use
- */
-#define VERSION_SYMBOL(b, e, n)
-#define VERSION_SYMBOL_EXPERIMENTAL(b, e)
-#define __vsym
-#define BIND_DEFAULT_SYMBOL(b, e, n)
-#define MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(RTE_STR(p))))
-/*
- * RTE_BUILD_SHARED_LIB=n
- */
-#endif
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
+type name ## _exp args; \
+type name ## _exp args
+
+#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
+type name args
+
+#endif /* RTE_BUILD_SHARED_LIB */
 
 #endif /* _RTE_FUNCTION_VERSIONING_H_ */
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 4930e2f0b3..320b0edca8 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -7,21 +7,6 @@
 
 #include "rte_net_crc.h"
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
-
-struct rte_net_crc *
-rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
-	enum rte_net_crc_type type);
-
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len, enum rte_net_crc_type type);
-
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len);
-
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 2fb3eec231..1943d46295 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -345,8 +345,7 @@ handlers_init(enum rte_net_crc_alg alg)
 
 /* Public API */
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
+RTE_VERSION_SYMBOL(25, void, rte_net_crc_set_alg, (enum rte_net_crc_alg alg))
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -373,10 +372,9 @@ rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
-VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
 
-struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
-	enum rte_net_crc_type type)
+RTE_DEFAULT_SYMBOL(26, struct rte_net_crc *, rte_net_crc_set_alg, (enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type))
 {
 	uint16_t max_simd_bitwidth;
 	struct rte_net_crc *crc;
@@ -414,20 +412,14 @@ struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
 	}
 	return crc;
 }
-BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
-MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
-	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
-	rte_net_crc_set_alg_v26);
 
 void rte_net_crc_free(struct rte_net_crc *crc)
 {
 	rte_free(crc);
 }
 
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type)
+RTE_VERSION_SYMBOL(25, uint32_t, rte_net_crc_calc, (const void *data, uint32_t data_len,
+	enum rte_net_crc_type type))
 {
 	uint32_t ret;
 	rte_net_crc_handler f_handle;
@@ -437,18 +429,12 @@ rte_net_crc_calc_v25(const void *data,
 
 	return ret;
 }
-VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
 
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len)
+RTE_DEFAULT_SYMBOL(26, uint32_t, rte_net_crc_calc, (const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len))
 {
 	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
 }
-BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
-MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len),
-	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
-- 
2.48.1



* [RFC v4 0/8] Symbol versioning and export rework
  2025-03-05 21:23  6% [RFC] eal: add new function versioning macros David Marchand
  2025-03-06 12:50  6% ` [RFC v2 1/2] " David Marchand
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
@ 2025-03-17 15:42  3% ` David Marchand
  2025-03-17 15:42 16%   ` [RFC v4 3/8] eal: rework function versioning macros David Marchand
  2025-03-17 15:43 18%   ` [RFC v4 5/8] build: generate symbol maps David Marchand
  2 siblings, 2 replies; 200+ results
From: David Marchand @ 2025-03-17 15:42 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue

So far, each DPDK library (or driver) exposing symbols in an ABI had to
maintain a version.map and use some macros for symbol versioning,
specially crafted with the GNU linker in mind.

This series proposes to rework the whole principle, and instead rely on
marking the symbol exports in the source code itself, then leave it to the
build framework to produce a version script adapted to the linker in use
(think GNU linker vs MSVC linker).

This greatly simplifies versioning symbols: a developer does not need to
know anything about version.map, or that a versioned symbol must be
renamed with _v26, annotated with __vsym, exported in a header, etc.

Checking symbol maps becomes unnecessary since they are generated by the
build framework.

Updating to a new ABI is just a matter of bumping the value in
ABI_VERSION.


Comments please.


-- 
David Marchand

Depends-on: series-34869 ("remove driver-specific logic for AVX builds")

Changes since RFC v3:
- fixed/simplified documentation,
- rebased on top of Bruce series for common handling of AVX sources,

Changes since RFC v2:
- updated RTE_VERSION_SYMBOL() (and friends) so that only the function
  signature is enclosed in the macro,
- dropped invalid exports for some dead symbols or inline helpers,
- updated documentation and tooling,
- converted the whole tree (via a local script of mine),

David Marchand (8):
  lib: remove incorrect exported symbols
  drivers: remove incorrect exported symbols
  eal: rework function versioning macros
  buildtools: display version when listing symbols
  build: generate symbol maps
  build: mark exported symbols
  build: use dynamically generated version maps
  build: remove static version maps

 .github/workflows/build.yml                   |   1 -
 MAINTAINERS                                   |   9 +-
 buildtools/check-symbols.sh                   |  33 +-
 buildtools/gen-version-map.py                 | 106 ++++
 buildtools/map-list-symbol.sh                 |  15 +-
 buildtools/map_to_win.py                      |  41 --
 buildtools/meson.build                        |   2 +-
 config/meson.build                            |   4 +-
 config/rte_export.h                           |  16 +
 devtools/check-spdx-tag.sh                    |   2 +-
 devtools/check-symbol-change.py               |  90 +++
 devtools/check-symbol-change.sh               | 186 ------
 devtools/check-symbol-maps.sh                 | 115 ----
 devtools/checkpatches.sh                      |   4 +-
 devtools/update-abi.sh                        |  46 --
 devtools/update_version_map_abi.py            | 210 -------
 doc/guides/contributing/abi_policy.rst        |  21 +-
 doc/guides/contributing/abi_versioning.rst    | 408 ++-----------
 doc/guides/contributing/coding_style.rst      |   7 -
 .../contributing/img/patch_cheatsheet.svg     | 303 +++++----
 doc/guides/contributing/patches.rst           |   6 +-
 drivers/baseband/acc/rte_acc100_pmd.c         |   1 +
 drivers/baseband/acc/version.map              |  10 -
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c         |   1 +
 drivers/baseband/fpga_5gnr_fec/version.map    |  11 -
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |   1 +
 drivers/baseband/fpga_lte_fec/version.map     |  10 -
 drivers/bus/auxiliary/auxiliary_common.c      |   2 +
 drivers/bus/auxiliary/version.map             |   8 -
 drivers/bus/cdx/cdx.c                         |   4 +
 drivers/bus/cdx/cdx_vfio.c                    |   4 +
 drivers/bus/cdx/version.map                   |  14 -
 drivers/bus/dpaa/dpaa_bus.c                   | 104 ++++
 drivers/bus/dpaa/version.map                  | 109 ----
 drivers/bus/fslmc/fslmc_bus.c                 |   4 +
 drivers/bus/fslmc/fslmc_vfio.c                |  12 +
 drivers/bus/fslmc/mc/dpbp.c                   |   6 +
 drivers/bus/fslmc/mc/dpci.c                   |   3 +
 drivers/bus/fslmc/mc/dpcon.c                  |   6 +
 drivers/bus/fslmc/mc/dpdmai.c                 |   8 +
 drivers/bus/fslmc/mc/dpio.c                   |  13 +
 drivers/bus/fslmc/mc/dpmng.c                  |   2 +
 drivers/bus/fslmc/mc/mc_sys.c                 |   1 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c      |   3 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpci.c      |   2 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  11 +
 drivers/bus/fslmc/qbman/qbman_debug.c         |   2 +
 drivers/bus/fslmc/qbman/qbman_portal.c        |  41 ++
 drivers/bus/fslmc/version.map                 | 129 ----
 drivers/bus/ifpga/ifpga_bus.c                 |   3 +
 drivers/bus/ifpga/version.map                 |   9 -
 drivers/bus/pci/bsd/pci.c                     |  10 +
 drivers/bus/pci/linux/pci.c                   |  10 +
 drivers/bus/pci/pci_common.c                  |  10 +
 drivers/bus/pci/version.map                   |  43 --
 drivers/bus/pci/windows/pci.c                 |  10 +
 drivers/bus/platform/platform.c               |   2 +
 drivers/bus/platform/version.map              |  10 -
 drivers/bus/uacce/uacce.c                     |   9 +
 drivers/bus/uacce/version.map                 |  15 -
 drivers/bus/vdev/vdev.c                       |   6 +
 drivers/bus/vdev/version.map                  |  17 -
 drivers/bus/vmbus/linux/vmbus_bus.c           |   6 +
 drivers/bus/vmbus/version.map                 |  33 -
 drivers/bus/vmbus/vmbus_channel.c             |  13 +
 drivers/bus/vmbus/vmbus_common.c              |   3 +
 drivers/common/cnxk/cnxk_security.c           |  12 +
 drivers/common/cnxk/cnxk_utils.c              |   1 +
 drivers/common/cnxk/roc_platform.c            | 559 +++++++++++++++++
 drivers/common/cnxk/roc_se.h                  |   1 -
 drivers/common/cnxk/version.map               | 578 ------------------
 drivers/common/cpt/cpt_fpm_tables.c           |   2 +
 drivers/common/cpt/cpt_pmd_ops_helper.c       |   3 +
 drivers/common/cpt/version.map                |  11 -
 drivers/common/dpaax/caamflib.c               |   1 +
 drivers/common/dpaax/dpaa_of.c                |  12 +
 drivers/common/dpaax/dpaax_iova_table.c       |   6 +
 drivers/common/dpaax/version.map              |  25 -
 drivers/common/ionic/ionic_common_uio.c       |   4 +
 drivers/common/ionic/version.map              |  10 -
 .../common/mlx5/linux/mlx5_common_auxiliary.c |   1 +
 drivers/common/mlx5/linux/mlx5_common_os.c    |   9 +
 drivers/common/mlx5/linux/mlx5_common_verbs.c |   3 +
 drivers/common/mlx5/linux/mlx5_glue.c         |   1 +
 drivers/common/mlx5/linux/mlx5_nl.c           |  21 +
 drivers/common/mlx5/mlx5_common.c             |   9 +
 drivers/common/mlx5/mlx5_common_devx.c        |   9 +
 drivers/common/mlx5/mlx5_common_mp.c          |   8 +
 drivers/common/mlx5/mlx5_common_mr.c          |  11 +
 drivers/common/mlx5/mlx5_common_pci.c         |   2 +
 drivers/common/mlx5/mlx5_common_utils.c       |  11 +
 drivers/common/mlx5/mlx5_devx_cmds.c          |  51 ++
 drivers/common/mlx5/mlx5_malloc.c             |   4 +
 drivers/common/mlx5/version.map               | 174 ------
 drivers/common/mlx5/windows/mlx5_common_os.c  |   5 +
 drivers/common/mlx5/windows/mlx5_glue.c       |   3 +-
 drivers/common/mvep/mvep_common.c             |   2 +
 drivers/common/mvep/version.map               |   8 -
 drivers/common/nfp/nfp_common.c               |   7 +
 drivers/common/nfp/nfp_common_pci.c           |   1 +
 drivers/common/nfp/nfp_dev.c                  |   1 +
 drivers/common/nfp/version.map                |  16 -
 drivers/common/nitrox/nitrox_device.c         |   1 +
 drivers/common/nitrox/nitrox_logs.c           |   1 +
 drivers/common/nitrox/nitrox_qp.c             |   2 +
 drivers/common/nitrox/version.map             |  10 -
 drivers/common/octeontx/octeontx_mbox.c       |   6 +
 drivers/common/octeontx/version.map           |  12 -
 drivers/common/sfc_efx/sfc_efx.c              | 273 +++++++++
 drivers/common/sfc_efx/sfc_efx_mcdi.c         |   2 +
 drivers/common/sfc_efx/version.map            | 302 ---------
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c     |   7 +
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c      |   2 +
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c      |   7 +
 drivers/crypto/cnxk/version.map               |  30 -
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |   2 +
 drivers/crypto/dpaa2_sec/version.map          |   8 -
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   2 +
 drivers/crypto/dpaa_sec/version.map           |   8 -
 drivers/crypto/octeontx/otx_cryptodev_ops.c   |   2 +
 drivers/crypto/octeontx/version.map           |  12 -
 .../scheduler/rte_cryptodev_scheduler.c       |  10 +
 drivers/crypto/scheduler/version.map          |  16 -
 drivers/dma/cnxk/cnxk_dmadev_fp.c             |   4 +
 drivers/dma/cnxk/version.map                  |  10 -
 drivers/event/cnxk/cnxk_worker.c              |   2 +
 drivers/event/cnxk/version.map                |  11 -
 drivers/event/dlb2/rte_pmd_dlb2.c             |   1 +
 drivers/event/dlb2/version.map                |  10 -
 drivers/mempool/cnxk/cn10k_hwpool_ops.c       |   3 +
 drivers/mempool/cnxk/version.map              |  12 -
 drivers/mempool/dpaa/dpaa_mempool.c           |   2 +
 drivers/mempool/dpaa/version.map              |   8 -
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |   5 +
 drivers/mempool/dpaa2/version.map             |  16 -
 drivers/meson.build                           |  74 +--
 drivers/net/atlantic/rte_pmd_atlantic.c       |   6 +
 drivers/net/atlantic/version.map              |  15 -
 drivers/net/bnxt/rte_pmd_bnxt.c               |  16 +
 drivers/net/bnxt/version.map                  |  22 -
 drivers/net/bonding/rte_eth_bond_8023ad.c     |  12 +
 drivers/net/bonding/rte_eth_bond_api.c        |  15 +
 drivers/net/bonding/version.map               |  33 -
 drivers/net/cnxk/cnxk_ethdev.c                |   3 +
 drivers/net/cnxk/cnxk_ethdev_sec.c            |   9 +
 drivers/net/cnxk/version.map                  |  27 -
 drivers/net/dpaa/dpaa_ethdev.c                |   3 +
 drivers/net/dpaa/version.map                  |  14 -
 drivers/net/dpaa2/dpaa2_ethdev.c              |  11 +
 drivers/net/dpaa2/dpaa2_mux.c                 |   3 +
 drivers/net/dpaa2/dpaa2_rxtx.c                |   1 +
 drivers/net/dpaa2/version.map                 |  35 --
 drivers/net/intel/i40e/rte_pmd_i40e.c         |  39 ++
 drivers/net/intel/i40e/version.map            |  55 --
 drivers/net/intel/iavf/iavf_ethdev.c          |   9 +
 drivers/net/intel/iavf/iavf_rxtx.c            |   8 +
 drivers/net/intel/iavf/version.map            |  33 -
 drivers/net/intel/ice/ice_diagnose.c          |   3 +
 drivers/net/intel/ice/version.map             |  16 -
 drivers/net/intel/idpf/idpf_common_device.c   |  10 +
 drivers/net/intel/idpf/idpf_common_rxtx.c     |  24 +
 .../net/intel/idpf/idpf_common_rxtx_avx2.c    |   2 +
 .../net/intel/idpf/idpf_common_rxtx_avx512.c  |   5 +
 drivers/net/intel/idpf/idpf_common_virtchnl.c |  29 +
 drivers/net/intel/idpf/version.map            |  80 ---
 drivers/net/intel/ipn3ke/ipn3ke_ethdev.c      |   1 +
 drivers/net/intel/ipn3ke/version.map          |   9 -
 drivers/net/intel/ixgbe/rte_pmd_ixgbe.c       |  37 ++
 drivers/net/intel/ixgbe/version.map           |  49 --
 drivers/net/mlx5/mlx5.c                       |   1 +
 drivers/net/mlx5/mlx5_flow.c                  |   4 +
 drivers/net/mlx5/mlx5_rx.c                    |   2 +
 drivers/net/mlx5/mlx5_rxq.c                   |   2 +
 drivers/net/mlx5/mlx5_tx.c                    |   1 +
 drivers/net/mlx5/mlx5_txq.c                   |   3 +
 drivers/net/mlx5/version.map                  |  28 -
 drivers/net/octeontx/octeontx_ethdev.c        |   1 +
 drivers/net/octeontx/version.map              |   7 -
 drivers/net/ring/rte_eth_ring.c               |   2 +
 drivers/net/ring/version.map                  |   8 -
 drivers/net/softnic/rte_eth_softnic.c         |   1 +
 drivers/net/softnic/rte_eth_softnic_thread.c  |   1 +
 drivers/net/softnic/version.map               |   8 -
 drivers/net/vhost/rte_eth_vhost.c             |   2 +
 drivers/net/vhost/version.map                 |   8 -
 drivers/power/kvm_vm/guest_channel.c          |   2 +
 drivers/power/kvm_vm/version.map              |   8 -
 drivers/raw/cnxk_rvu_lf/cnxk_rvu_lf.c         |  10 +
 drivers/raw/cnxk_rvu_lf/version.map           |  16 -
 drivers/raw/ifpga/rte_pmd_ifpga.c             |  11 +
 drivers/raw/ifpga/version.map                 |  17 -
 drivers/version.map                           |   3 -
 lib/acl/acl_bld.c                             |   1 +
 lib/acl/acl_run_scalar.c                      |   1 +
 lib/acl/rte_acl.c                             |  11 +
 lib/acl/version.map                           |  19 -
 lib/argparse/rte_argparse.c                   |   2 +
 lib/argparse/version.map                      |   9 -
 lib/bbdev/bbdev_trace_points.c                |   2 +
 lib/bbdev/rte_bbdev.c                         |  31 +
 lib/bbdev/version.map                         |  47 --
 lib/bitratestats/rte_bitrate.c                |   4 +
 lib/bitratestats/version.map                  |  10 -
 lib/bpf/bpf.c                                 |   2 +
 lib/bpf/bpf_convert.c                         |   1 +
 lib/bpf/bpf_dump.c                            |   1 +
 lib/bpf/bpf_exec.c                            |   2 +
 lib/bpf/bpf_load.c                            |   1 +
 lib/bpf/bpf_load_elf.c                        |   1 +
 lib/bpf/bpf_pkt.c                             |   4 +
 lib/bpf/bpf_stub.c                            |   2 +
 lib/bpf/version.map                           |  18 -
 lib/cfgfile/rte_cfgfile.c                     |  17 +
 lib/cfgfile/version.map                       |  23 -
 lib/cmdline/cmdline.c                         |   9 +
 lib/cmdline/cmdline_cirbuf.c                  |  19 +
 lib/cmdline/cmdline_parse.c                   |   4 +
 lib/cmdline/cmdline_parse_bool.c              |   1 +
 lib/cmdline/cmdline_parse_etheraddr.c         |   3 +
 lib/cmdline/cmdline_parse_ipaddr.c            |   3 +
 lib/cmdline/cmdline_parse_num.c               |   3 +
 lib/cmdline/cmdline_parse_portlist.c          |   3 +
 lib/cmdline/cmdline_parse_string.c            |   5 +
 lib/cmdline/cmdline_rdline.c                  |  15 +
 lib/cmdline/cmdline_socket.c                  |   3 +
 lib/cmdline/cmdline_vt100.c                   |   2 +
 lib/cmdline/version.map                       |  82 ---
 lib/compressdev/rte_comp.c                    |   6 +
 lib/compressdev/rte_compressdev.c             |  25 +
 lib/compressdev/rte_compressdev_pmd.c         |   3 +
 lib/compressdev/version.map                   |  40 --
 lib/cryptodev/cryptodev_pmd.c                 |   7 +
 lib/cryptodev/cryptodev_trace_points.c        |   3 +
 lib/cryptodev/rte_cryptodev.c                 |  83 +++
 lib/cryptodev/version.map                     | 114 ----
 lib/dispatcher/rte_dispatcher.c               |  13 +
 lib/dispatcher/version.map                    |  20 -
 lib/distributor/rte_distributor.c             |   9 +
 lib/distributor/version.map                   |  15 -
 lib/dmadev/rte_dmadev.c                       |  19 +
 lib/dmadev/rte_dmadev_trace_points.c          |   7 +
 lib/dmadev/version.map                        |  47 --
 lib/eal/arm/rte_cpuflags.c                    |   3 +
 lib/eal/arm/rte_hypervisor.c                  |   1 +
 lib/eal/arm/rte_power_intrinsics.c            |   4 +
 lib/eal/common/eal_common_bus.c               |  10 +
 lib/eal/common/eal_common_class.c             |   4 +
 lib/eal/common/eal_common_config.c            |   7 +
 lib/eal/common/eal_common_cpuflags.c          |   1 +
 lib/eal/common/eal_common_debug.c             |   2 +
 lib/eal/common/eal_common_dev.c               |  19 +
 lib/eal/common/eal_common_devargs.c           |   9 +
 lib/eal/common/eal_common_errno.c             |   2 +
 lib/eal/common/eal_common_fbarray.c           |  26 +
 lib/eal/common/eal_common_hexdump.c           |   2 +
 lib/eal/common/eal_common_hypervisor.c        |   1 +
 lib/eal/common/eal_common_interrupts.c        |  27 +
 lib/eal/common/eal_common_launch.c            |   5 +
 lib/eal/common/eal_common_lcore.c             |  17 +
 lib/eal/common/eal_common_lcore_var.c         |   1 +
 lib/eal/common/eal_common_mcfg.c              |  20 +
 lib/eal/common/eal_common_memory.c            |  29 +
 lib/eal/common/eal_common_memzone.c           |   9 +
 lib/eal/common/eal_common_options.c           |   4 +
 lib/eal/common/eal_common_proc.c              |   8 +
 lib/eal/common/eal_common_string_fns.c        |   3 +
 lib/eal/common/eal_common_tailqs.c            |   3 +
 lib/eal/common/eal_common_thread.c            |  14 +
 lib/eal/common/eal_common_timer.c             |   4 +
 lib/eal/common/eal_common_trace.c             |  15 +
 lib/eal/common/eal_common_trace_ctf.c         |   1 +
 lib/eal/common/eal_common_trace_points.c      |  18 +
 lib/eal/common/eal_common_trace_utils.c       |   1 +
 lib/eal/common/eal_common_uuid.c              |   4 +
 lib/eal/common/rte_bitset.c                   |   1 +
 lib/eal/common/rte_keepalive.c                |   6 +
 lib/eal/common/rte_malloc.c                   |  22 +
 lib/eal/common/rte_random.c                   |   4 +
 lib/eal/common/rte_reciprocal.c               |   2 +
 lib/eal/common/rte_service.c                  |  31 +
 lib/eal/common/rte_version.c                  |   7 +
 lib/eal/freebsd/eal.c                         |  22 +
 lib/eal/freebsd/eal_alarm.c                   |   2 +
 lib/eal/freebsd/eal_dev.c                     |   4 +
 lib/eal/freebsd/eal_interrupts.c              |  19 +
 lib/eal/freebsd/eal_memory.c                  |   3 +
 lib/eal/freebsd/eal_thread.c                  |   2 +
 lib/eal/freebsd/eal_timer.c                   |   1 +
 lib/eal/include/rte_function_versioning.h     |  96 ++-
 lib/eal/linux/eal.c                           |   7 +
 lib/eal/linux/eal_alarm.c                     |   2 +
 lib/eal/linux/eal_dev.c                       |   4 +
 lib/eal/linux/eal_interrupts.c                |  19 +
 lib/eal/linux/eal_memory.c                    |   3 +
 lib/eal/linux/eal_thread.c                    |   2 +
 lib/eal/linux/eal_timer.c                     |   4 +
 lib/eal/linux/eal_vfio.c                      |  16 +
 lib/eal/loongarch/rte_cpuflags.c              |   3 +
 lib/eal/loongarch/rte_hypervisor.c            |   1 +
 lib/eal/loongarch/rte_power_intrinsics.c      |   4 +
 lib/eal/ppc/rte_cpuflags.c                    |   3 +
 lib/eal/ppc/rte_hypervisor.c                  |   1 +
 lib/eal/ppc/rte_power_intrinsics.c            |   4 +
 lib/eal/riscv/rte_cpuflags.c                  |   3 +
 lib/eal/riscv/rte_hypervisor.c                |   1 +
 lib/eal/riscv/rte_power_intrinsics.c          |   4 +
 lib/eal/unix/eal_debug.c                      |   2 +
 lib/eal/unix/eal_filesystem.c                 |   1 +
 lib/eal/unix/eal_firmware.c                   |   1 +
 lib/eal/unix/eal_unix_memory.c                |   4 +
 lib/eal/unix/eal_unix_timer.c                 |   1 +
 lib/eal/unix/rte_thread.c                     |  13 +
 lib/eal/version.map                           | 451 --------------
 lib/eal/windows/eal.c                         |  11 +
 lib/eal/windows/eal_alarm.c                   |   2 +
 lib/eal/windows/eal_debug.c                   |   1 +
 lib/eal/windows/eal_dev.c                     |   4 +
 lib/eal/windows/eal_interrupts.c              |  19 +
 lib/eal/windows/eal_memory.c                  |   7 +
 lib/eal/windows/eal_mp.c                      |   6 +
 lib/eal/windows/eal_thread.c                  |   1 +
 lib/eal/windows/eal_timer.c                   |   1 +
 lib/eal/windows/rte_thread.c                  |  14 +
 lib/eal/x86/rte_cpuflags.c                    |   3 +
 lib/eal/x86/rte_hypervisor.c                  |   1 +
 lib/eal/x86/rte_power_intrinsics.c            |   4 +
 lib/eal/x86/rte_spinlock.c                    |   1 +
 lib/efd/rte_efd.c                             |   7 +
 lib/efd/version.map                           |  13 -
 lib/ethdev/ethdev_driver.c                    |  24 +
 lib/ethdev/ethdev_linux_ethtool.c             |   3 +
 lib/ethdev/ethdev_private.c                   |   2 +
 lib/ethdev/ethdev_trace_points.c              |   6 +
 lib/ethdev/rte_ethdev.c                       | 168 +++++
 lib/ethdev/rte_ethdev_cman.c                  |   4 +
 lib/ethdev/rte_flow.c                         |  64 ++
 lib/ethdev/rte_mtr.c                          |  21 +
 lib/ethdev/rte_tm.c                           |  31 +
 lib/ethdev/version.map                        | 378 ------------
 lib/eventdev/eventdev_private.c               |   2 +
 lib/eventdev/eventdev_trace_points.c          |  11 +
 lib/eventdev/rte_event_crypto_adapter.c       |  15 +
 lib/eventdev/rte_event_dma_adapter.c          |  15 +
 lib/eventdev/rte_event_eth_rx_adapter.c       |  23 +
 lib/eventdev/rte_event_eth_tx_adapter.c       |  17 +
 lib/eventdev/rte_event_ring.c                 |   4 +
 lib/eventdev/rte_event_timer_adapter.c        |  11 +
 lib/eventdev/rte_eventdev.c                   |  46 ++
 lib/eventdev/version.map                      | 179 ------
 lib/fib/rte_fib.c                             |  10 +
 lib/fib/rte_fib6.c                            |   9 +
 lib/fib/version.map                           |  31 -
 lib/gpudev/gpudev.c                           |  32 +
 lib/gpudev/version.map                        |  44 --
 lib/graph/graph.c                             |  16 +
 lib/graph/graph_debug.c                       |   1 +
 lib/graph/graph_stats.c                       |   4 +
 lib/graph/node.c                              |  11 +
 lib/graph/rte_graph_model_mcore_dispatch.c    |   3 +
 lib/graph/rte_graph_worker.c                  |   3 +
 lib/graph/version.map                         |  61 --
 lib/gro/rte_gro.c                             |   6 +
 lib/gro/version.map                           |  12 -
 lib/gso/rte_gso.c                             |   1 +
 lib/gso/version.map                           |   7 -
 lib/hash/rte_cuckoo_hash.c                    |  27 +
 lib/hash/rte_fbk_hash.c                       |   3 +
 lib/hash/rte_hash_crc.c                       |   2 +
 lib/hash/rte_thash.c                          |  12 +
 lib/hash/rte_thash_gf2_poly_math.c            |   1 +
 lib/hash/rte_thash_gfni.c                     |   2 +
 lib/hash/version.map                          |  66 --
 lib/ip_frag/rte_ip_frag_common.c              |   5 +
 lib/ip_frag/rte_ipv4_fragmentation.c          |   2 +
 lib/ip_frag/rte_ipv4_reassembly.c             |   1 +
 lib/ip_frag/rte_ipv6_fragmentation.c          |   1 +
 lib/ip_frag/rte_ipv6_reassembly.c             |   1 +
 lib/ip_frag/version.map                       |  16 -
 lib/ipsec/ipsec_sad.c                         |   6 +
 lib/ipsec/ipsec_telemetry.c                   |   2 +
 lib/ipsec/sa.c                                |   4 +
 lib/ipsec/ses.c                               |   1 +
 lib/ipsec/version.map                         |  23 -
 lib/jobstats/rte_jobstats.c                   |  14 +
 lib/jobstats/version.map                      |  20 -
 lib/kvargs/rte_kvargs.c                       |   8 +
 lib/kvargs/version.map                        |  14 -
 lib/latencystats/rte_latencystats.c           |   5 +
 lib/latencystats/version.map                  |  11 -
 lib/log/log.c                                 |  22 +
 lib/log/log_color.c                           |   1 +
 lib/log/log_internal.h                        |   3 -
 lib/log/log_syslog.c                          |   1 +
 lib/log/log_timestamp.c                       |   1 +
 lib/log/version.map                           |  37 --
 lib/lpm/rte_lpm.c                             |   8 +
 lib/lpm/rte_lpm6.c                            |  10 +
 lib/lpm/version.map                           |  24 -
 lib/mbuf/rte_mbuf.c                           |  17 +
 lib/mbuf/rte_mbuf_dyn.c                       |   9 +
 lib/mbuf/rte_mbuf_pool_ops.c                  |   5 +
 lib/mbuf/rte_mbuf_ptype.c                     |   8 +
 lib/mbuf/version.map                          |  45 --
 lib/member/rte_member.c                       |  13 +
 lib/member/version.map                        |  19 -
 lib/mempool/mempool_trace_points.c            |  10 +
 lib/mempool/rte_mempool.c                     |  27 +
 lib/mempool/rte_mempool_ops.c                 |   4 +
 lib/mempool/rte_mempool_ops_default.c         |   4 +
 lib/mempool/version.map                       |  65 --
 lib/meson.build                               |  62 +-
 lib/meter/rte_meter.c                         |   6 +
 lib/meter/version.map                         |  12 -
 lib/metrics/rte_metrics.c                     |   8 +
 lib/metrics/rte_metrics_telemetry.c           |  11 +
 lib/metrics/version.map                       |  26 -
 lib/mldev/mldev_utils.c                       |   2 +
 lib/mldev/mldev_utils_neon.c                  |  18 +
 lib/mldev/mldev_utils_neon_bfloat16.c         |   2 +
 lib/mldev/mldev_utils_scalar.c                |  18 +
 lib/mldev/mldev_utils_scalar_bfloat16.c       |   2 +
 lib/mldev/rte_mldev.c                         |  37 ++
 lib/mldev/rte_mldev_pmd.c                     |   2 +
 lib/mldev/version.map                         |  74 ---
 lib/net/net_crc.h                             |  15 -
 lib/net/rte_arp.c                             |   1 +
 lib/net/rte_ether.c                           |   3 +
 lib/net/rte_net.c                             |   2 +
 lib/net/rte_net_crc.c                         |  29 +-
 lib/net/version.map                           |  23 -
 lib/node/ethdev_ctrl.c                        |   2 +
 lib/node/ip4_lookup.c                         |   1 +
 lib/node/ip4_reassembly.c                     |   1 +
 lib/node/ip4_rewrite.c                        |   1 +
 lib/node/ip6_lookup.c                         |   1 +
 lib/node/ip6_rewrite.c                        |   1 +
 lib/node/udp4_input.c                         |   2 +
 lib/node/version.map                          |  25 -
 lib/pcapng/rte_pcapng.c                       |   7 +
 lib/pcapng/version.map                        |  13 -
 lib/pci/rte_pci.c                             |   3 +
 lib/pci/version.map                           |   9 -
 lib/pdcp/rte_pdcp.c                           |   5 +
 lib/pdcp/version.map                          |  16 -
 lib/pdump/rte_pdump.c                         |   9 +
 lib/pdump/version.map                         |  15 -
 lib/pipeline/rte_pipeline.c                   |  23 +
 lib/pipeline/rte_port_in_action.c             |   8 +
 lib/pipeline/rte_swx_ctl.c                    |  17 +
 lib/pipeline/rte_swx_ipsec.c                  |   7 +
 lib/pipeline/rte_swx_pipeline.c               |  73 +++
 lib/pipeline/rte_table_action.c               |  16 +
 lib/pipeline/version.map                      | 172 ------
 lib/port/rte_port_ethdev.c                    |   3 +
 lib/port/rte_port_eventdev.c                  |   3 +
 lib/port/rte_port_fd.c                        |   3 +
 lib/port/rte_port_frag.c                      |   2 +
 lib/port/rte_port_ras.c                       |   2 +
 lib/port/rte_port_ring.c                      |   6 +
 lib/port/rte_port_sched.c                     |   2 +
 lib/port/rte_port_source_sink.c               |   2 +
 lib/port/rte_port_sym_crypto.c                |   3 +
 lib/port/rte_swx_port_ethdev.c                |   2 +
 lib/port/rte_swx_port_fd.c                    |   2 +
 lib/port/rte_swx_port_ring.c                  |   2 +
 lib/port/rte_swx_port_source_sink.c           |   3 +
 lib/port/version.map                          |  50 --
 lib/power/power_common.c                      |   8 +
 lib/power/rte_power_cpufreq.c                 |  18 +
 lib/power/rte_power_pmd_mgmt.c                |  10 +
 lib/power/rte_power_qos.c                     |   2 +
 lib/power/rte_power_uncore.c                  |  14 +
 lib/power/version.map                         |  71 ---
 lib/rawdev/rte_rawdev.c                       |  30 +
 lib/rawdev/version.map                        |  36 --
 lib/rcu/rte_rcu_qsbr.c                        |  11 +
 lib/rcu/version.map                           |  17 -
 lib/regexdev/rte_regexdev.c                   |  26 +
 lib/regexdev/version.map                      |  40 --
 lib/reorder/rte_reorder.c                     |  11 +
 lib/reorder/version.map                       |  27 -
 lib/rib/rte_rib.c                             |  14 +
 lib/rib/rte_rib6.c                            |  14 +
 lib/rib/version.map                           |  34 --
 lib/ring/rte_ring.c                           |  11 +
 lib/ring/rte_soring.c                         |   3 +
 lib/ring/soring.c                             |  16 +
 lib/ring/version.map                          |  42 --
 lib/sched/rte_approx.c                        |   1 +
 lib/sched/rte_pie.c                           |   2 +
 lib/sched/rte_red.c                           |   6 +
 lib/sched/rte_sched.c                         |  15 +
 lib/sched/version.map                         |  30 -
 lib/security/rte_security.c                   |  20 +
 lib/security/version.map                      |  37 --
 lib/stack/rte_stack.c                         |   3 +
 lib/stack/version.map                         |   9 -
 lib/table/rte_swx_table_em.c                  |   2 +
 lib/table/rte_swx_table_learner.c             |  10 +
 lib/table/rte_swx_table_selector.c            |   6 +
 lib/table/rte_swx_table_wm.c                  |   1 +
 lib/table/rte_table_acl.c                     |   1 +
 lib/table/rte_table_array.c                   |   1 +
 lib/table/rte_table_hash_cuckoo.c             |   1 +
 lib/table/rte_table_hash_ext.c                |   1 +
 lib/table/rte_table_hash_key16.c              |   2 +
 lib/table/rte_table_hash_key32.c              |   2 +
 lib/table/rte_table_hash_key8.c               |   2 +
 lib/table/rte_table_hash_lru.c                |   1 +
 lib/table/rte_table_lpm.c                     |   1 +
 lib/table/rte_table_lpm_ipv6.c                |   1 +
 lib/table/rte_table_stub.c                    |   1 +
 lib/table/version.map                         |  53 --
 lib/telemetry/telemetry.c                     |   3 +
 lib/telemetry/telemetry_data.c                |  17 +
 lib/telemetry/telemetry_legacy.c              |   1 +
 lib/telemetry/version.map                     |  40 --
 lib/timer/rte_timer.c                         |  18 +
 lib/timer/version.map                         |  24 -
 lib/vhost/socket.c                            |  16 +
 lib/vhost/vdpa.c                              |  11 +
 lib/vhost/version.map                         | 111 ----
 lib/vhost/vhost.c                             |  41 ++
 lib/vhost/vhost_crypto.c                      |   6 +
 lib/vhost/vhost_user.c                        |   2 +
 lib/vhost/virtio_net.c                        |   7 +
 526 files changed, 4661 insertions(+), 6528 deletions(-)
 create mode 100755 buildtools/gen-version-map.py
 delete mode 100644 buildtools/map_to_win.py
 create mode 100644 config/rte_export.h
 create mode 100755 devtools/check-symbol-change.py
 delete mode 100755 devtools/check-symbol-change.sh
 delete mode 100755 devtools/check-symbol-maps.sh
 delete mode 100755 devtools/update-abi.sh
 delete mode 100755 devtools/update_version_map_abi.py
 delete mode 100644 drivers/baseband/acc/version.map
 delete mode 100644 drivers/baseband/fpga_5gnr_fec/version.map
 delete mode 100644 drivers/baseband/fpga_lte_fec/version.map
 delete mode 100644 drivers/bus/auxiliary/version.map
 delete mode 100644 drivers/bus/cdx/version.map
 delete mode 100644 drivers/bus/dpaa/version.map
 delete mode 100644 drivers/bus/fslmc/version.map
 delete mode 100644 drivers/bus/ifpga/version.map
 delete mode 100644 drivers/bus/pci/version.map
 delete mode 100644 drivers/bus/platform/version.map
 delete mode 100644 drivers/bus/uacce/version.map
 delete mode 100644 drivers/bus/vdev/version.map
 delete mode 100644 drivers/bus/vmbus/version.map
 delete mode 100644 drivers/common/cnxk/version.map
 delete mode 100644 drivers/common/cpt/version.map
 delete mode 100644 drivers/common/dpaax/version.map
 delete mode 100644 drivers/common/ionic/version.map
 delete mode 100644 drivers/common/mlx5/version.map
 delete mode 100644 drivers/common/mvep/version.map
 delete mode 100644 drivers/common/nfp/version.map
 delete mode 100644 drivers/common/nitrox/version.map
 delete mode 100644 drivers/common/octeontx/version.map
 delete mode 100644 drivers/common/sfc_efx/version.map
 delete mode 100644 drivers/crypto/cnxk/version.map
 delete mode 100644 drivers/crypto/dpaa2_sec/version.map
 delete mode 100644 drivers/crypto/dpaa_sec/version.map
 delete mode 100644 drivers/crypto/octeontx/version.map
 delete mode 100644 drivers/crypto/scheduler/version.map
 delete mode 100644 drivers/dma/cnxk/version.map
 delete mode 100644 drivers/event/cnxk/version.map
 delete mode 100644 drivers/event/dlb2/version.map
 delete mode 100644 drivers/mempool/cnxk/version.map
 delete mode 100644 drivers/mempool/dpaa/version.map
 delete mode 100644 drivers/mempool/dpaa2/version.map
 delete mode 100644 drivers/net/atlantic/version.map
 delete mode 100644 drivers/net/bnxt/version.map
 delete mode 100644 drivers/net/bonding/version.map
 delete mode 100644 drivers/net/cnxk/version.map
 delete mode 100644 drivers/net/dpaa/version.map
 delete mode 100644 drivers/net/dpaa2/version.map
 delete mode 100644 drivers/net/intel/i40e/version.map
 delete mode 100644 drivers/net/intel/iavf/version.map
 delete mode 100644 drivers/net/intel/ice/version.map
 delete mode 100644 drivers/net/intel/idpf/version.map
 delete mode 100644 drivers/net/intel/ipn3ke/version.map
 delete mode 100644 drivers/net/intel/ixgbe/version.map
 delete mode 100644 drivers/net/mlx5/version.map
 delete mode 100644 drivers/net/octeontx/version.map
 delete mode 100644 drivers/net/ring/version.map
 delete mode 100644 drivers/net/softnic/version.map
 delete mode 100644 drivers/net/vhost/version.map
 delete mode 100644 drivers/power/kvm_vm/version.map
 delete mode 100644 drivers/raw/cnxk_rvu_lf/version.map
 delete mode 100644 drivers/raw/ifpga/version.map
 delete mode 100644 drivers/version.map
 delete mode 100644 lib/acl/version.map
 delete mode 100644 lib/argparse/version.map
 delete mode 100644 lib/bbdev/version.map
 delete mode 100644 lib/bitratestats/version.map
 delete mode 100644 lib/bpf/version.map
 delete mode 100644 lib/cfgfile/version.map
 delete mode 100644 lib/cmdline/version.map
 delete mode 100644 lib/compressdev/version.map
 delete mode 100644 lib/cryptodev/version.map
 delete mode 100644 lib/dispatcher/version.map
 delete mode 100644 lib/distributor/version.map
 delete mode 100644 lib/dmadev/version.map
 delete mode 100644 lib/eal/version.map
 delete mode 100644 lib/efd/version.map
 delete mode 100644 lib/ethdev/version.map
 delete mode 100644 lib/eventdev/version.map
 delete mode 100644 lib/fib/version.map
 delete mode 100644 lib/gpudev/version.map
 delete mode 100644 lib/graph/version.map
 delete mode 100644 lib/gro/version.map
 delete mode 100644 lib/gso/version.map
 delete mode 100644 lib/hash/version.map
 delete mode 100644 lib/ip_frag/version.map
 delete mode 100644 lib/ipsec/version.map
 delete mode 100644 lib/jobstats/version.map
 delete mode 100644 lib/kvargs/version.map
 delete mode 100644 lib/latencystats/version.map
 delete mode 100644 lib/log/version.map
 delete mode 100644 lib/lpm/version.map
 delete mode 100644 lib/mbuf/version.map
 delete mode 100644 lib/member/version.map
 delete mode 100644 lib/mempool/version.map
 delete mode 100644 lib/meter/version.map
 delete mode 100644 lib/metrics/version.map
 delete mode 100644 lib/mldev/version.map
 delete mode 100644 lib/net/version.map
 delete mode 100644 lib/node/version.map
 delete mode 100644 lib/pcapng/version.map
 delete mode 100644 lib/pci/version.map
 delete mode 100644 lib/pdcp/version.map
 delete mode 100644 lib/pdump/version.map
 delete mode 100644 lib/pipeline/version.map
 delete mode 100644 lib/port/version.map
 delete mode 100644 lib/power/version.map
 delete mode 100644 lib/rawdev/version.map
 delete mode 100644 lib/rcu/version.map
 delete mode 100644 lib/regexdev/version.map
 delete mode 100644 lib/reorder/version.map
 delete mode 100644 lib/rib/version.map
 delete mode 100644 lib/ring/version.map
 delete mode 100644 lib/sched/version.map
 delete mode 100644 lib/security/version.map
 delete mode 100644 lib/stack/version.map
 delete mode 100644 lib/table/version.map
 delete mode 100644 lib/telemetry/version.map
 delete mode 100644 lib/timer/version.map
 delete mode 100644 lib/vhost/version.map

-- 
2.48.1


^ permalink raw reply	[relevance 3%]

* Re: [RFC v3 5/8] build: generate symbol maps
  2025-03-14 15:27  0%     ` Andre Muezerie
@ 2025-03-14 15:51  4%       ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-14 15:51 UTC (permalink / raw)
  To: Andre Muezerie; +Cc: dev, thomas, bruce.richardson

On Fri, Mar 14, 2025 at 4:28 PM Andre Muezerie
<andremue@linux.microsoft.com> wrote:
>
> On Tue, Mar 11, 2025 at 10:56:03AM +0100, David Marchand wrote:
> > Rather than maintain a file in parallel of the code, symbols to be
> > exported can be marked with a token RTE_EXPORT_*SYMBOL.
> >
> > From those marks, the build framework generates map files only for
> > symbols actually compiled (which means that the WINDOWS_NO_EXPORT hack
> > becomes unnecessary).
> >
> > The build framework directly creates a map file in the format that the
> > linker expects (rather than converting from GNU linker to MSVC linker).
> >
> > Empty maps are allowed again as a replacement for drivers/version.map.
> >
> > The symbol check is updated to only support the new format.
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > ---
> > Changes since RFC v2:
> > - because of MSVC limitations wrt macro passed via cmdline,
> >   used an internal header for defining RTE_EXPORT_* macros,
> > - updated documentation and tooling,
> >
> > ---
> >  MAINTAINERS                                |   2 +
> >  buildtools/gen-version-map.py              | 111 ++++++++++
> >  buildtools/map-list-symbol.sh              |  10 +-
> >  buildtools/meson.build                     |   1 +
> >  config/meson.build                         |   2 +
> >  config/rte_export.h                        |  16 ++
> >  devtools/check-symbol-change.py            |  90 +++++++++
> >  devtools/check-symbol-maps.sh              |  14 --
> >  devtools/checkpatches.sh                   |   2 +-
> >  doc/guides/contributing/abi_versioning.rst | 224 ++-------------------
> >  drivers/meson.build                        |  94 +++++----
> >  drivers/version.map                        |   3 -
> >  lib/meson.build                            |  91 ++++++---
> >  13 files changed, 371 insertions(+), 289 deletions(-)
> >  create mode 100755 buildtools/gen-version-map.py
> >  create mode 100644 config/rte_export.h
> >  create mode 100755 devtools/check-symbol-change.py
> >  delete mode 100644 drivers/version.map
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 312e6fcee5..04772951d3 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
> >  F: devtools/check-forbidden-tokens.awk
> >  F: devtools/check-git-log.sh
> >  F: devtools/check-spdx-tag.sh
> > +F: devtools/check-symbol-change.py
> >  F: devtools/check-symbol-change.sh
> >  F: devtools/check-symbol-maps.sh
> >  F: devtools/checkpatches.sh
> > @@ -127,6 +128,7 @@ F: config/
> >  F: buildtools/check-symbols.sh
> >  F: buildtools/chkincs/
> >  F: buildtools/call-sphinx-build.py
> > +F: buildtools/gen-version-map.py
> >  F: buildtools/get-cpu-count.py
> >  F: buildtools/get-numa-count.py
> >  F: buildtools/list-dir-globs.py
> > diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
> > new file mode 100755
> > index 0000000000..b160aa828b
> > --- /dev/null
> > +++ b/buildtools/gen-version-map.py
> > @@ -0,0 +1,111 @@
> > +#!/usr/bin/env python3
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright (c) 2024 Red Hat, Inc.
>
> 2025?

Well, technically, I had written the first version of this script in 2024 :-).
But I'll align to the rest of the patch.

> I appreciate that Python was chosen instead of sh/bash.
>
> > +
> > +"""Generate a version map file used by GNU or MSVC linker."""
> > +
> > +import re
> > +import sys
> > +
> > +# From rte_export.h
> > +export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+.[0-9]+)\)")
> > +export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> > +export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
> > +# From rte_function_versioning.h
> > +ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> > +default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +
> > +with open(sys.argv[2]) as f:
> > +    abi = 'DPDK_{}'.format(re.match("([0-9]+).[0-9]", f.readline()).group(1))
> > +
> > +symbols = {}
> > +
> > +for file in sys.argv[4:]:
> > +    with open(file, encoding="utf-8") as f:
> > +        for ln in f.readlines():
> > +            node = None
> > +            symbol = None
> > +            comment = None
> > +            if export_exp_sym_regexp.match(ln):
> > +                node = 'EXPERIMENTAL'
> > +                symbol = export_exp_sym_regexp.match(ln).group(1)
> > +                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
> > +            elif export_int_sym_regexp.match(ln):
> > +                node = 'INTERNAL'
> > +                symbol = export_int_sym_regexp.match(ln).group(1)
> > +            elif export_sym_regexp.match(ln):
> > +                node = abi
> > +                symbol = export_sym_regexp.match(ln).group(1)
> > +            elif ver_sym_regexp.match(ln):
> > +                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
> > +                symbol = ver_sym_regexp.match(ln).group(2)
> > +            elif ver_exp_sym_regexp.match(ln):
> > +                node = 'EXPERIMENTAL'
> > +                symbol = ver_exp_sym_regexp.match(ln).group(1)
> > +            elif default_sym_regexp.match(ln):
> > +                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
> > +                symbol = default_sym_regexp.match(ln).group(2)
> > +
> > +            if not symbol:
> > +                continue
> > +
> > +            if node not in symbols:
> > +                symbols[node] = {}
> > +            symbols[node][symbol] = comment
> > +
> > +if sys.argv[1] == 'msvc':
> > +    with open(sys.argv[3], "w") as outfile:
> > +        outfile.writelines(f"EXPORTS\n")
> > +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> > +            if key not in symbols:
> > +                continue
> > +            for symbol in sorted(symbols[key].keys()):
> > +                outfile.writelines(f"\t{symbol}\n")
> > +            del symbols[key]
> > +else:
> > +    with open(sys.argv[3], "w") as outfile:
>
> Consider having output file samples documented, perhaps in this script itself, to make
> it easier to understand what this script it doing and highlight the differences between
> the formats supported (msvc, etc).

I am not sure I follow.

The differences between the formats are not something "normal" DPDK
contributors/developers should care about.
DPDK documentation was giving (too much) detail on the version.map
GNU linker stuff, and I would prefer we stop documenting this.
Instead, the focus should be on the new sets of export macros, which
serve as an abstraction.
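To illustrate the idea (a minimal sketch only, with made-up symbol names, not the patch's actual gen-version-map.py logic): the build scans sources for the export marks and collects the symbols per ABI node, roughly like this:

```python
import re

# Hypothetical C source fragment carrying the export marks.
c_source = """\
RTE_EXPORT_SYMBOL(rte_eal_init)
int rte_eal_init(int argc, char **argv);
RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_new_api, 25.03)
"""

# Simplified versions of the regexps used in the patch.
stable_re = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
exp_re = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+\.[0-9]+)\)")

symbols = {"stable": [], "experimental": []}
for line in c_source.splitlines():
    m = stable_re.match(line)
    if m:
        symbols["stable"].append(m.group(1))
        continue
    m = exp_re.match(line)
    if m:
        # Experimental symbols also record the version they appeared in.
        symbols["experimental"].append((m.group(1), m.group(2)))

print(symbols)
```

From such a dictionary, emitting either a GNU linker version script or an MSVC EXPORTS file is a pure formatting step.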


>
> > +        local_token = False
> > +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> > +            if key not in symbols:
> > +                continue
> > +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> > +            for symbol in sorted(symbols[key].keys()):
> > +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> > +                    prefix = '__emutls_v.'
> > +                else:
> > +                    prefix = ''
> > +                outfile.writelines(f"\t{prefix}{symbol};")
> > +                comment = symbols[key][symbol]
> > +                if comment:
> > +                    outfile.writelines(f"{comment}")
> > +                outfile.writelines("\n")
> > +            outfile.writelines("\n")
> > +            if not local_token:
> > +                outfile.writelines("\tlocal: *;\n")
> > +                local_token = True
> > +            outfile.writelines("};\n")
> > +            del symbols[key]
> > +        for key in sorted(symbols.keys()):
> > +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> > +            for symbol in sorted(symbols[key].keys()):
> > +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> > +                    prefix = '__emutls_v.'
> > +                else:
> > +                    prefix = ''
> > +                outfile.writelines(f"\t{prefix}{symbol};")
> > +                comment = symbols[key][symbol]
> > +                if comment:
> > +                    outfile.writelines(f"{comment}")
> > +                outfile.writelines("\n")
> > +            outfile.writelines(f"}} {abi};\n")
> > +            if not local_token:
> > +                outfile.writelines("\tlocal: *;\n")
> > +                local_token = True
> > +            del symbols[key]
> > +        # No exported symbol, add a catch all
> > +        if not local_token:
> > +            outfile.writelines(f"{abi} {{\n")
> > +            outfile.writelines("\tlocal: *;\n")
> > +            local_token = True
> > +            outfile.writelines("};\n")
> > diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
> > index eb98451d8e..0829df4be5 100755
> > --- a/buildtools/map-list-symbol.sh
> > +++ b/buildtools/map-list-symbol.sh
> > @@ -62,10 +62,14 @@ for file in $@; do
> >               if (current_section == "") {
> >                       next;
> >               }
> > +             symbol_version = current_version
> > +             if (/^[^}].*[^:*]; # added in /) {
> > +                     symbol_version = $5
> > +             }
> >               if ("'$version'" != "") {
> > -                     if ("'$version'" == "unset" && current_version != "") {
> > +                     if ("'$version'" == "unset" && symbol_version != "") {
> >                               next;
> > -                     } else if ("'$version'" != "unset" && "'$version'" != current_version) {
> > +                     } else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
> >                               next;
> >                       }
> >               }
> > @@ -73,7 +77,7 @@ for file in $@; do
> >               if ("'$symbol'" == "all" || $1 == "'$symbol'") {
> >                       ret = 0;
> >                       if ("'$quiet'" == "") {
> > -                             print "'$file' "current_section" "$1" "current_version;
> > +                             print "'$file' "current_section" "$1" "symbol_version;
> >                       }
> >                       if ("'$symbol'" != "all") {
> >                               exit 0;
> > diff --git a/buildtools/meson.build b/buildtools/meson.build
> > index 4e2c1217a2..b745e9afa4 100644
> > --- a/buildtools/meson.build
> > +++ b/buildtools/meson.build
> > @@ -16,6 +16,7 @@ else
> >      py3 = ['meson', 'runpython']
> >  endif
> >  echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
> > +gen_version_map = py3 + files('gen-version-map.py')
> >  list_dir_globs = py3 + files('list-dir-globs.py')
> >  map_to_win_cmd = py3 + files('map_to_win.py')
> >  sphinx_wrapper = py3 + files('call-sphinx-build.py')
> > diff --git a/config/meson.build b/config/meson.build
> > index f31fef216c..54657055fb 100644
> > --- a/config/meson.build
> > +++ b/config/meson.build
> > @@ -303,8 +303,10 @@ endif
> >  # add -include rte_config to cflags
> >  if is_ms_compiler
> >      add_project_arguments('/FI', 'rte_config.h', language: 'c')
> > +    add_project_arguments('/FI', 'rte_export.h', language: 'c')
> >  else
> >      add_project_arguments('-include', 'rte_config.h', language: 'c')
> > +    add_project_arguments('-include', 'rte_export.h', language: 'c')
> >  endif
> >
> >  # enable extra warnings and disable any unwanted warnings
> > diff --git a/config/rte_export.h b/config/rte_export.h
> > new file mode 100644
> > index 0000000000..83d871fe11
> > --- /dev/null
> > +++ b/config/rte_export.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2025 Red Hat, Inc.
> > + */
> > +
> > +#ifndef RTE_EXPORT_H
> > +#define RTE_EXPORT_H
> > +
> > +/* *Internal* macros for exporting symbols, used by the build system.
> > + * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
> > + * version this symbol was introduced in.
> > + */
> > +#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
> > +#define RTE_EXPORT_INTERNAL_SYMBOL(a)
> > +#define RTE_EXPORT_SYMBOL(a)
> > +
> > +#endif /* RTE_EXPORT_H */
> > diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
> > new file mode 100755
> > index 0000000000..09709e4f06
> > --- /dev/null
> > +++ b/devtools/check-symbol-change.py
> > @@ -0,0 +1,90 @@
> > +#!/usr/bin/env python3
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright (c) 2025 Red Hat, Inc.
> > +
> > +"""Check exported symbols change in a patch."""
> > +
> > +import re
> > +import sys
> > +
> > +file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
> > +# From rte_export.h
> > +export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
> > +export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> > +export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
> > +# TODO, handle versioned symbols from rte_function_versioning.h
> > +# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> > +# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +
> > +symbols = {}
> > +
> > +for file in sys.argv[1:]:
> > +    with open(file, encoding="utf-8") as f:
> > +        for ln in f.readlines():
> > +            if file_header_regexp.match(ln):
> > +                if file_header_regexp.match(ln).group(2) == "lib":
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> > +                elif file_header_regexp.match(ln).group(3) == "intel":
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
> > +                else:
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> > +
> > +                if lib not in symbols:
> > +                    symbols[lib] = {}
> > +                continue
> > +
> > +            if export_exp_sym_regexp.match(ln):
> > +                symbol = export_exp_sym_regexp.match(ln).group(1)
> > +                node = 'EXPERIMENTAL'
> > +            elif export_int_sym_regexp.match(ln):
> > +                node = 'INTERNAL'
> > +                symbol = export_int_sym_regexp.match(ln).group(1)
> > +            elif export_sym_regexp.match(ln):
> > +                symbol = export_sym_regexp.match(ln).group(1)
> > +                node = 'stable'
> > +            else:
> > +                continue
> > +
> > +            if symbol not in symbols[lib]:
> > +                symbols[lib][symbol] = {}
> > +            added = ln[0] == '+'
> > +            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
> > +                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
> > +            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
> > +                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
> > +            if added:
> > +                symbols[lib][symbol]['added'] = node
> > +            else:
> > +                symbols[lib][symbol]['removed'] = node
> > +
> > +    for lib in sorted(symbols.keys()):
> > +        error = False
> > +        for symbol in sorted(symbols[lib].keys()):
> > +            if 'removed' not in symbols[lib][symbol]:
> > +                # Symbol addition
> > +                node = symbols[lib][symbol]['added']
> > +                if node == 'stable':
> > +                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
> > +                    error = True
> > +                else:
> > +                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
> > +                continue
> > +
> > +            if 'added' not in symbols[lib][symbol]:
> > +                # Symbol removal
> > +                node = symbols[lib][symbol]['removed']
> > +                if node == 'stable':
> > +                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")
>
> Some people would argue that WARN instead of INFO is more appropriate because some attention
> is needed from the user. INFO many times is just ignored.

True, though the ABI check is supposed to fail with a big ERROR :-).

I would have to remember why we put INFO initially (I am just
reimplementing the .sh check that existed on static maps).
I think Thomas was the one who wanted it as INFO...


>
> > +                    print(f"Please check it has gone through the deprecation process.")
> > +                continue
> > +
> > +            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
> > +                # Symbol was moved around
> > +                continue
> > +
> > +            # Symbol modifications
> > +            added = symbols[lib][symbol]['added']
> > +            removed = symbols[lib][symbol]['removed']
> > +            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")
>
> Perhaps use WARN instead of INFO.

On this part, I disagree.
Moving from a non-stable ABI (like experimental) to the stable ABI or to
another non-stable ABI (like internal) is not an issue.
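The policy being discussed could be summarised as a small transition check; a hypothetical sketch (the helper name and categories are made up, this is not the check-symbol-change.py code):

```python
def classify(removed, added):
    """Classify a symbol moving between ABI nodes (hypothetical helper)."""
    if removed == added:
        return "no-op"               # symbol merely moved around
    if added == "stable":
        return "promotion"           # e.g. EXPERIMENTAL -> stable, fine
    if removed == "stable":
        return "needs-deprecation"   # leaving the stable ABI needs process
    return "move"                    # e.g. EXPERIMENTAL -> INTERNAL, fine

print(classify("EXPERIMENTAL", "stable"))
print(classify("stable", "INTERNAL"))
```

Only the "needs-deprecation" case warrants a warning; the others are informational at most.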


>
> > +            print(f"Please check it has gone through the deprecation process.")

[snip]


-- 
David Marchand



* Re: [RFC v3 5/8] build: generate symbol maps
  2025-03-13 17:26  0%     ` Bruce Richardson
@ 2025-03-14 15:38  0%       ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-14 15:38 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, thomas, andremue

On Thu, Mar 13, 2025 at 6:27 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Tue, Mar 11, 2025 at 10:56:03AM +0100, David Marchand wrote:
> > Rather than maintain a file in parallel of the code, symbols to be
> > exported can be marked with a token RTE_EXPORT_*SYMBOL.
> >
> > From those marks, the build framework generates map files only for
> > symbols actually compiled (which means that the WINDOWS_NO_EXPORT hack
> > becomes unnecessary).
> >
> > The build framework directly creates a map file in the format that the
> > linker expects (rather than converting from GNU linker to MSVC linker).
> >
> > Empty maps are allowed again as a replacement for drivers/version.map.
> >
> > The symbol check is updated to only support the new format.
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
>
> Some comments inline below.
> /Bruce
>
> > ---
> > Changes since RFC v2:
> > - because of MSVC limitations wrt macro passed via cmdline,
> >   used an internal header for defining RTE_EXPORT_* macros,
> > - updated documentation and tooling,
> >
> > ---
> >  MAINTAINERS                                |   2 +
> >  buildtools/gen-version-map.py              | 111 ++++++++++
> >  buildtools/map-list-symbol.sh              |  10 +-
> >  buildtools/meson.build                     |   1 +
> >  config/meson.build                         |   2 +
> >  config/rte_export.h                        |  16 ++
> >  devtools/check-symbol-change.py            |  90 +++++++++
> >  devtools/check-symbol-maps.sh              |  14 --
> >  devtools/checkpatches.sh                   |   2 +-
> >  doc/guides/contributing/abi_versioning.rst | 224 ++-------------------
> >  drivers/meson.build                        |  94 +++++----
> >  drivers/version.map                        |   3 -
> >  lib/meson.build                            |  91 ++++++---
> >  13 files changed, 371 insertions(+), 289 deletions(-)
> >  create mode 100755 buildtools/gen-version-map.py
> >  create mode 100644 config/rte_export.h
> >  create mode 100755 devtools/check-symbol-change.py
> >  delete mode 100644 drivers/version.map
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 312e6fcee5..04772951d3 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
> >  F: devtools/check-forbidden-tokens.awk
> >  F: devtools/check-git-log.sh
> >  F: devtools/check-spdx-tag.sh
> > +F: devtools/check-symbol-change.py
> >  F: devtools/check-symbol-change.sh
> >  F: devtools/check-symbol-maps.sh
> >  F: devtools/checkpatches.sh
> > @@ -127,6 +128,7 @@ F: config/
> >  F: buildtools/check-symbols.sh
> >  F: buildtools/chkincs/
> >  F: buildtools/call-sphinx-build.py
> > +F: buildtools/gen-version-map.py
> >  F: buildtools/get-cpu-count.py
> >  F: buildtools/get-numa-count.py
> >  F: buildtools/list-dir-globs.py
> > diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
> > new file mode 100755
> > index 0000000000..b160aa828b
> > --- /dev/null
> > +++ b/buildtools/gen-version-map.py
> > @@ -0,0 +1,111 @@
> > +#!/usr/bin/env python3
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright (c) 2024 Red Hat, Inc.
> > +
> > +"""Generate a version map file used by GNU or MSVC linker."""
> > +
>
> While it's an internal build script not to be run by users directly, I
> believe a short one-line usage here might be useful, since the code below
> is directly referencing sys.argv[N] values. That makes it easier for the
> user to know what they are.
>
> Alternatively, assign them to proper names at the top of the script e.g.:
>         scriptname, link_mode, abi_version_file, output, *input = sys.argv

I like this simple form.
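For readers less familiar with the idiom: starred unpacking splits an argv-style list into named positionals plus a catch-all list in one statement. A minimal sketch with made-up argument values:

```python
# Simulated sys.argv for the script: name, link mode, ABI file, output, inputs.
argv = ["gen-version-map.py", "mslinker", "ABI_VERSION", "out.def",
        "a.c", "b.c"]

# The first four entries get names; the rest land in input_files as a list.
scriptname, link_mode, abi_version_file, output, *input_files = argv

print(link_mode)
print(input_files)
```

This replaces scattered sys.argv[N] indexing with self-documenting names, without pulling in argparse.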

>
> Final alternative (which may be a bit overkill) is to use argparse.
>
> > +import re
> > +import sys
> > +
> > +# From rte_export.h
> > +export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+.[0-9]+)\)")
> > +export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> > +export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
> > +# From rte_function_versioning.h
> > +ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> > +default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +
> > +with open(sys.argv[2]) as f:
> > +    abi = 'DPDK_{}'.format(re.match("([0-9]+).[0-9]", f.readline()).group(1))
> > +
> > +symbols = {}
> > +
> > +for file in sys.argv[4:]:
> > +    with open(file, encoding="utf-8") as f:
> > +        for ln in f.readlines():
> > +            node = None
> > +            symbol = None
> > +            comment = None
> > +            if export_exp_sym_regexp.match(ln):
> > +                node = 'EXPERIMENTAL'
> > +                symbol = export_exp_sym_regexp.match(ln).group(1)
> > +                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
> > +            elif export_int_sym_regexp.match(ln):
> > +                node = 'INTERNAL'
> > +                symbol = export_int_sym_regexp.match(ln).group(1)
> > +            elif export_sym_regexp.match(ln):
> > +                node = abi
> > +                symbol = export_sym_regexp.match(ln).group(1)
> > +            elif ver_sym_regexp.match(ln):
> > +                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
> > +                symbol = ver_sym_regexp.match(ln).group(2)
> > +            elif ver_exp_sym_regexp.match(ln):
> > +                node = 'EXPERIMENTAL'
> > +                symbol = ver_exp_sym_regexp.match(ln).group(1)
> > +            elif default_sym_regexp.match(ln):
> > +                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
> > +                symbol = default_sym_regexp.match(ln).group(2)
> > +
> > +            if not symbol:
> > +                continue
> > +
> > +            if node not in symbols:
> > +                symbols[node] = {}
> > +            symbols[node][symbol] = comment
> > +
> > +if sys.argv[1] == 'msvc':
> > +    with open(sys.argv[3], "w") as outfile:
> > +        outfile.writelines(f"EXPORTS\n")
> > +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> > +            if key not in symbols:
> > +                continue
> > +            for symbol in sorted(symbols[key].keys()):
> > +                outfile.writelines(f"\t{symbol}\n")
> > +            del symbols[key]
> > +else:
> > +    with open(sys.argv[3], "w") as outfile:
> > +        local_token = False
> > +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> > +            if key not in symbols:
> > +                continue
> > +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> > +            for symbol in sorted(symbols[key].keys()):
> > +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> > +                    prefix = '__emutls_v.'
> > +                else:
> > +                    prefix = ''
> > +                outfile.writelines(f"\t{prefix}{symbol};")
> > +                comment = symbols[key][symbol]
> > +                if comment:
> > +                    outfile.writelines(f"{comment}")
> > +                outfile.writelines("\n")
>
> How about using "" rather than None for the default comment so you can
> always just do a print of "{prefix}{symbol};{comment}\n". The fact that
> writelines doesn't output a "\n" is a little confusing here, so maybe use
> "print" instead.
>
>         print(f"\t{prefix}{symbol};{comment}", file=outfile)

Yes, better.
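The gain can be sketched as follows: with an empty string as the default comment, the per-symbol output collapses to a single print() call with no conditional write (illustrative symbol names only):

```python
import io

# comment is None when a symbol has no version annotation.
symbols = {"rte_foo": " # added in 25.03", "rte_bar": None}

outfile = io.StringIO()
for symbol in sorted(symbols):
    prefix = ""
    # Coercing None to "" removes the need for an `if comment:` branch.
    comment = symbols[symbol] or ""
    print(f"\t{prefix}{symbol};{comment}", file=outfile)

print(outfile.getvalue())
```

print() also appends the trailing newline, which writelines() does not.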


>
> > +            outfile.writelines("\n")
> > +            if not local_token:
> > +                outfile.writelines("\tlocal: *;\n")
> > +                local_token = True
> > +            outfile.writelines("};\n")
> > +            del symbols[key]
> > +        for key in sorted(symbols.keys()):
> > +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> > +            for symbol in sorted(symbols[key].keys()):
> > +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> > +                    prefix = '__emutls_v.'
> > +                else:
> > +                    prefix = ''
> > +                outfile.writelines(f"\t{prefix}{symbol};")
> > +                comment = symbols[key][symbol]
> > +                if comment:
> > +                    outfile.writelines(f"{comment}")
> > +                outfile.writelines("\n")
> > +            outfile.writelines(f"}} {abi};\n")
> > +            if not local_token:
> > +                outfile.writelines("\tlocal: *;\n")
> > +                local_token = True
> > +            del symbols[key]
> > +        # No exported symbol, add a catch all
> > +        if not local_token:
> > +            outfile.writelines(f"{abi} {{\n")
> > +            outfile.writelines("\tlocal: *;\n")
> > +            local_token = True
> > +            outfile.writelines("};\n")
> > diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
> > index eb98451d8e..0829df4be5 100755
> > --- a/buildtools/map-list-symbol.sh
> > +++ b/buildtools/map-list-symbol.sh
> > @@ -62,10 +62,14 @@ for file in $@; do
> >               if (current_section == "") {
> >                       next;
> >               }
> > +             symbol_version = current_version
> > +             if (/^[^}].*[^:*]; # added in /) {
> > +                     symbol_version = $5
> > +             }
> >               if ("'$version'" != "") {
> > -                     if ("'$version'" == "unset" && current_version != "") {
> > +                     if ("'$version'" == "unset" && symbol_version != "") {
> >                               next;
> > -                     } else if ("'$version'" != "unset" && "'$version'" != current_version) {
> > +                     } else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
> >                               next;
> >                       }
> >               }
> > @@ -73,7 +77,7 @@ for file in $@; do
> >               if ("'$symbol'" == "all" || $1 == "'$symbol'") {
> >                       ret = 0;
> >                       if ("'$quiet'" == "") {
> > -                             print "'$file' "current_section" "$1" "current_version;
> > +                             print "'$file' "current_section" "$1" "symbol_version;
> >                       }
> >                       if ("'$symbol'" != "all") {
> >                               exit 0;
> > diff --git a/buildtools/meson.build b/buildtools/meson.build
> > index 4e2c1217a2..b745e9afa4 100644
> > --- a/buildtools/meson.build
> > +++ b/buildtools/meson.build
> > @@ -16,6 +16,7 @@ else
> >      py3 = ['meson', 'runpython']
> >  endif
> >  echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
> > +gen_version_map = py3 + files('gen-version-map.py')
> >  list_dir_globs = py3 + files('list-dir-globs.py')
> >  map_to_win_cmd = py3 + files('map_to_win.py')
> >  sphinx_wrapper = py3 + files('call-sphinx-build.py')
> > diff --git a/config/meson.build b/config/meson.build
> > index f31fef216c..54657055fb 100644
> > --- a/config/meson.build
> > +++ b/config/meson.build
> > @@ -303,8 +303,10 @@ endif
> >  # add -include rte_config to cflags
> >  if is_ms_compiler
> >      add_project_arguments('/FI', 'rte_config.h', language: 'c')
> > +    add_project_arguments('/FI', 'rte_export.h', language: 'c')
> >  else
> >      add_project_arguments('-include', 'rte_config.h', language: 'c')
> > +    add_project_arguments('-include', 'rte_export.h', language: 'c')
> >  endif
> >
> >  # enable extra warnings and disable any unwanted warnings
> > diff --git a/config/rte_export.h b/config/rte_export.h
> > new file mode 100644
> > index 0000000000..83d871fe11
> > --- /dev/null
> > +++ b/config/rte_export.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2025 Red Hat, Inc.
> > + */
> > +
> > +#ifndef RTE_EXPORT_H
> > +#define RTE_EXPORT_H
> > +
> > +/* *Internal* macros for exporting symbols, used by the build system.
> > + * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
> > + * version this symbol was introduced in.
> > + */
> > +#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
> > +#define RTE_EXPORT_INTERNAL_SYMBOL(a)
> > +#define RTE_EXPORT_SYMBOL(a)
> > +
> > +#endif /* RTE_EXPORT_H */
> > diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
> > new file mode 100755
> > index 0000000000..09709e4f06
> > --- /dev/null
> > +++ b/devtools/check-symbol-change.py
> > @@ -0,0 +1,90 @@
> > +#!/usr/bin/env python3
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright (c) 2025 Red Hat, Inc.
> > +
> > +"""Check exported symbols change in a patch."""
> > +
> > +import re
> > +import sys
> > +
> > +file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
> > +# From rte_export.h
> > +export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
> > +export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> > +export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
> > +# TODO, handle versioned symbols from rte_function_versioning.h
> > +# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> > +# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> > +
> > +symbols = {}
> > +
> > +for file in sys.argv[1:]:
> > +    with open(file, encoding="utf-8") as f:
> > +        for ln in f.readlines():
> > +            if file_header_regexp.match(ln):
> > +                if file_header_regexp.match(ln).group(2) == "lib":
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> > +                elif file_header_regexp.match(ln).group(3) == "intel":
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
> > +                else:
> > +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> > +
> > +                if lib not in symbols:
> > +                    symbols[lib] = {}
> > +                continue
> > +
> > +            if export_exp_sym_regexp.match(ln):
> > +                symbol = export_exp_sym_regexp.match(ln).group(1)
> > +                node = 'EXPERIMENTAL'
> > +            elif export_int_sym_regexp.match(ln):
> > +                node = 'INTERNAL'
> > +                symbol = export_int_sym_regexp.match(ln).group(1)
> > +            elif export_sym_regexp.match(ln):
> > +                symbol = export_sym_regexp.match(ln).group(1)
> > +                node = 'stable'
> > +            else:
> > +                continue
> > +
> > +            if symbol not in symbols[lib]:
> > +                symbols[lib][symbol] = {}
> > +            added = ln[0] == '+'
> > +            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
> > +                print(f"{symbol} in {lib} was found in multiple ABIs, please check.")
> > +            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
> > +                print(f"{symbol} in {lib} was found in multiple ABIs, please check.")
> > +            if added:
> > +                symbols[lib][symbol]['added'] = node
> > +            else:
> > +                symbols[lib][symbol]['removed'] = node
> > +
> > +    for lib in sorted(symbols.keys()):
> > +        error = False
> > +        for symbol in sorted(symbols[lib].keys()):
> > +            if 'removed' not in symbols[lib][symbol]:
> > +                # Symbol addition
> > +                node = symbols[lib][symbol]['added']
> > +                if node == 'stable':
> > +                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
> > +                    error = True
> > +                else:
> > +                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
> > +                continue
> > +
> > +            if 'added' not in symbols[lib][symbol]:
> > +                # Symbol removal
> > +                node = symbols[lib][symbol]['removed']
> > +                if node == 'stable':
> > +                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")
> > +                    print(f"Please check it has gone through the deprecation process.")
> > +                continue
> > +
> > +            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
> > +                # Symbol was moved around
> > +                continue
> > +
> > +            # Symbol modifications
> > +            added = symbols[lib][symbol]['added']
> > +            removed = symbols[lib][symbol]['removed']
> > +            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")
> > +            print(f"Please check it has gone through the deprecation process.")
> > diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
> > index 6121f78ec6..fcd3931e5d 100755
> > --- a/devtools/check-symbol-maps.sh
> > +++ b/devtools/check-symbol-maps.sh
> > @@ -60,20 +60,6 @@ if [ -n "$local_miss_maps" ] ; then
> >      ret=1
> >  fi
> >
> > -find_empty_maps ()
> > -{
> > -    for map in $@ ; do
> > -        [ $(buildtools/map-list-symbol.sh $map | wc -l) != '0' ] || echo $map
> > -    done
> > -}
> > -
> > -empty_maps=$(find_empty_maps $@)
> > -if [ -n "$empty_maps" ] ; then
> > -    echo "Found empty maps:"
> > -    echo "$empty_maps"
> > -    ret=1
> > -fi
> > -
> >  find_bad_format_maps ()
> >  {
> >      abi_version=$(cut -d'.' -f 1 ABI_VERSION)
> > diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> > index 003bb49e04..7dcac7c8c9 100755
> > --- a/devtools/checkpatches.sh
> > +++ b/devtools/checkpatches.sh
> > @@ -33,7 +33,7 @@ VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,STRLCPY,\
> >  PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
> >  SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
> >  LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
> > -NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG"
> > +NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL"
> >  options="$options $DPDK_CHECKPATCH_OPTIONS"
> >
> >  print_usage () {
> > diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
> > index 88dd776b4c..addbb24b9e 100644
> > --- a/doc/guides/contributing/abi_versioning.rst
> > +++ b/doc/guides/contributing/abi_versioning.rst
> > @@ -58,12 +58,12 @@ persists over multiple releases.
> >
> >  .. code-block:: none
> >
> > - $ head ./lib/acl/version.map
> > + $ head ./build/lib/librte_acl_exports.map
>
> I must admit I'm not a fan of these long filenames. How about just
> "acl_exports.map"?

I used the same prefix as other targets in ninja, as I usually rely on
auto-completion.
But other than that, I don't mind.


>
> >   DPDK_21 {
> >          global:
> >   ...
> >
> > - $ head ./lib/eal/version.map
> > + $ head ./build/lib/librte_eal_exports.map
> >   DPDK_21 {
> >          global:
> >   ...
> > @@ -77,7 +77,7 @@ that library.
> >
> >  .. code-block:: none
> >
> > - $ head ./lib/acl/version.map
> > + $ head ./build/lib/librte_acl_exports.map
> >   DPDK_21 {
> >          global:
> >   ...
> > @@ -88,7 +88,7 @@ that library.
> >   } DPDK_21;
> >   ...
> >
> > - $ head ./lib/eal/version.map
> > + $ head ./build/lib/librte_eal_exports.map
> >   DPDK_21 {
> >          global:
> >   ...
> > @@ -100,12 +100,12 @@ how this may be done.
> >
> >  .. code-block:: none
> >
> > - $ head ./lib/acl/version.map
> > + $ head ./build/lib/librte_acl_exports.map
> >   DPDK_22 {
> >          global:
> >   ...
> >
> > - $ head ./lib/eal/version.map
> > + $ head ./build/lib/librte_eal_exports.map
> >   DPDK_22 {
> >          global:
> >   ...
> > @@ -134,8 +134,7 @@ linked to the DPDK.
> >
> >  To support backward compatibility the ``rte_function_versioning.h``
> >  header file provides macros to use when updating exported functions. These
> > -macros are used in conjunction with the ``version.map`` file for
> > -a given library to allow multiple versions of a symbol to exist in a shared
> > +macros allow multiple versions of a symbol to exist in a shared
> >  library so that older binaries need not be immediately recompiled.
> >
> >  The macros are:
> > @@ -169,6 +168,7 @@ Assume we have a function as follows
> >    * Create an acl context object for apps to
> >    * manipulate
> >    */
> > + RTE_EXPORT_SYMBOL(rte_acl_create)
> >   struct rte_acl_ctx *
> >   rte_acl_create(const struct rte_acl_param *param)
> >   {
> > @@ -187,6 +187,7 @@ private, is safe), but it also requires modifying the code as follows
> >    * Create an acl context object for apps to
> >    * manipulate
> >    */
> > + RTE_EXPORT_SYMBOL(rte_acl_create)
> >   struct rte_acl_ctx *
> >   rte_acl_create(const struct rte_acl_param *param, int debug)
> >   {
> > @@ -203,78 +204,16 @@ The addition of a parameter to the function is ABI breaking as the function is
> >  public, and existing application may use it in its current form. However, the
> >  compatibility macros in DPDK allow a developer to use symbol versioning so that
> >  multiple functions can be mapped to the same public symbol based on when an
> > -application was linked to it. To see how this is done, we start with the
> > -requisite libraries version map file. Initially the version map file for the acl
> > -library looks like this
> > +application was linked to it.
> >
> > -.. code-block:: none
> > -
> > -   DPDK_21 {
> > -        global:
> > -
> > -        rte_acl_add_rules;
> > -        rte_acl_build;
> > -        rte_acl_classify;
> > -        rte_acl_classify_alg;
> > -        rte_acl_classify_scalar;
> > -        rte_acl_create;
> > -        rte_acl_dump;
> > -        rte_acl_find_existing;
> > -        rte_acl_free;
> > -        rte_acl_ipv4vlan_add_rules;
> > -        rte_acl_ipv4vlan_build;
> > -        rte_acl_list_dump;
> > -        rte_acl_reset;
> > -        rte_acl_reset_rules;
> > -        rte_acl_set_ctx_classify;
> > -
> > -        local: *;
> > -   };
> > -
> > -This file needs to be modified as follows
> > -
> > -.. code-block:: none
> > -
> > -   DPDK_21 {
> > -        global:
> > -
> > -        rte_acl_add_rules;
> > -        rte_acl_build;
> > -        rte_acl_classify;
> > -        rte_acl_classify_alg;
> > -        rte_acl_classify_scalar;
> > -        rte_acl_create;
> > -        rte_acl_dump;
> > -        rte_acl_find_existing;
> > -        rte_acl_free;
> > -        rte_acl_ipv4vlan_add_rules;
> > -        rte_acl_ipv4vlan_build;
> > -        rte_acl_list_dump;
> > -        rte_acl_reset;
> > -        rte_acl_reset_rules;
> > -        rte_acl_set_ctx_classify;
> > -
> > -        local: *;
> > -   };
> > -
> > -   DPDK_22 {
> > -        global:
> > -        rte_acl_create;
> > -
> > -   } DPDK_21;
> > -
> > -The addition of the new block tells the linker that a new version node
> > -``DPDK_22`` is available, which contains the symbol rte_acl_create, and inherits
> > -the symbols from the DPDK_21 node. This list is directly translated into a
> > -list of exported symbols when DPDK is compiled as a shared library.
> > -
> > -Next, we need to specify in the code which function maps to the rte_acl_create
> > +We need to specify in the code which function maps to the rte_acl_create
> >  symbol at which versions.  First, at the site of the initial symbol definition,
> >  we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
> > -the function return type, and the function name and its arguments.
> > +the function return type, the function name and its arguments.
>
> Good fix, though technically not relevant to this patch.

Indeed.
>
> >
> >  .. code-block:: c
> >
> > + -RTE_EXPORT_SYMBOL(rte_acl_create)
> >   -struct rte_acl_ctx *
> >   -rte_acl_create(const struct rte_acl_param *param)
> >   +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
> > @@ -293,6 +232,7 @@ We have now mapped the original rte_acl_create symbol to the original function
> >
> >  Please see the section :ref:`Enabling versioning macros
> >  <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
> > +
>
> Ditto.
>
> >  Next, we need to create the new version of the symbol. We create a new
> >  function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
> >
> > @@ -312,9 +252,9 @@ The macro instructs the linker to create the new default symbol
> >  ``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
> >  (declared by the macro).
> >
> > -And that's it, on the next shared library rebuild, there will be two versions of
> > -rte_acl_create, an old DPDK_21 version, used by previously built applications,
> > -and a new DPDK_22 version, used by future built applications.
> > +And that's it. On the next shared library rebuild, there will be two versions of rte_acl_create,
> > +an old DPDK_21 version, used by previously built applications, and a new DPDK_22 version,
> > +used by future built applications.
>
> nit: not sure what others think, but "future built" sounds strange to me.
> How about "later built" or "newly built"?

newly sounds better to me.

>
> >
> >  .. note::
> >
> > @@ -364,6 +304,7 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
> >      * Create an acl context object for apps to
> >      * manipulate
> >      */
> <snip>
>


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [RFC v3 5/8] build: generate symbol maps
  2025-03-11  9:56 18%   ` [RFC v3 5/8] build: generate symbol maps David Marchand
  2025-03-13 17:26  0%     ` Bruce Richardson
@ 2025-03-14 15:27  0%     ` Andre Muezerie
  2025-03-14 15:51  4%       ` David Marchand
  1 sibling, 1 reply; 200+ results
From: Andre Muezerie @ 2025-03-14 15:27 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, thomas, bruce.richardson

On Tue, Mar 11, 2025 at 10:56:03AM +0100, David Marchand wrote:
> Rather than maintain a file in parallel of the code, symbols to be
> exported can be marked with a token RTE_EXPORT_*SYMBOL.
> 
> >From those marks, the build framework generates map files only for
> symbols actually compiled (which means that the WINDOWS_NO_EXPORT hack
> becomes unnecessary).
> 
> The build framework directly creates a map file in the format that the
> linker expects (rather than converting from GNU linker to MSVC linker).
> 
> Empty maps are allowed again as a replacement for drivers/version.map.
> 
> The symbol check is updated to only support the new format.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> Changes since RFC v2:
> - because of MSVC limitations wrt macro passed via cmdline,
>   used an internal header for defining RTE_EXPORT_* macros,
> - updated documentation and tooling,
> 
> ---
>  MAINTAINERS                                |   2 +
>  buildtools/gen-version-map.py              | 111 ++++++++++
>  buildtools/map-list-symbol.sh              |  10 +-
>  buildtools/meson.build                     |   1 +
>  config/meson.build                         |   2 +
>  config/rte_export.h                        |  16 ++
>  devtools/check-symbol-change.py            |  90 +++++++++
>  devtools/check-symbol-maps.sh              |  14 --
>  devtools/checkpatches.sh                   |   2 +-
>  doc/guides/contributing/abi_versioning.rst | 224 ++-------------------
>  drivers/meson.build                        |  94 +++++----
>  drivers/version.map                        |   3 -
>  lib/meson.build                            |  91 ++++++---
>  13 files changed, 371 insertions(+), 289 deletions(-)
>  create mode 100755 buildtools/gen-version-map.py
>  create mode 100644 config/rte_export.h
>  create mode 100755 devtools/check-symbol-change.py
>  delete mode 100644 drivers/version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 312e6fcee5..04772951d3 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
>  F: devtools/check-forbidden-tokens.awk
>  F: devtools/check-git-log.sh
>  F: devtools/check-spdx-tag.sh
> +F: devtools/check-symbol-change.py
>  F: devtools/check-symbol-change.sh
>  F: devtools/check-symbol-maps.sh
>  F: devtools/checkpatches.sh
> @@ -127,6 +128,7 @@ F: config/
>  F: buildtools/check-symbols.sh
>  F: buildtools/chkincs/
>  F: buildtools/call-sphinx-build.py
> +F: buildtools/gen-version-map.py
>  F: buildtools/get-cpu-count.py
>  F: buildtools/get-numa-count.py
>  F: buildtools/list-dir-globs.py
> diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
> new file mode 100755
> index 0000000000..b160aa828b
> --- /dev/null
> +++ b/buildtools/gen-version-map.py
> @@ -0,0 +1,111 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2024 Red Hat, Inc.

2025?
I appreciate that Python was chosen instead of sh/bash.

> +
> +"""Generate a version map file used by GNU or MSVC linker."""
> +
> +import re
> +import sys
> +
> +# From rte_export.h
> +export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+.[0-9]+)\)")
> +export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> +export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
> +# From rte_function_versioning.h
> +ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> +default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +
> +with open(sys.argv[2]) as f:
> +    abi = 'DPDK_{}'.format(re.match("([0-9]+).[0-9]", f.readline()).group(1))
> +
> +symbols = {}
> +
> +for file in sys.argv[4:]:
> +    with open(file, encoding="utf-8") as f:
> +        for ln in f.readlines():
> +            node = None
> +            symbol = None
> +            comment = None
> +            if export_exp_sym_regexp.match(ln):
> +                node = 'EXPERIMENTAL'
> +                symbol = export_exp_sym_regexp.match(ln).group(1)
> +                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
> +            elif export_int_sym_regexp.match(ln):
> +                node = 'INTERNAL'
> +                symbol = export_int_sym_regexp.match(ln).group(1)
> +            elif export_sym_regexp.match(ln):
> +                node = abi
> +                symbol = export_sym_regexp.match(ln).group(1)
> +            elif ver_sym_regexp.match(ln):
> +                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
> +                symbol = ver_sym_regexp.match(ln).group(2)
> +            elif ver_exp_sym_regexp.match(ln):
> +                node = 'EXPERIMENTAL'
> +                symbol = ver_exp_sym_regexp.match(ln).group(1)
> +            elif default_sym_regexp.match(ln):
> +                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
> +                symbol = default_sym_regexp.match(ln).group(2)
> +
> +            if not symbol:
> +                continue
> +
> +            if node not in symbols:
> +                symbols[node] = {}
> +            symbols[node][symbol] = comment
> +
> +if sys.argv[1] == 'msvc':
> +    with open(sys.argv[3], "w") as outfile:
> +        outfile.writelines(f"EXPORTS\n")
> +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> +            if key not in symbols:
> +                continue
> +            for symbol in sorted(symbols[key].keys()):
> +                outfile.writelines(f"\t{symbol}\n")
> +            del symbols[key]
> +else:
> +    with open(sys.argv[3], "w") as outfile:

Consider documenting sample output files, perhaps in this script itself, to make
it easier to understand what this script is doing and to highlight the differences
between the supported formats (msvc, etc.).

> +        local_token = False
> +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> +            if key not in symbols:
> +                continue
> +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> +            for symbol in sorted(symbols[key].keys()):
> +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> +                    prefix = '__emutls_v.'
> +                else:
> +                    prefix = ''
> +                outfile.writelines(f"\t{prefix}{symbol};")
> +                comment = symbols[key][symbol]
> +                if comment:
> +                    outfile.writelines(f"{comment}")
> +                outfile.writelines("\n")
> +            outfile.writelines("\n")
> +            if not local_token:
> +                outfile.writelines("\tlocal: *;\n")
> +                local_token = True
> +            outfile.writelines("};\n")
> +            del symbols[key]
> +        for key in sorted(symbols.keys()):
> +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> +            for symbol in sorted(symbols[key].keys()):
> +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> +                    prefix = '__emutls_v.'
> +                else:
> +                    prefix = ''
> +                outfile.writelines(f"\t{prefix}{symbol};")
> +                comment = symbols[key][symbol]
> +                if comment:
> +                    outfile.writelines(f"{comment}")
> +                outfile.writelines("\n")
> +            outfile.writelines(f"}} {abi};\n")
> +            if not local_token:
> +                outfile.writelines("\tlocal: *;\n")
> +                local_token = True
> +            del symbols[key]
> +        # No exported symbol, add a catch all
> +        if not local_token:
> +            outfile.writelines(f"{abi} {{\n")
> +            outfile.writelines("\tlocal: *;\n")
> +            local_token = True
> +            outfile.writelines("};\n")
> diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
> index eb98451d8e..0829df4be5 100755
> --- a/buildtools/map-list-symbol.sh
> +++ b/buildtools/map-list-symbol.sh
> @@ -62,10 +62,14 @@ for file in $@; do
>  		if (current_section == "") {
>  			next;
>  		}
> +		symbol_version = current_version
> +		if (/^[^}].*[^:*]; # added in /) {
> +			symbol_version = $5
> +		}
>  		if ("'$version'" != "") {
> -			if ("'$version'" == "unset" && current_version != "") {
> +			if ("'$version'" == "unset" && symbol_version != "") {
>  				next;
> -			} else if ("'$version'" != "unset" && "'$version'" != current_version) {
> +			} else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
>  				next;
>  			}
>  		}
> @@ -73,7 +77,7 @@ for file in $@; do
>  		if ("'$symbol'" == "all" || $1 == "'$symbol'") {
>  			ret = 0;
>  			if ("'$quiet'" == "") {
> -				print "'$file' "current_section" "$1" "current_version;
> +				print "'$file' "current_section" "$1" "symbol_version;
>  			}
>  			if ("'$symbol'" != "all") {
>  				exit 0;
> diff --git a/buildtools/meson.build b/buildtools/meson.build
> index 4e2c1217a2..b745e9afa4 100644
> --- a/buildtools/meson.build
> +++ b/buildtools/meson.build
> @@ -16,6 +16,7 @@ else
>      py3 = ['meson', 'runpython']
>  endif
>  echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
> +gen_version_map = py3 + files('gen-version-map.py')
>  list_dir_globs = py3 + files('list-dir-globs.py')
>  map_to_win_cmd = py3 + files('map_to_win.py')
>  sphinx_wrapper = py3 + files('call-sphinx-build.py')
> diff --git a/config/meson.build b/config/meson.build
> index f31fef216c..54657055fb 100644
> --- a/config/meson.build
> +++ b/config/meson.build
> @@ -303,8 +303,10 @@ endif
>  # add -include rte_config to cflags
>  if is_ms_compiler
>      add_project_arguments('/FI', 'rte_config.h', language: 'c')
> +    add_project_arguments('/FI', 'rte_export.h', language: 'c')
>  else
>      add_project_arguments('-include', 'rte_config.h', language: 'c')
> +    add_project_arguments('-include', 'rte_export.h', language: 'c')
>  endif
>  
>  # enable extra warnings and disable any unwanted warnings
> diff --git a/config/rte_export.h b/config/rte_export.h
> new file mode 100644
> index 0000000000..83d871fe11
> --- /dev/null
> +++ b/config/rte_export.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2025 Red Hat, Inc.
> + */
> +
> +#ifndef RTE_EXPORT_H
> +#define RTE_EXPORT_H
> +
> +/* *Internal* macros for exporting symbols, used by the build system.
> + * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
> + * version this symbol was introduced in.
> + */
> +#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
> +#define RTE_EXPORT_INTERNAL_SYMBOL(a)
> +#define RTE_EXPORT_SYMBOL(a)
> +
> +#endif /* RTE_EXPORT_H */
> diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
> new file mode 100755
> index 0000000000..09709e4f06
> --- /dev/null
> +++ b/devtools/check-symbol-change.py
> @@ -0,0 +1,90 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2025 Red Hat, Inc.
> +
> +"""Check exported symbols change in a patch."""
> +
> +import re
> +import sys
> +
> +file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
> +# From rte_export.h
> +export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
> +export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> +export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
> +# TODO, handle versioned symbols from rte_function_versioning.h
> +# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> +# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +
> +symbols = {}
> +
> +for file in sys.argv[1:]:
> +    with open(file, encoding="utf-8") as f:
> +        for ln in f.readlines():
> +            if file_header_regexp.match(ln):
> +                if file_header_regexp.match(ln).group(2) == "lib":
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> +                elif file_header_regexp.match(ln).group(3) == "intel":
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
> +                else:
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> +
> +                if lib not in symbols:
> +                    symbols[lib] = {}
> +                continue
> +
> +            if export_exp_sym_regexp.match(ln):
> +                symbol = export_exp_sym_regexp.match(ln).group(1)
> +                node = 'EXPERIMENTAL'
> +            elif export_int_sym_regexp.match(ln):
> +                node = 'INTERNAL'
> +                symbol = export_int_sym_regexp.match(ln).group(1)
> +            elif export_sym_regexp.match(ln):
> +                symbol = export_sym_regexp.match(ln).group(1)
> +                node = 'stable'
> +            else:
> +                continue
> +
> +            if symbol not in symbols[lib]:
> +                symbols[lib][symbol] = {}
> +            added = ln[0] == '+'
> +            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
> +                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
> +            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
> +                print(f"{symbol} in {lib} was found in multiple ABIs, please check.")
> +            if added:
> +                symbols[lib][symbol]['added'] = node
> +            else:
> +                symbols[lib][symbol]['removed'] = node
> +
> +    for lib in sorted(symbols.keys()):
> +        error = False
> +        for symbol in sorted(symbols[lib].keys()):
> +            if 'removed' not in symbols[lib][symbol]:
> +                # Symbol addition
> +                node = symbols[lib][symbol]['added']
> +                if node == 'stable':
> +                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
> +                    error = True
> +                else:
> +                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
> +                continue
> +
> +            if 'added' not in symbols[lib][symbol]:
> +                # Symbol removal
> +                node = symbols[lib][symbol]['removed']
> +                if node == 'stable':
> +                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")

Some people would argue that WARN instead of INFO is more appropriate here, because
some attention is needed from the user; INFO is often just ignored.

> +                    print(f"Please check it has gone through the deprecation process.")
> +                continue
> +
> +            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
> +                # Symbol was moved around
> +                continue
> +
> +            # Symbol modifications
> +            added = symbols[lib][symbol]['added']
> +            removed = symbols[lib][symbol]['removed']
> +            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")

Perhaps use WARN instead of INFO.

> +            print(f"Please check it has gone through the deprecation process.")
> diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
> index 6121f78ec6..fcd3931e5d 100755
> --- a/devtools/check-symbol-maps.sh
> +++ b/devtools/check-symbol-maps.sh
> @@ -60,20 +60,6 @@ if [ -n "$local_miss_maps" ] ; then
>      ret=1
>  fi
>  
> -find_empty_maps ()
> -{
> -    for map in $@ ; do
> -        [ $(buildtools/map-list-symbol.sh $map | wc -l) != '0' ] || echo $map
> -    done
> -}
> -
> -empty_maps=$(find_empty_maps $@)
> -if [ -n "$empty_maps" ] ; then
> -    echo "Found empty maps:"
> -    echo "$empty_maps"
> -    ret=1
> -fi
> -
>  find_bad_format_maps ()
>  {
>      abi_version=$(cut -d'.' -f 1 ABI_VERSION)
> diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> index 003bb49e04..7dcac7c8c9 100755
> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> @@ -33,7 +33,7 @@ VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,STRLCPY,\
>  PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
>  SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
>  LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
> -NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG"
> +NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL"
>  options="$options $DPDK_CHECKPATCH_OPTIONS"
>  
>  print_usage () {
> diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
> index 88dd776b4c..addbb24b9e 100644
> --- a/doc/guides/contributing/abi_versioning.rst
> +++ b/doc/guides/contributing/abi_versioning.rst
> @@ -58,12 +58,12 @@ persists over multiple releases.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map

I like the new file names, they are much better.

>   DPDK_21 {
>          global:
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -77,7 +77,7 @@ that library.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -88,7 +88,7 @@ that library.
>   } DPDK_21;
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -100,12 +100,12 @@ how this may be done.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map
>   DPDK_22 {
>          global:
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_22 {
>          global:
>   ...
> @@ -134,8 +134,7 @@ linked to the DPDK.
>  
>  To support backward compatibility the ``rte_function_versioning.h``
>  header file provides macros to use when updating exported functions. These
> -macros are used in conjunction with the ``version.map`` file for
> -a given library to allow multiple versions of a symbol to exist in a shared
> +macros allow multiple versions of a symbol to exist in a shared
>  library so that older binaries need not be immediately recompiled.
>  
>  The macros are:
> @@ -169,6 +168,7 @@ Assume we have a function as follows
>    * Create an acl context object for apps to
>    * manipulate
>    */
> + RTE_EXPORT_SYMBOL(rte_acl_create)
>   struct rte_acl_ctx *
>   rte_acl_create(const struct rte_acl_param *param)
>   {
> @@ -187,6 +187,7 @@ private, is safe), but it also requires modifying the code as follows
>    * Create an acl context object for apps to
>    * manipulate
>    */
> + RTE_EXPORT_SYMBOL(rte_acl_create)
>   struct rte_acl_ctx *
>   rte_acl_create(const struct rte_acl_param *param, int debug)
>   {
> @@ -203,78 +204,16 @@ The addition of a parameter to the function is ABI breaking as the function is
>  public, and existing application may use it in its current form. However, the
>  compatibility macros in DPDK allow a developer to use symbol versioning so that
>  multiple functions can be mapped to the same public symbol based on when an
> -application was linked to it. To see how this is done, we start with the
> -requisite libraries version map file. Initially the version map file for the acl
> -library looks like this
> +application was linked to it.
>  
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_create;
> -        rte_acl_dump;
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> -   };
> -
> -This file needs to be modified as follows
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_create;
> -        rte_acl_dump;
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> -   };
> -
> -   DPDK_22 {
> -        global:
> -        rte_acl_create;
> -
> -   } DPDK_21;
> -
> -The addition of the new block tells the linker that a new version node
> -``DPDK_22`` is available, which contains the symbol rte_acl_create, and inherits
> -the symbols from the DPDK_21 node. This list is directly translated into a
> -list of exported symbols when DPDK is compiled as a shared library.
> -
> -Next, we need to specify in the code which function maps to the rte_acl_create
> +We need to specify in the code which function maps to the rte_acl_create
>  symbol at which versions.  First, at the site of the initial symbol definition,
>  we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
> -the function return type, and the function name and its arguments.
> +the function return type, the function name and its arguments.
>  
>  .. code-block:: c
>  
> + -RTE_EXPORT_SYMBOL(rte_acl_create)
>   -struct rte_acl_ctx *
>   -rte_acl_create(const struct rte_acl_param *param)
>   +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
> @@ -293,6 +232,7 @@ We have now mapped the original rte_acl_create symbol to the original function
>  
>  Please see the section :ref:`Enabling versioning macros
>  <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
> +
>  Next, we need to create the new version of the symbol. We create a new
>  function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
>  
> @@ -312,9 +252,9 @@ The macro instructs the linker to create the new default symbol
>  ``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
>  (declared by the macro).
>  
> -And that's it, on the next shared library rebuild, there will be two versions of
> -rte_acl_create, an old DPDK_21 version, used by previously built applications,
> -and a new DPDK_22 version, used by future built applications.
> +And that's it. On the next shared library rebuild, there will be two versions of rte_acl_create,
> +an old DPDK_21 version, used by previously built applications, and a new DPDK_22 version,
> +used by future built applications.
>  
>  .. note::
>  
> @@ -364,6 +304,7 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
>      * Create an acl context object for apps to
>      * manipulate
>      */
> +   RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_acl_create)
>     __rte_experimental
>     struct rte_acl_ctx *
>     rte_acl_create(const struct rte_acl_param *param)
> @@ -371,27 +312,8 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
>     ...
>     }
>  
> -In the map file, experimental symbols are listed as part of the ``EXPERIMENTAL``
> -version node.
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -        ...
> -
> -        local: *;
> -   };
> -
> -   EXPERIMENTAL {
> -        global:
> -
> -        rte_acl_create;
> -   };
> -
>  When we promote the symbol to the stable ABI, we simply strip the
> -``__rte_experimental`` annotation from the function and move the symbol from the
> -``EXPERIMENTAL`` node, to the node of the next major ABI version as follow.
> +``__rte_experimental`` annotation from the function.
>  
>  .. code-block:: c
>  
> @@ -399,31 +321,13 @@ When we promote the symbol to the stable ABI, we simply strip the
>      * Create an acl context object for apps to
>      * manipulate
>      */
> +   RTE_EXPORT_SYMBOL(rte_acl_create)
>     struct rte_acl_ctx *
>     rte_acl_create(const struct rte_acl_param *param)
>     {
>            ...
>     }
>  
> -We then update the map file, adding the symbol ``rte_acl_create``
> -to the ``DPDK_22`` version node.
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -        ...
> -
> -        local: *;
> -   };
> -
> -   DPDK_22 {
> -        global:
> -
> -        rte_acl_create;
> -   } DPDK_21;
> -
> -
>  Although there are strictly no guarantees or commitments associated with
>  :ref:`experimental symbols <experimental_apis>`, a maintainer may wish to offer
>  an alias to experimental. The process to add an alias to experimental,
> @@ -452,30 +356,6 @@ and ``DPDK_22`` version nodes.
>        return rte_acl_create(param);
>     }
>  
> -In the map file, we map the symbol to both the ``EXPERIMENTAL``
> -and ``DPDK_22`` version nodes.
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -        ...
> -
> -        local: *;
> -   };
> -
> -   DPDK_22 {
> -        global:
> -
> -        rte_acl_create;
> -   } DPDK_21;
> -
> -   EXPERIMENTAL {
> -        global:
> -
> -        rte_acl_create;
> -   };
> -
>  .. _abi_deprecation:
>  
>  Deprecating part of a public API
> @@ -484,38 +364,7 @@ ________________________________
>  Lets assume that you've done the above updates, and in preparation for the next
>  major ABI version you decide you would like to retire the old version of the
>  function. After having gone through the ABI deprecation announcement process,
> -removal is easy. Start by removing the symbol from the requisite version map
> -file:
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_dump;
> - -      rte_acl_create
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> -   };
> -
> -   DPDK_22 {
> -        global:
> -        rte_acl_create;
> -   } DPDK_21;
> -
> +removal is easy.
>  
>  Next remove the corresponding versioned export.
>  
> @@ -539,36 +388,7 @@ of a major ABI version. If a version node completely specifies an API, then
>  removing part of it, typically makes it incomplete. In those cases it is better
>  to remove the entire node.
>  
> -To do this, start by modifying the version map file, such that all symbols from
> -the node to be removed are merged into the next node in the map.
> -
> -In the case of our map above, it would transform to look as follows
> -
> -.. code-block:: none
> -
> -   DPDK_22 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_dump;
> -        rte_acl_create
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> - };
> -
> -Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
> +Any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
>  updated to point to the new version node in any header files for all affected
>  symbols.
>  
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 05391a575d..c8bc556f1a 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -245,14 +245,14 @@ foreach subpath:subdirs
>                  dependencies: static_deps,
>                  c_args: cflags)
>          objs += tmp_lib.extract_all_objects(recursive: true)
> -        sources = custom_target(out_filename,
> +        sources_pmd_info = custom_target(out_filename,
>                  command: [pmdinfo, tmp_lib.full_path(), '@OUTPUT@', pmdinfogen],
>                  output: out_filename,
>                  depends: [tmp_lib])
>  
>          # now build the static driver
>          static_lib = static_library(lib_name,
> -                sources,
> +                sources_pmd_info,
>                  objects: objs,
>                  include_directories: includes,
>                  dependencies: static_deps,
> @@ -262,48 +262,72 @@ foreach subpath:subdirs
>          # now build the shared driver
>          version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), drv_path)
>  
> -        lk_deps = []
> -        lk_args = []
>          if not fs.is_file(version_map)
> -            version_map = '@0@/version.map'.format(meson.current_source_dir())
> -            lk_deps += [version_map]
> -        else
> -            lk_deps += [version_map]
> -            if not is_windows and developer_mode
> -                # on unix systems check the output of the
> -                # check-symbols.sh script, using it as a
> -                # dependency of the .so build
> -                lk_deps += custom_target(lib_name + '.sym_chk',
> -                        command: [check_symbols, version_map, '@INPUT@'],
> -                        capture: true,
> -                        input: static_lib,
> -                        output: lib_name + '.sym_chk')
> +            if is_ms_linker
> +                link_mode = 'msvc'
> +            elif is_windows
> +                link_mode = 'mingw'
> +            else
> +                link_mode = 'gnu'
>              endif
> -        endif
> +            version_map = custom_target(lib_name + '_map',
> +                    command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
> +                    input: sources,
> +                    output: 'lib@0@_exports.map'.format(lib_name))
> +            version_map_path = version_map.full_path()
> +            version_map_dep = [version_map]
> +            lk_deps = [version_map]
>  
> -        if is_windows
>              if is_ms_linker
> -                def_file = custom_target(lib_name + '_def',
> -                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> -                        input: version_map,
> -                        output: '@0@_exports.def'.format(lib_name))
> -                lk_deps += [def_file]
> -
> -                lk_args = ['-Wl,/def:' + def_file.full_path()]
> +                if is_ms_compiler
> +                    lk_args = ['/def:' + version_map.full_path()]
> +                else
> +                    lk_args = ['-Wl,/def:' + version_map.full_path()]
> +                endif
>              else
> -                mingw_map = custom_target(lib_name + '_mingw',
> -                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> -                        input: version_map,
> -                        output: '@0@_mingw.map'.format(lib_name))
> -                lk_deps += [mingw_map]
> -
> -                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
> +                lk_args = ['-Wl,--version-script=' + version_map.full_path()]
>              endif
>          else
> -            lk_args = ['-Wl,--version-script=' + version_map]
> +            version_map_path = version_map
> +            version_map_dep = []
> +            lk_deps = [version_map]
> +
> +            if is_windows
> +                if is_ms_linker
> +                    def_file = custom_target(lib_name + '_def',
> +                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> +                            input: version_map,
> +                            output: '@0@_exports.def'.format(lib_name))
> +                    lk_deps += [def_file]
> +
> +                    lk_args = ['-Wl,/def:' + def_file.full_path()]
> +                else
> +                    mingw_map = custom_target(lib_name + '_mingw',
> +                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> +                            input: version_map,
> +                            output: '@0@_mingw.map'.format(lib_name))
> +                    lk_deps += [mingw_map]
> +
> +                    lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
> +                endif
> +            else
> +                lk_args = ['-Wl,--version-script=' + version_map]
> +            endif
> +        endif
> +
> +        if not is_windows and developer_mode
> +            # on unix systems check the output of the
> +            # check-symbols.sh script, using it as a
> +            # dependency of the .so build
> +            lk_deps += custom_target(lib_name + '.sym_chk',
> +                    command: [check_symbols, version_map_path, '@INPUT@'],
> +                    capture: true,
> +                    input: static_lib,
> +                    output: lib_name + '.sym_chk',
> +                    depends: version_map_dep)
>          endif
>  
> -        shared_lib = shared_library(lib_name, sources,
> +        shared_lib = shared_library(lib_name, sources_pmd_info,
>                  objects: objs,
>                  include_directories: includes,
>                  dependencies: shared_deps,
> diff --git a/drivers/version.map b/drivers/version.map
> deleted file mode 100644
> index 17cc97bda6..0000000000
> --- a/drivers/version.map
> +++ /dev/null
> @@ -1,3 +0,0 @@
> -DPDK_25 {
> -	local: *;
> -};
> diff --git a/lib/meson.build b/lib/meson.build
> index ce92cb5537..b6bac02b48 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -1,6 +1,7 @@
>  # SPDX-License-Identifier: BSD-3-Clause
>  # Copyright(c) 2017-2019 Intel Corporation
>  
> +fs = import('fs')
>  
>  # process all libraries equally, as far as possible
>  # "core" libs first, then others alphabetically as far as possible
> @@ -254,42 +255,60 @@ foreach l:libraries
>              include_directories: includes,
>              dependencies: static_deps)
>  
> -    if not use_function_versioning or is_windows
> -        # use pre-build objects to build shared lib
> -        sources = []
> -        objs += static_lib.extract_all_objects(recursive: false)
> -    else
> -        # for compat we need to rebuild with
> -        # RTE_BUILD_SHARED_LIB defined
> -        cflags += '-DRTE_BUILD_SHARED_LIB'
> -    endif
> -
> -    version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
> -    lk_deps = [version_map]
> -
> -    if is_ms_linker
> -        def_file = custom_target(libname + '_def',
> -                command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> -                input: version_map,
> -                output: '@0@_exports.def'.format(libname))
> -        lk_deps += [def_file]
> +    if not fs.is_file('@0@/@1@/version.map'.format(meson.current_source_dir(), l))
> +        if is_ms_linker
> +            link_mode = 'msvc'
> +        elif is_windows
> +            link_mode = 'mingw'
> +        else
> +            link_mode = 'gnu'
> +        endif
> +        version_map = custom_target(libname + '_map',
> +                command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
> +                input: sources,
> +                output: 'lib@0@_exports.map'.format(libname))
> +        version_map_path = version_map.full_path()
> +        version_map_dep = [version_map]
> +        lk_deps = [version_map]
>  
> -        if is_ms_compiler
> -            lk_args = ['/def:' + def_file.full_path()]
> +        if is_ms_linker
> +            if is_ms_compiler
> +                lk_args = ['/def:' + version_map.full_path()]
> +            else
> +                lk_args = ['-Wl,/def:' + version_map.full_path()]
> +            endif
>          else
> -            lk_args = ['-Wl,/def:' + def_file.full_path()]
> +            lk_args = ['-Wl,--version-script=' + version_map.full_path()]
>          endif
>      else
> -        if is_windows
> -            mingw_map = custom_target(libname + '_mingw',
> +        version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
> +        version_map_path = version_map
> +        version_map_dep = []
> +        lk_deps = [version_map]
> +        if is_ms_linker
> +            def_file = custom_target(libname + '_def',
>                      command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
>                      input: version_map,
> -                    output: '@0@_mingw.map'.format(libname))
> -            lk_deps += [mingw_map]
> +                    output: '@0@_exports.def'.format(libname))
> +            lk_deps += [def_file]
>  
> -            lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
> +            if is_ms_compiler
> +                lk_args = ['/def:' + def_file.full_path()]
> +            else
> +                lk_args = ['-Wl,/def:' + def_file.full_path()]
> +            endif
>          else
> -            lk_args = ['-Wl,--version-script=' + version_map]
> +            if is_windows
> +                mingw_map = custom_target(libname + '_mingw',
> +                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
> +                        input: version_map,
> +                        output: '@0@_mingw.map'.format(libname))
> +                lk_deps += [mingw_map]
> +
> +                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
> +            else
> +                lk_args = ['-Wl,--version-script=' + version_map]
> +            endif
>          endif
>      endif
>  
> @@ -298,11 +317,21 @@ foreach l:libraries
>          # check-symbols.sh script, using it as a
>          # dependency of the .so build
>          lk_deps += custom_target(name + '.sym_chk',
> -                command: [check_symbols,
> -                    version_map, '@INPUT@'],
> +                command: [check_symbols, version_map_path, '@INPUT@'],
>                  capture: true,
>                  input: static_lib,
> -                output: name + '.sym_chk')
> +                output: name + '.sym_chk',
> +                depends: version_map_dep)
> +    endif
> +
> +    if not use_function_versioning or is_windows
> +        # use pre-build objects to build shared lib
> +        sources = []
> +        objs += static_lib.extract_all_objects(recursive: false)
> +    else
> +        # for compat we need to rebuild with
> +        # RTE_BUILD_SHARED_LIB defined
> +        cflags += '-DRTE_BUILD_SHARED_LIB'
>      endif
>  
>      shared_lib = shared_library(libname,
> -- 
> 2.48.1

^ permalink raw reply	[relevance 0%]

* [PATCH] raw/cnxk_gpio: switch to character based GPIO interface
@ 2025-03-14 12:57  1% Tomasz Duszynski
  0 siblings, 0 replies; 200+ results
From: Tomasz Duszynski @ 2025-03-14 12:57 UTC (permalink / raw)
  To: dev, Jakub Palider, Tomasz Duszynski; +Cc: jerinj

The direct passthrough interrupt mechanism, which allowed bypassing the
kernel, was obscure and is no longer supported, so this driver does not
work with the latest SDK kernels. Additionally, the sysfs GPIO control
interface has been deprecated by the Linux kernel itself.

This change therefore updates the PMD to use the current character-based
GPIO interface, ensuring compatibility with current kernel standards
while improving maintainability and security.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
---
 doc/guides/rawdevs/cnxk_gpio.rst           |  37 +-
 drivers/raw/cnxk_gpio/cnxk_gpio.c          | 518 ++++++++++++---------
 drivers/raw/cnxk_gpio/cnxk_gpio.h          |  17 +-
 drivers/raw/cnxk_gpio/cnxk_gpio_irq.c      | 216 ---------
 drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c | 234 ++--------
 drivers/raw/cnxk_gpio/meson.build          |   1 -
 drivers/raw/cnxk_gpio/rte_pmd_cnxk_gpio.h  |  57 ++-
 7 files changed, 427 insertions(+), 653 deletions(-)
 delete mode 100644 drivers/raw/cnxk_gpio/cnxk_gpio_irq.c

diff --git a/doc/guides/rawdevs/cnxk_gpio.rst b/doc/guides/rawdevs/cnxk_gpio.rst
index 954d3b8905..8084dd4adb 100644
--- a/doc/guides/rawdevs/cnxk_gpio.rst
+++ b/doc/guides/rawdevs/cnxk_gpio.rst
@@ -6,31 +6,20 @@ Marvell CNXK GPIO Driver
 
 CNXK GPIO PMD configures and manages GPIOs available on the system using
 standard enqueue/dequeue mechanism offered by raw device abstraction. PMD relies
-both on standard sysfs GPIO interface provided by the Linux kernel and GPIO
-kernel driver custom interface allowing one to install userspace interrupt
-handlers.
+on standard kernel GPIO character device interface.
 
 Features
 --------
 
 Following features are available:
 
-- export/unexport a GPIO
-- read/write specific value from/to exported GPIO
+- read/write specific value from/to GPIO
 - set GPIO direction
 - set GPIO edge that triggers interrupt
 - set GPIO active low
 - register interrupt handler for specific GPIO
 - multiprocess aware
 
-Requirements
-------------
-
-PMD relies on modified kernel GPIO driver which exposes ``ioctl()`` interface
-for installing interrupt handlers for low latency signal processing.
-
-Driver is shipped with Marvell SDK.
-
 Limitations
 -----------
 
@@ -43,20 +32,20 @@ Device Setup
 CNXK GPIO PMD binds to virtual device which gets created by passing
 `--vdev=cnxk_gpio,gpiochip=<number>` command line to EAL. `gpiochip` parameter
 tells PMD which GPIO controller should be used. Available controllers are
-available under `/sys/class/gpio`. For further details on how Linux represents
-GPIOs in userspace please refer to
-`sysfs.txt <https://www.kernel.org/doc/Documentation/gpio/sysfs.txt>`_.
+`/dev/gpiochipN` character devices. For further details on
+how Linux represents GPIOs in userspace please refer to
+`gpio-cdev <https://www.kernel.org/doc/Documentation/ABI/testing/gpio-cdev>`_.
 
 If `gpiochip=<number>` is omitted then the first gpiochip from the
 alphabetically sorted list of available gpiochips is used.
 
 .. code-block:: console
 
-   $ ls /sys/class/gpio
-   export gpiochip448 unexport
+   $ ls /dev/gpiochip*
+   /dev/gpiochip0
 
 In above scenario only one GPIO controller is present hence
-`--vdev=cnxk_gpio,gpiochip=448` should be passed to EAL.
+`--vdev=cnxk_gpio,gpiochip=0` should be passed to EAL.
 
 Before performing actual data transfer one needs to call
 ``rte_rawdev_queue_count()`` followed by ``rte_rawdev_queue_conf_get()``. The
@@ -65,7 +54,7 @@ being controllable or not. Thus it is user responsibility to pick the proper
 ones. The latter call simply returns queue capacity.
 
 In order to allow using only subset of available GPIOs `allowlist` PMD param may
-be used. For example passing `--vdev=cnxk_gpio,gpiochip=448,allowlist=[0,1,2,3]`
+be used. For example passing `--vdev=cnxk_gpio,gpiochip=0,allowlist=[0,1,2,3]`
 to EAL will deny using all GPIOs except those specified explicitly in the
 `allowlist`.
 
@@ -179,12 +168,12 @@ Request interrupt
 
 Message is used to install custom interrupt handler.
 
-Message must have type set to ``CNXK_GPIO_MSG_TYPE_REGISTER_IRQ``.
+Message must have type set to ``CNXK_GPIO_MSG_TYPE_REGISTER_IRQ2``.
 
-Payload needs to be set to ``struct cnxk_gpio_irq`` which describes interrupt
+Payload needs to be set to ``struct cnxk_gpio_irq2`` which describes interrupt
 being requested.
 
-Consider using ``rte_pmd_gpio_register_gpio()`` wrapper.
+Consider using ``rte_pmd_gpio_register_irq2()`` wrapper.
 
 Free interrupt
 ~~~~~~~~~~~~~~
@@ -193,7 +182,7 @@ Message is used to remove installed interrupt handler.
 
 Message must have type set to ``CNXK_GPIO_MSG_TYPE_UNREGISTER_IRQ``.
 
-Consider using ``rte_pmd_gpio_unregister_gpio()`` wrapper.
+Consider using ``rte_pmd_gpio_unregister_irq()`` wrapper.
 
 Self test
 ---------
diff --git a/drivers/raw/cnxk_gpio/cnxk_gpio.c b/drivers/raw/cnxk_gpio/cnxk_gpio.c
index 329ac28a27..800f2cf1c3 100644
--- a/drivers/raw/cnxk_gpio/cnxk_gpio.c
+++ b/drivers/raw/cnxk_gpio/cnxk_gpio.c
@@ -3,11 +3,19 @@
  */
 
 #include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/gpio.h>
+#include <regex.h>
 #include <string.h>
+#include <sys/ioctl.h>
 #include <sys/stat.h>
+#include <sys/types.h>
 
 #include <bus_vdev_driver.h>
 #include <rte_eal.h>
+#include <rte_errno.h>
+#include <rte_interrupts.h>
 #include <rte_kvargs.h>
 #include <rte_lcore.h>
 #include <rte_rawdev_pmd.h>
@@ -17,9 +25,8 @@
 #include "cnxk_gpio.h"
 #include "rte_pmd_cnxk_gpio.h"
 
-#define CNXK_GPIO_BUFSZ 128
-#define CNXK_GPIO_CLASS_PATH "/sys/class/gpio"
 #define CNXK_GPIO_PARAMS_MZ_NAME "cnxk_gpio_params_mz"
+#define CNXK_GPIO_INVALID_FD (-1)
 
 struct cnxk_gpio_params {
 	unsigned int num;
@@ -40,6 +47,31 @@ cnxk_gpio_format_name(char *name, size_t len)
 	snprintf(name, len, "cnxk_gpio");
 }
 
+static int
+cnxk_gpio_ioctl(struct cnxk_gpio *gpio, unsigned long cmd, void *arg)
+{
+	return ioctl(gpio->fd, cmd, arg) ? -errno : 0;
+}
+
+static int
+cnxk_gpio_gpiochip_ioctl(struct cnxk_gpiochip *gpiochip, unsigned long cmd, void *arg)
+{
+	char path[PATH_MAX];
+	int ret = 0, fd;
+
+	snprintf(path, sizeof(path), "/dev/gpiochip%d", gpiochip->num);
+	fd = open(path, O_RDONLY);
+	if (fd == -1)
+		return -errno;
+
+	if (ioctl(fd, cmd, arg))
+		ret = -errno;
+
+	close(fd);
+
+	return ret;
+}
+
 static int
 cnxk_gpio_filter_gpiochip(const struct dirent *dirent)
 {
@@ -54,8 +86,7 @@ cnxk_gpio_set_defaults(struct cnxk_gpio_params *params)
 	struct dirent **namelist;
 	int ret = 0, n;
 
-	n = scandir(CNXK_GPIO_CLASS_PATH, &namelist, cnxk_gpio_filter_gpiochip,
-		    alphasort);
+	n = scandir("/dev", &namelist, cnxk_gpio_filter_gpiochip, alphasort);
 	if (n < 0 || n == 0)
 		return -ENODEV;
 
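
The default-controller selection above keeps `scandir()`'s `alphasort` ordering, now applied to `/dev` entries. A hedged Python sketch of the same selection logic (the helper name and input list are illustrative, not part of the driver):

```python
import re

def default_gpiochip(dev_entries):
    """Pick the default controller the way the PMD does: filter a /dev
    listing down to gpiochipN names, sort alphabetically (alphasort),
    and return the first entry's chip number."""
    chips = sorted(e for e in dev_entries if re.fullmatch(r"gpiochip[0-9]+", e))
    if not chips:
        raise FileNotFoundError("no gpiochip found")  # mirrors -ENODEV
    return int(chips[0][len("gpiochip"):])
```

Note that the ordering is alphabetical, not numeric: with `gpiochip2` and `gpiochip10` both present, `gpiochip10` sorts first.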
@@ -143,7 +174,7 @@ cnxk_gpio_parse_arg(struct rte_kvargs *kvlist, const char *arg, arg_handler_t ha
 static int
 cnxk_gpio_parse_store_args(struct cnxk_gpio_params **params, const char *args)
 {
-	size_t len = sizeof(**params);
+	size_t len = sizeof(**params) + 1;
 	const char *allowlist = NULL;
 	struct rte_kvargs *kvlist;
 	int ret;
@@ -163,11 +194,13 @@ cnxk_gpio_parse_store_args(struct cnxk_gpio_params **params, const char *args)
 
 	ret = cnxk_gpio_parse_arg(kvlist, CNXK_GPIO_ARG_ALLOWLIST, cnxk_gpio_parse_arg_allowlist,
 				  &allowlist);
-	if (ret < 0)
+	if (ret < 0) {
+		ret = -EINVAL;
 		goto out;
+	}
 
 	if (allowlist)
-		len += strlen(allowlist) + 1;
+		len += strlen(allowlist);
 
 	*params = cnxk_gpio_params_reserve(len);
 	if (!(*params)) {
@@ -175,7 +208,8 @@ cnxk_gpio_parse_store_args(struct cnxk_gpio_params **params, const char *args)
 		goto out;
 	}
 
-	strlcpy((*params)->allowlist, allowlist, strlen(allowlist) + 1);
+	if (allowlist)
+		strlcpy((*params)->allowlist, allowlist, strlen(allowlist) + 1);
 
 	ret = cnxk_gpio_parse_arg(kvlist, CNXK_GPIO_ARG_GPIOCHIP, cnxk_gpio_parse_arg_gpiochip,
 				  &(*params)->num);
@@ -188,6 +222,24 @@ cnxk_gpio_parse_store_args(struct cnxk_gpio_params **params, const char *args)
 	return ret;
 }
 
+static bool
+cnxk_gpio_allowlist_valid(const char *allowlist)
+{
+	bool ret = false;
+	regex_t regex;
+
+	/* [gpio0<,gpio1,...,gpioN>], where '<...>' is optional part */
+	if (regcomp(&regex, "^\\[[0-9]+(,[0-9]+)*\\]$", REG_EXTENDED))
+		return ret;
+
+	if (!regexec(&regex, allowlist, 0, NULL, 0))
+		ret = true;
+
+	regfree(&regex);
+
+	return ret;
+}
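
The allowlist format accepted by `cnxk_gpio_allowlist_valid()` can be mirrored in a quick Python sketch (the function name here is illustrative; the pattern is the one the PMD compiles with `REG_EXTENDED` above):

```python
import re

# "[gpio0<,gpio1,...,gpioN>]" -- a bracketed, comma-separated list of
# decimal GPIO numbers, where everything past the first number is optional
_ALLOWLIST_RE = re.compile(r"^\[[0-9]+(,[0-9]+)*\]$")

def allowlist_valid(allowlist):
    return _ALLOWLIST_RE.match(allowlist) is not None
```

So `[0,1,2,3]` and `[7]` pass, while an empty list `[]`, a trailing comma, or an unbracketed list are rejected.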
+
 static int
 cnxk_gpio_parse_allowlist(struct cnxk_gpiochip *gpiochip, char *allowlist)
 {
@@ -199,6 +251,17 @@ cnxk_gpio_parse_allowlist(struct cnxk_gpiochip *gpiochip, char *allowlist)
 	if (!list)
 		return -ENOMEM;
 
+	/* no gpios provided so allow all */
+	if (!*allowlist) {
+		for (i = 0; i < gpiochip->num_gpios; i++)
+			list[queue++] = i;
+
+		goto out_done;
+	}
+
+	if (!cnxk_gpio_allowlist_valid(allowlist))
+		return -EINVAL;
+
 	allowlist = strdup(allowlist);
 	if (!allowlist) {
 		ret = -ENOMEM;
@@ -239,6 +302,7 @@ cnxk_gpio_parse_allowlist(struct cnxk_gpiochip *gpiochip, char *allowlist)
 	} while ((token = strtok(NULL, ",")));
 
 	free(allowlist);
+out_done:
 	gpiochip->allowlist = list;
 	gpiochip->num_queues = queue;
 
@@ -250,88 +314,6 @@ cnxk_gpio_parse_allowlist(struct cnxk_gpiochip *gpiochip, char *allowlist)
 	return ret;
 }
 
-static int
-cnxk_gpio_read_attr(char *attr, char *val)
-{
-	int ret, ret2;
-	FILE *fp;
-
-	fp = fopen(attr, "r");
-	if (!fp)
-		return -errno;
-
-	ret = fscanf(fp, "%s", val);
-	if (ret < 0) {
-		ret = -errno;
-		goto out;
-	}
-	if (ret != 1) {
-		ret = -EIO;
-		goto out;
-	}
-
-	ret = 0;
-out:
-	ret2 = fclose(fp);
-	if (!ret)
-		ret = ret2;
-
-	return ret;
-}
-
-static int
-cnxk_gpio_read_attr_int(char *attr, int *val)
-{
-	char buf[CNXK_GPIO_BUFSZ];
-	int ret;
-
-	ret = cnxk_gpio_read_attr(attr, buf);
-	if (ret)
-		return ret;
-
-	ret = sscanf(buf, "%d", val);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
-}
-
-static int
-cnxk_gpio_write_attr(const char *attr, const char *val)
-{
-	FILE *fp;
-	int ret;
-
-	if (!val)
-		return -EINVAL;
-
-	fp = fopen(attr, "w");
-	if (!fp)
-		return -errno;
-
-	ret = fprintf(fp, "%s", val);
-	if (ret < 0) {
-		fclose(fp);
-		return ret;
-	}
-
-	ret = fclose(fp);
-	if (ret)
-		return -errno;
-
-	return 0;
-}
-
-static int
-cnxk_gpio_write_attr_int(const char *attr, int val)
-{
-	char buf[CNXK_GPIO_BUFSZ];
-
-	snprintf(buf, sizeof(buf), "%d", val);
-
-	return cnxk_gpio_write_attr(attr, buf);
-}
-
 static bool
 cnxk_gpio_queue_valid(struct cnxk_gpiochip *gpiochip, uint16_t queue)
 {
@@ -353,14 +335,16 @@ cnxk_gpio_lookup(struct cnxk_gpiochip *gpiochip, uint16_t queue)
 }
 
 static bool
-cnxk_gpio_exists(int num)
+cnxk_gpio_available(struct cnxk_gpio *gpio)
 {
-	char buf[CNXK_GPIO_BUFSZ];
-	struct stat st;
+	struct gpio_v2_line_info info = { .offset = gpio->num };
+	int ret;
 
-	snprintf(buf, sizeof(buf), "%s/gpio%d", CNXK_GPIO_CLASS_PATH, num);
+	ret = cnxk_gpio_gpiochip_ioctl(gpio->gpiochip, GPIO_V2_GET_LINEINFO_IOCTL, &info);
+	if (ret)
+		return false;
 
-	return !stat(buf, &st);
+	return !(info.flags & GPIO_V2_LINE_FLAG_USED);
 }
 
 static int
@@ -368,9 +352,9 @@ cnxk_gpio_queue_setup(struct rte_rawdev *dev, uint16_t queue_id,
 		      rte_rawdev_obj_t queue_conf, size_t queue_conf_size)
 {
 	struct cnxk_gpiochip *gpiochip = dev->dev_private;
-	char buf[CNXK_GPIO_BUFSZ];
+	struct gpio_v2_line_request req = {0};
 	struct cnxk_gpio *gpio;
-	int num, ret;
+	int ret;
 
 	RTE_SET_USED(queue_conf);
 	RTE_SET_USED(queue_conf_size);
@@ -386,33 +370,87 @@ cnxk_gpio_queue_setup(struct rte_rawdev *dev, uint16_t queue_id,
 	if (!gpio)
 		return -ENOMEM;
 
-	num = cnxk_queue_to_gpio(gpiochip, queue_id);
-	gpio->num = num + gpiochip->base;
+	gpio->num = cnxk_queue_to_gpio(gpiochip, queue_id);
+	gpio->fd = CNXK_GPIO_INVALID_FD;
 	gpio->gpiochip = gpiochip;
 
-	if (!cnxk_gpio_exists(gpio->num)) {
-		snprintf(buf, sizeof(buf), "%s/export", CNXK_GPIO_CLASS_PATH);
-		ret = cnxk_gpio_write_attr_int(buf, gpio->num);
-		if (ret) {
-			rte_free(gpio);
-			return ret;
-		}
-	} else {
-		CNXK_GPIO_LOG(WARNING, "using existing gpio%d", gpio->num);
+	if (!cnxk_gpio_available(gpio)) {
+		rte_free(gpio);
+		return -EBUSY;
+	}
+
+	cnxk_gpio_format_name(req.consumer, sizeof(req.consumer));
+	req.offsets[req.num_lines] = gpio->num;
+	req.num_lines = 1;
+
+	ret = cnxk_gpio_gpiochip_ioctl(gpio->gpiochip, GPIO_V2_GET_LINE_IOCTL, &req);
+	if (ret) {
+		rte_free(gpio);
+		return ret;
+	}
+
+	gpio->fd = req.fd;
+	gpiochip->gpios[gpio->num] = gpio;
+
+	return 0;
+}
+
+static void
+cnxk_gpio_intr_handler(void *data)
+{
+	struct gpio_v2_line_event event;
+	struct cnxk_gpio *gpio = data;
+	int ret;
+
+	ret = read(gpio->fd, &event, sizeof(event));
+	if (ret != sizeof(event)) {
+		CNXK_GPIO_LOG(ERR, "failed to read gpio%d event data", gpio->num);
+		goto out;
+	}
+	if ((unsigned int)gpio->num != event.offset) {
+		CNXK_GPIO_LOG(ERR, "expected event from gpio%d, received from gpio%d",
+			      gpio->num, event.offset);
+		goto out;
 	}
 
-	gpiochip->gpios[num] = gpio;
+	if (gpio->intr.handler2)
+		(gpio->intr.handler2)(&event, gpio->intr.data);
+	else if (gpio->intr.handler)
+		(gpio->intr.handler)(gpio->num, gpio->intr.data);
+out:
+	rte_intr_ack(gpio->intr.intr_handle);
+}
+
+static int
+cnxk_gpio_unregister_irq(struct cnxk_gpio *gpio)
+{
+	int ret;
+
+	if (!gpio->intr.intr_handle)
+		return 0;
+
+	ret = rte_intr_disable(gpio->intr.intr_handle);
+	if (ret)
+		return ret;
+
+	ret = rte_intr_callback_unregister_sync(gpio->intr.intr_handle, cnxk_gpio_intr_handler,
+						(void *)-1);
+	if (ret)
+		return ret;
+
+	rte_intr_instance_free(gpio->intr.intr_handle);
+	gpio->intr.intr_handle = NULL;
 
 	return 0;
 }
 
+
 static int
 cnxk_gpio_queue_release(struct rte_rawdev *dev, uint16_t queue_id)
 {
 	struct cnxk_gpiochip *gpiochip = dev->dev_private;
-	char buf[CNXK_GPIO_BUFSZ];
 	struct cnxk_gpio *gpio;
-	int num, ret;
+	int num;
 
 	if (!cnxk_gpio_queue_valid(gpiochip, queue_id))
 		return -EINVAL;
@@ -421,10 +459,11 @@ cnxk_gpio_queue_release(struct rte_rawdev *dev, uint16_t queue_id)
 	if (!gpio)
 		return -ENODEV;
 
-	snprintf(buf, sizeof(buf), "%s/unexport", CNXK_GPIO_CLASS_PATH);
-	ret = cnxk_gpio_write_attr_int(buf, gpio->num);
-	if (ret)
-		return ret;
+	if (gpio->intr.intr_handle)
+		cnxk_gpio_unregister_irq(gpio);
+
+	if (gpio->fd != CNXK_GPIO_INVALID_FD)
+		close(gpio->fd);
 
 	num = cnxk_queue_to_gpio(gpiochip, queue_id);
 	gpiochip->gpios[num] = NULL;
@@ -462,134 +501,218 @@ cnxk_gpio_queue_count(struct rte_rawdev *dev)
 
 static const struct {
 	enum cnxk_gpio_pin_edge edge;
-	const char *name;
-} cnxk_gpio_edge_name[] = {
-	{ CNXK_GPIO_PIN_EDGE_NONE, "none" },
-	{ CNXK_GPIO_PIN_EDGE_FALLING, "falling" },
-	{ CNXK_GPIO_PIN_EDGE_RISING, "rising" },
-	{ CNXK_GPIO_PIN_EDGE_BOTH, "both" },
+	enum gpio_v2_line_flag flag;
+} cnxk_gpio_edge_flag[] = {
+	{ CNXK_GPIO_PIN_EDGE_NONE, 0 },
+	{ CNXK_GPIO_PIN_EDGE_FALLING, GPIO_V2_LINE_FLAG_EDGE_FALLING },
+	{ CNXK_GPIO_PIN_EDGE_RISING, GPIO_V2_LINE_FLAG_EDGE_RISING },
+	{ CNXK_GPIO_PIN_EDGE_BOTH, GPIO_V2_LINE_FLAG_EDGE_FALLING | GPIO_V2_LINE_FLAG_EDGE_RISING },
 };
 
-static const char *
-cnxk_gpio_edge_to_name(enum cnxk_gpio_pin_edge edge)
+static enum gpio_v2_line_flag
+cnxk_gpio_edge_to_flag(enum cnxk_gpio_pin_edge edge)
 {
 	unsigned int i;
 
-	for (i = 0; i < RTE_DIM(cnxk_gpio_edge_name); i++) {
-		if (cnxk_gpio_edge_name[i].edge == edge)
-			return cnxk_gpio_edge_name[i].name;
+	for (i = 0; i < RTE_DIM(cnxk_gpio_edge_flag); i++) {
+		if (cnxk_gpio_edge_flag[i].edge == edge)
+			break;
 	}
 
-	return NULL;
+	return cnxk_gpio_edge_flag[i].flag;
 }
 
 static enum cnxk_gpio_pin_edge
-cnxk_gpio_name_to_edge(const char *name)
+cnxk_gpio_flag_to_edge(enum gpio_v2_line_flag flag)
 {
 	unsigned int i;
 
-	for (i = 0; i < RTE_DIM(cnxk_gpio_edge_name); i++) {
-		if (!strcmp(cnxk_gpio_edge_name[i].name, name))
+	for (i = RTE_DIM(cnxk_gpio_edge_flag) - 1; i > 0; i--) {
+		if ((cnxk_gpio_edge_flag[i].flag & flag) == cnxk_gpio_edge_flag[i].flag)
 			break;
 	}
 
-	return cnxk_gpio_edge_name[i].edge;
+	return cnxk_gpio_edge_flag[i].edge;
 }
 
 static const struct {
 	enum cnxk_gpio_pin_dir dir;
-	const char *name;
-} cnxk_gpio_dir_name[] = {
-	{ CNXK_GPIO_PIN_DIR_IN, "in" },
-	{ CNXK_GPIO_PIN_DIR_OUT, "out" },
-	{ CNXK_GPIO_PIN_DIR_HIGH, "high" },
-	{ CNXK_GPIO_PIN_DIR_LOW, "low" },
+	enum gpio_v2_line_flag flag;
+} cnxk_gpio_dir_flag[] = {
+	{ CNXK_GPIO_PIN_DIR_IN, GPIO_V2_LINE_FLAG_INPUT },
+	{ CNXK_GPIO_PIN_DIR_OUT, GPIO_V2_LINE_FLAG_OUTPUT },
+	{ CNXK_GPIO_PIN_DIR_HIGH, GPIO_V2_LINE_FLAG_OUTPUT },
+	{ CNXK_GPIO_PIN_DIR_LOW, GPIO_V2_LINE_FLAG_OUTPUT },
 };
 
-static const char *
-cnxk_gpio_dir_to_name(enum cnxk_gpio_pin_dir dir)
+static enum gpio_v2_line_flag
+cnxk_gpio_dir_to_flag(enum cnxk_gpio_pin_dir dir)
 {
 	unsigned int i;
 
-	for (i = 0; i < RTE_DIM(cnxk_gpio_dir_name); i++) {
-		if (cnxk_gpio_dir_name[i].dir == dir)
-			return cnxk_gpio_dir_name[i].name;
+	for (i = 0; i < RTE_DIM(cnxk_gpio_dir_flag); i++) {
+		if (cnxk_gpio_dir_flag[i].dir == dir)
+			break;
 	}
 
-	return NULL;
+	return cnxk_gpio_dir_flag[i].flag;
 }
 
 static enum cnxk_gpio_pin_dir
-cnxk_gpio_name_to_dir(const char *name)
+cnxk_gpio_flag_to_dir(enum gpio_v2_line_flag flag)
 {
 	unsigned int i;
 
-	for (i = 0; i < RTE_DIM(cnxk_gpio_dir_name); i++) {
-		if (!strcmp(cnxk_gpio_dir_name[i].name, name))
+	for (i = 0; i < RTE_DIM(cnxk_gpio_dir_flag); i++) {
+		if ((cnxk_gpio_dir_flag[i].flag & flag) == cnxk_gpio_dir_flag[i].flag)
 			break;
 	}
 
-	return cnxk_gpio_dir_name[i].dir;
+	return cnxk_gpio_dir_flag[i].dir;
 }
 
 static int
-cnxk_gpio_register_irq(struct cnxk_gpio *gpio, struct cnxk_gpio_irq *irq)
+cnxk_gpio_register_irq_compat(struct cnxk_gpio *gpio, struct cnxk_gpio_irq *irq,
+			      struct cnxk_gpio_irq2 *irq2)
 {
+	struct rte_intr_handle *intr_handle;
 	int ret;
 
-	ret = cnxk_gpio_irq_request(gpio->num - gpio->gpiochip->base, irq->cpu);
-	if (ret)
-		return ret;
+	if (!irq && !irq2)
+		return -EINVAL;
+
+	if ((irq && !irq->handler) || (irq2 && !irq2->handler))
+		return -EINVAL;
+
+	if (gpio->intr.intr_handle)
+		return -EEXIST;
+
+	intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+	if (!intr_handle)
+		return -ENOMEM;
+
+	if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_VDEV)) {
+		ret = -rte_errno;
+		goto out;
+	}
+
+	if (rte_intr_fd_set(intr_handle, gpio->fd)) {
+		ret = -rte_errno;
+		goto out;
+	}
+
+	if (rte_intr_callback_register(intr_handle, cnxk_gpio_intr_handler, gpio)) {
+		ret = -rte_errno;
+		goto out;
+	}
 
-	gpio->handler = irq->handler;
-	gpio->data = irq->data;
-	gpio->cpu = irq->cpu;
+	if (irq) {
+		gpio->intr.data = irq->data;
+		gpio->intr.handler = irq->handler;
+	} else {
+		gpio->intr.data = irq2->data;
+		gpio->intr.handler2 = irq2->handler;
+	}
+
+	if (rte_intr_enable(intr_handle)) {
+		ret = -EINVAL;
+		rte_intr_callback_unregister(intr_handle, cnxk_gpio_intr_handler, gpio);
+		goto out;
+	}
+	gpio->intr.intr_handle = intr_handle;
 
 	return 0;
+out:
+	rte_intr_instance_free(intr_handle);
+
+	return ret;
 }
 
 static int
-cnxk_gpio_unregister_irq(struct cnxk_gpio *gpio)
+cnxk_gpio_register_irq(struct cnxk_gpio *gpio, struct cnxk_gpio_irq *irq)
+{
+	CNXK_GPIO_LOG(WARNING, "using deprecated interrupt registration api");
+
+	return cnxk_gpio_register_irq_compat(gpio, irq, NULL);
+}
+
+static int
+cnxk_gpio_register_irq2(struct cnxk_gpio *gpio, struct cnxk_gpio_irq2 *irq)
 {
-	return cnxk_gpio_irq_free(gpio->num - gpio->gpiochip->base);
+	return cnxk_gpio_register_irq_compat(gpio, NULL, irq);
 }
 
 static int
 cnxk_gpio_process_buf(struct cnxk_gpio *gpio, struct rte_rawdev_buf *rbuf)
 {
 	struct cnxk_gpio_msg *msg = rbuf->buf_addr;
+	struct gpio_v2_line_values values = {0};
+	struct gpio_v2_line_config config = {0};
+	struct gpio_v2_line_info info = {0};
 	enum cnxk_gpio_pin_edge edge;
 	enum cnxk_gpio_pin_dir dir;
-	char buf[CNXK_GPIO_BUFSZ];
 	void *rsp = NULL;
-	int ret, val, n;
+	int ret;
+
+	info.offset = gpio->num;
+	ret = cnxk_gpio_gpiochip_ioctl(gpio->gpiochip, GPIO_V2_GET_LINEINFO_IOCTL, &info);
+	if (ret)
+		return ret;
 
-	n = snprintf(buf, sizeof(buf), "%s/gpio%d", CNXK_GPIO_CLASS_PATH,
-		     gpio->num);
+	info.flags &= ~GPIO_V2_LINE_FLAG_USED;
 
 	switch (msg->type) {
 	case CNXK_GPIO_MSG_TYPE_SET_PIN_VALUE:
-		snprintf(buf + n, sizeof(buf) - n, "/value");
-		ret = cnxk_gpio_write_attr_int(buf, !!*(int *)msg->data);
+		values.bits = *(int *)msg->data ? RTE_BIT64(gpio->num) : 0;
+		values.mask = RTE_BIT64(gpio->num);
+
+		ret = cnxk_gpio_ioctl(gpio, GPIO_V2_LINE_SET_VALUES_IOCTL, &values);
 		break;
 	case CNXK_GPIO_MSG_TYPE_SET_PIN_EDGE:
-		snprintf(buf + n, sizeof(buf) - n, "/edge");
 		edge = *(enum cnxk_gpio_pin_edge *)msg->data;
-		ret = cnxk_gpio_write_attr(buf, cnxk_gpio_edge_to_name(edge));
+		info.flags &= ~(GPIO_V2_LINE_FLAG_EDGE_RISING | GPIO_V2_LINE_FLAG_EDGE_FALLING);
+		info.flags |= cnxk_gpio_edge_to_flag(edge);
+
+		config.attrs[config.num_attrs].attr.id = GPIO_V2_LINE_ATTR_ID_FLAGS;
+		config.attrs[config.num_attrs].attr.flags = info.flags;
+		config.attrs[config.num_attrs].mask = RTE_BIT64(gpio->num);
+		config.num_attrs++;
+
+		ret = cnxk_gpio_ioctl(gpio, GPIO_V2_LINE_SET_CONFIG_IOCTL, &config);
 		break;
 	case CNXK_GPIO_MSG_TYPE_SET_PIN_DIR:
-		snprintf(buf + n, sizeof(buf) - n, "/direction");
 		dir = *(enum cnxk_gpio_pin_dir *)msg->data;
-		ret = cnxk_gpio_write_attr(buf, cnxk_gpio_dir_to_name(dir));
+		config.attrs[config.num_attrs].attr.id = GPIO_V2_LINE_ATTR_ID_FLAGS;
+		config.attrs[config.num_attrs].attr.flags = cnxk_gpio_dir_to_flag(dir);
+		config.attrs[config.num_attrs].mask = RTE_BIT64(gpio->num);
+		config.num_attrs++;
+
+		if (dir == CNXK_GPIO_PIN_DIR_HIGH || dir == CNXK_GPIO_PIN_DIR_LOW) {
+			config.attrs[config.num_attrs].attr.id = GPIO_V2_LINE_ATTR_ID_OUTPUT_VALUES;
+			config.attrs[config.num_attrs].attr.values = dir == CNXK_GPIO_PIN_DIR_HIGH ?
+								     RTE_BIT64(gpio->num) : 0;
+			config.attrs[config.num_attrs].mask = RTE_BIT64(gpio->num);
+			config.num_attrs++;
+		}
+
+		ret = cnxk_gpio_ioctl(gpio, GPIO_V2_LINE_SET_CONFIG_IOCTL, &config);
 		break;
 	case CNXK_GPIO_MSG_TYPE_SET_PIN_ACTIVE_LOW:
-		snprintf(buf + n, sizeof(buf) - n, "/active_low");
-		val = *(int *)msg->data;
-		ret = cnxk_gpio_write_attr_int(buf, val);
+		if (*(int *)msg->data)
+			info.flags |= GPIO_V2_LINE_FLAG_ACTIVE_LOW;
+		else
+			info.flags &= ~GPIO_V2_LINE_FLAG_ACTIVE_LOW;
+
+		config.attrs[config.num_attrs].attr.id = GPIO_V2_LINE_ATTR_ID_FLAGS;
+		config.attrs[config.num_attrs].attr.flags = info.flags;
+		config.attrs[config.num_attrs].mask = RTE_BIT64(gpio->num);
+		config.num_attrs++;
+
+		ret = cnxk_gpio_ioctl(gpio, GPIO_V2_LINE_SET_CONFIG_IOCTL, &config);
 		break;
 	case CNXK_GPIO_MSG_TYPE_GET_PIN_VALUE:
-		snprintf(buf + n, sizeof(buf) - n, "/value");
-		ret = cnxk_gpio_read_attr_int(buf, &val);
+		values.mask = RTE_BIT64(gpio->num);
+		ret = cnxk_gpio_ioctl(gpio, GPIO_V2_LINE_GET_VALUES_IOCTL, &values);
 		if (ret)
 			break;
 
@@ -597,47 +720,35 @@ cnxk_gpio_process_buf(struct cnxk_gpio *gpio, struct rte_rawdev_buf *rbuf)
 		if (!rsp)
 			return -ENOMEM;
 
-		*(int *)rsp = val;
+		*(int *)rsp = !!(values.bits & RTE_BIT64(gpio->num));
 		break;
 	case CNXK_GPIO_MSG_TYPE_GET_PIN_EDGE:
-		snprintf(buf + n, sizeof(buf) - n, "/edge");
-		ret = cnxk_gpio_read_attr(buf, buf);
-		if (ret)
-			break;
-
 		rsp = rte_zmalloc(NULL, sizeof(enum cnxk_gpio_pin_edge), 0);
 		if (!rsp)
 			return -ENOMEM;
 
-		*(enum cnxk_gpio_pin_edge *)rsp = cnxk_gpio_name_to_edge(buf);
+		*(enum cnxk_gpio_pin_edge *)rsp = cnxk_gpio_flag_to_edge(info.flags);
 		break;
 	case CNXK_GPIO_MSG_TYPE_GET_PIN_DIR:
-		snprintf(buf + n, sizeof(buf) - n, "/direction");
-		ret = cnxk_gpio_read_attr(buf, buf);
-		if (ret)
-			break;
-
-		rsp = rte_zmalloc(NULL, sizeof(enum cnxk_gpio_pin_dir), 0);
+		rsp = rte_zmalloc(NULL, sizeof(enum cnxk_gpio_pin_dir), 0);
 		if (!rsp)
 			return -ENOMEM;
 
-		*(enum cnxk_gpio_pin_dir *)rsp = cnxk_gpio_name_to_dir(buf);
+		*(enum cnxk_gpio_pin_dir *)rsp = cnxk_gpio_flag_to_dir(info.flags);
 		break;
 	case CNXK_GPIO_MSG_TYPE_GET_PIN_ACTIVE_LOW:
-		snprintf(buf + n, sizeof(buf) - n, "/active_low");
-		ret = cnxk_gpio_read_attr_int(buf, &val);
-		if (ret)
-			break;
-
 		rsp = rte_zmalloc(NULL, sizeof(int), 0);
 		if (!rsp)
 			return -ENOMEM;
 
-		*(int *)rsp = val;
+		*(int *)rsp = !!(info.flags & GPIO_V2_LINE_FLAG_ACTIVE_LOW);
 		break;
 	case CNXK_GPIO_MSG_TYPE_REGISTER_IRQ:
 		ret = cnxk_gpio_register_irq(gpio, (struct cnxk_gpio_irq *)msg->data);
 		break;
+	case CNXK_GPIO_MSG_TYPE_REGISTER_IRQ2:
+		ret = cnxk_gpio_register_irq2(gpio, (struct cnxk_gpio_irq2 *)msg->data);
+		break;
 	case CNXK_GPIO_MSG_TYPE_UNREGISTER_IRQ:
 		ret = cnxk_gpio_unregister_irq(gpio);
 		break;
@@ -731,11 +842,11 @@ static const struct rte_rawdev_ops cnxk_gpio_rawdev_ops = {
 static int
 cnxk_gpio_probe(struct rte_vdev_device *dev)
 {
+	struct gpiochip_info gpiochip_info;
 	char name[RTE_RAWDEV_NAME_MAX_LEN];
 	struct cnxk_gpio_params *params;
 	struct cnxk_gpiochip *gpiochip;
 	struct rte_rawdev *rawdev;
-	char buf[CNXK_GPIO_BUFSZ];
 	int ret;
 
 	cnxk_gpio_format_name(name, sizeof(name));
@@ -762,25 +873,14 @@ cnxk_gpio_probe(struct rte_vdev_device *dev)
 
 	gpiochip->num = params->num;
 
-	ret = cnxk_gpio_irq_init(gpiochip);
-	if (ret)
-		goto out;
-
-	/* read gpio base */
-	snprintf(buf, sizeof(buf), "%s/gpiochip%d/base", CNXK_GPIO_CLASS_PATH, gpiochip->num);
-	ret = cnxk_gpio_read_attr_int(buf, &gpiochip->base);
-	if (ret) {
-		CNXK_GPIO_LOG(ERR, "failed to read %s", buf);
-		goto out;
-	}
-
 	/* read number of available gpios */
-	snprintf(buf, sizeof(buf), "%s/gpiochip%d/ngpio", CNXK_GPIO_CLASS_PATH, gpiochip->num);
-	ret = cnxk_gpio_read_attr_int(buf, &gpiochip->num_gpios);
+	ret = cnxk_gpio_gpiochip_ioctl(gpiochip, GPIO_GET_CHIPINFO_IOCTL, &gpiochip_info);
 	if (ret) {
-		CNXK_GPIO_LOG(ERR, "failed to read %s", buf);
+		CNXK_GPIO_LOG(ERR, "failed to read /dev/gpiochip%d info", gpiochip->num);
 		goto out;
 	}
+
+	gpiochip->num_gpios = gpiochip_info.lines;
 	gpiochip->num_queues = gpiochip->num_gpios;
 
 	ret = cnxk_gpio_parse_allowlist(gpiochip, params->allowlist);
@@ -827,15 +927,11 @@ cnxk_gpio_remove(struct rte_vdev_device *dev)
 		if (!gpio)
 			continue;
 
-		if (gpio->handler)
-			cnxk_gpio_unregister_irq(gpio);
-
 		cnxk_gpio_queue_release(rawdev, gpio->num);
 	}
 
 	rte_free(gpiochip->allowlist);
 	rte_free(gpiochip->gpios);
-	cnxk_gpio_irq_fini();
 	cnxk_gpio_params_release();
 	rte_rawdev_pmd_release(rawdev);
 
diff --git a/drivers/raw/cnxk_gpio/cnxk_gpio.h b/drivers/raw/cnxk_gpio/cnxk_gpio.h
index 94c8e36977..adc7f90936 100644
--- a/drivers/raw/cnxk_gpio/cnxk_gpio.h
+++ b/drivers/raw/cnxk_gpio/cnxk_gpio.h
@@ -11,20 +11,24 @@ extern int cnxk_logtype_gpio;
 #define CNXK_GPIO_LOG(level, ...) \
 	RTE_LOG_LINE(level, CNXK_GPIO, __VA_ARGS__)
 
+struct gpio_v2_line_event;
 struct cnxk_gpiochip;
 
 struct cnxk_gpio {
 	struct cnxk_gpiochip *gpiochip;
+	struct {
+		struct rte_intr_handle *intr_handle;
+		void (*handler)(int gpio, void *data);
+		void (*handler2)(struct gpio_v2_line_event *event, void *data);
+		void *data;
+	} intr;
 	void *rsp;
 	int num;
-	void (*handler)(int gpio, void *data);
-	void *data;
-	int cpu;
+	int fd;
 };
 
 struct cnxk_gpiochip {
 	int num;
-	int base;
 	int num_gpios;
 	int num_queues;
 	struct cnxk_gpio **gpios;
@@ -33,9 +37,4 @@ struct cnxk_gpiochip {
 
 int cnxk_gpio_selftest(uint16_t dev_id);
 
-int cnxk_gpio_irq_init(struct cnxk_gpiochip *gpiochip);
-void cnxk_gpio_irq_fini(void);
-int cnxk_gpio_irq_request(int gpio, int cpu);
-int cnxk_gpio_irq_free(int gpio);
-
 #endif /* _CNXK_GPIO_H_ */
diff --git a/drivers/raw/cnxk_gpio/cnxk_gpio_irq.c b/drivers/raw/cnxk_gpio/cnxk_gpio_irq.c
deleted file mode 100644
index 2fa8e69899..0000000000
--- a/drivers/raw/cnxk_gpio/cnxk_gpio_irq.c
+++ /dev/null
@@ -1,216 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(C) 2021 Marvell.
- */
-
-#include <fcntl.h>
-#include <pthread.h>
-#include <sys/ioctl.h>
-#include <sys/mman.h>
-#include <sys/queue.h>
-#include <unistd.h>
-
-#include <rte_rawdev_pmd.h>
-
-#include <roc_api.h>
-
-#include "cnxk_gpio.h"
-
-#define OTX_IOC_MAGIC 0xF2
-#define OTX_IOC_SET_GPIO_HANDLER                                               \
-	_IOW(OTX_IOC_MAGIC, 1, struct otx_gpio_usr_data)
-#define OTX_IOC_CLR_GPIO_HANDLER                                               \
-	_IO(OTX_IOC_MAGIC, 2)
-
-struct otx_gpio_usr_data {
-	uint64_t isr_base;
-	uint64_t sp;
-	uint64_t cpu;
-	uint64_t gpio_num;
-};
-
-struct cnxk_gpio_irq_stack {
-	LIST_ENTRY(cnxk_gpio_irq_stack) next;
-	void *sp_buffer;
-	int cpu;
-	int inuse;
-};
-
-struct cnxk_gpio_irqchip {
-	int fd;
-	/* serialize access to this struct */
-	pthread_mutex_t lock;
-	LIST_HEAD(, cnxk_gpio_irq_stack) stacks;
-
-	struct cnxk_gpiochip *gpiochip;
-};
-
-static struct cnxk_gpio_irqchip *irqchip;
-
-static void
-cnxk_gpio_irq_stack_free(int cpu)
-{
-	struct cnxk_gpio_irq_stack *stack;
-
-	LIST_FOREACH(stack, &irqchip->stacks, next) {
-		if (stack->cpu == cpu)
-			break;
-	}
-
-	if (!stack)
-		return;
-
-	if (stack->inuse)
-		stack->inuse--;
-
-	if (stack->inuse == 0) {
-		LIST_REMOVE(stack, next);
-		rte_free(stack->sp_buffer);
-		rte_free(stack);
-	}
-}
-
-static void *
-cnxk_gpio_irq_stack_alloc(int cpu)
-{
-#define ARM_STACK_ALIGNMENT (2 * sizeof(void *))
-#define IRQ_STACK_SIZE 0x200000
-
-	struct cnxk_gpio_irq_stack *stack;
-
-	LIST_FOREACH(stack, &irqchip->stacks, next) {
-		if (stack->cpu == cpu)
-			break;
-	}
-
-	if (stack) {
-		stack->inuse++;
-		return (char *)stack->sp_buffer + IRQ_STACK_SIZE;
-	}
-
-	stack = rte_malloc(NULL, sizeof(*stack), 0);
-	if (!stack)
-		return NULL;
-
-	stack->sp_buffer =
-		rte_zmalloc(NULL, IRQ_STACK_SIZE * 2, ARM_STACK_ALIGNMENT);
-	if (!stack->sp_buffer) {
-		rte_free(stack);
-		return NULL;
-	}
-
-	stack->cpu = cpu;
-	stack->inuse = 1;
-	LIST_INSERT_HEAD(&irqchip->stacks, stack, next);
-
-	return (char *)stack->sp_buffer + IRQ_STACK_SIZE;
-}
-
-static void
-cnxk_gpio_irq_handler(int gpio_num)
-{
-	struct cnxk_gpiochip *gpiochip = irqchip->gpiochip;
-	struct cnxk_gpio *gpio;
-
-	if (gpio_num >= gpiochip->num_gpios)
-		goto out;
-
-	gpio = gpiochip->gpios[gpio_num];
-	if (likely(gpio->handler))
-		gpio->handler(gpio_num, gpio->data);
-
-out:
-	roc_atf_ret();
-}
-
-int
-cnxk_gpio_irq_init(struct cnxk_gpiochip *gpiochip)
-{
-	if (irqchip)
-		return 0;
-
-	irqchip = rte_zmalloc(NULL, sizeof(*irqchip), 0);
-	if (!irqchip)
-		return -ENOMEM;
-
-	irqchip->fd = open("/dev/otx-gpio-ctr", O_RDWR | O_SYNC);
-	if (irqchip->fd < 0) {
-		rte_free(irqchip);
-		return -errno;
-	}
-
-	pthread_mutex_init(&irqchip->lock, NULL);
-	LIST_INIT(&irqchip->stacks);
-	irqchip->gpiochip = gpiochip;
-
-	return 0;
-}
-
-void
-cnxk_gpio_irq_fini(void)
-{
-	if (!irqchip)
-		return;
-
-	close(irqchip->fd);
-	rte_free(irqchip);
-	irqchip = NULL;
-}
-
-int
-cnxk_gpio_irq_request(int gpio, int cpu)
-{
-	struct otx_gpio_usr_data data;
-	void *sp;
-	int ret;
-
-	pthread_mutex_lock(&irqchip->lock);
-
-	sp = cnxk_gpio_irq_stack_alloc(cpu);
-	if (!sp) {
-		ret = -ENOMEM;
-		goto out_unlock;
-	}
-
-	data.isr_base = (uint64_t)cnxk_gpio_irq_handler;
-	data.sp = (uint64_t)sp;
-	data.cpu = (uint64_t)cpu;
-	data.gpio_num = (uint64_t)gpio;
-
-	mlockall(MCL_CURRENT | MCL_FUTURE);
-	ret = ioctl(irqchip->fd, OTX_IOC_SET_GPIO_HANDLER, &data);
-	if (ret) {
-		ret = -errno;
-		goto out_free_stack;
-	}
-
-	pthread_mutex_unlock(&irqchip->lock);
-
-	return 0;
-
-out_free_stack:
-	cnxk_gpio_irq_stack_free(cpu);
-out_unlock:
-	pthread_mutex_unlock(&irqchip->lock);
-
-	return ret;
-}
-
-int
-cnxk_gpio_irq_free(int gpio)
-{
-	int ret;
-
-	pthread_mutex_lock(&irqchip->lock);
-
-	ret = ioctl(irqchip->fd, OTX_IOC_CLR_GPIO_HANDLER, gpio);
-	if (ret) {
-		pthread_mutex_unlock(&irqchip->lock);
-		return -errno;
-	}
-
-	cnxk_gpio_irq_stack_free(irqchip->gpiochip->gpios[gpio]->cpu);
-
-	pthread_mutex_unlock(&irqchip->lock);
-
-	return 0;
-}
diff --git a/drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c b/drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c
index a0d9942f20..4e04811fe5 100644
--- a/drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c
+++ b/drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c
@@ -8,97 +8,27 @@
 #include <unistd.h>
 
 #include <rte_cycles.h>
+#include <rte_io.h>
 #include <rte_rawdev.h>
 #include <rte_rawdev_pmd.h>
-#include <rte_service.h>
 
 #include "cnxk_gpio.h"
 #include "rte_pmd_cnxk_gpio.h"
 
-#define CNXK_GPIO_BUFSZ 128
-
-#define OTX_IOC_MAGIC 0xF2
-#define OTX_IOC_TRIGGER_GPIO_HANDLER                                           \
-	_IO(OTX_IOC_MAGIC, 3)
-
-static int fd;
-
-static int
-cnxk_gpio_attr_exists(const char *attr)
-{
-	struct stat st;
-
-	return !stat(attr, &st);
-}
-
-static int
-cnxk_gpio_read_attr(char *attr, char *val)
-{
-	int ret, ret2;
-	FILE *fp;
-
-	fp = fopen(attr, "r");
-	if (!fp)
-		return -errno;
-
-	ret = fscanf(fp, "%s", val);
-	if (ret < 0) {
-		ret = -errno;
-		goto out;
-	}
-	if (ret != 1) {
-		ret = -EIO;
-		goto out;
-	}
-
-	ret = 0;
-out:
-	ret2 = fclose(fp);
-	if (!ret)
-		ret = ret2;
-
-	return ret;
-}
-
-#define CNXK_GPIO_ERR_STR(err, str, ...) do {                                  \
-	if (err) {                                                             \
-		CNXK_GPIO_LOG(ERR, "%s:%d: " str " (%d)", __func__, __LINE__, \
-			##__VA_ARGS__, err);                                   \
-		goto out;                                                      \
-	}                                                                      \
+#define CNXK_GPIO_ERR_STR(err, str, ...) do {                                                      \
+	if (err) {                                                                                 \
+		CNXK_GPIO_LOG(ERR, "%s:%d: " str " (%d)", __func__, __LINE__, ##__VA_ARGS__, err); \
+		goto out;                                                                          \
+	}                                                                                          \
 } while (0)
 
 static int
-cnxk_gpio_validate_attr(char *attr, const char *expected)
+cnxk_gpio_test_input(uint16_t dev_id, int gpio)
 {
-	char buf[CNXK_GPIO_BUFSZ];
 	int ret;
 
-	ret = cnxk_gpio_read_attr(attr, buf);
-	if (ret)
-		return ret;
-
-	if (strncmp(buf, expected, sizeof(buf)))
-		return -EIO;
-
-	return 0;
-}
-
-#define CNXK_GPIO_PATH_FMT "/sys/class/gpio/gpio%d"
-
-static int
-cnxk_gpio_test_input(uint16_t dev_id, int base, int gpio)
-{
-	char buf[CNXK_GPIO_BUFSZ];
-	int ret, n;
-
-	n = snprintf(buf, sizeof(buf), CNXK_GPIO_PATH_FMT, base + gpio);
-	snprintf(buf + n, sizeof(buf) - n, "/direction");
-
 	ret = rte_pmd_gpio_set_pin_dir(dev_id, gpio, CNXK_GPIO_PIN_DIR_IN);
 	CNXK_GPIO_ERR_STR(ret, "failed to set dir to input");
-	ret = cnxk_gpio_validate_attr(buf, "in");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	ret = rte_pmd_gpio_set_pin_value(dev_id, gpio, 1) |
 	      rte_pmd_gpio_set_pin_value(dev_id, gpio, 0);
@@ -107,29 +37,17 @@ cnxk_gpio_test_input(uint16_t dev_id, int base, int gpio)
 		CNXK_GPIO_ERR_STR(ret, "input pin overwritten");
 	}
 
-	snprintf(buf + n, sizeof(buf) - n, "/edge");
-
-	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio,
-					CNXK_GPIO_PIN_EDGE_FALLING);
+	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_FALLING);
 	CNXK_GPIO_ERR_STR(ret, "failed to set edge to falling");
-	ret = cnxk_gpio_validate_attr(buf, "falling");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
-	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio,
-					CNXK_GPIO_PIN_EDGE_RISING);
+	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_RISING);
 	CNXK_GPIO_ERR_STR(ret, "failed to change edge to rising");
-	ret = cnxk_gpio_validate_attr(buf, "rising");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_BOTH);
 	CNXK_GPIO_ERR_STR(ret, "failed to change edge to both");
-	ret = cnxk_gpio_validate_attr(buf, "both");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_NONE);
 	CNXK_GPIO_ERR_STR(ret, "failed to set edge to none");
-	ret = cnxk_gpio_validate_attr(buf, "none");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	/*
 	 * calling this makes sure kernel driver switches off inverted
@@ -141,44 +59,36 @@ cnxk_gpio_test_input(uint16_t dev_id, int base, int gpio)
 	return ret;
 }
 
-static int
-cnxk_gpio_trigger_irq(int gpio)
-{
-	int ret;
-
-	ret = ioctl(fd, OTX_IOC_TRIGGER_GPIO_HANDLER, gpio);
-
-	return ret == -1 ? -errno : 0;
-}
+static uint32_t triggered;
 
 static void
-cnxk_gpio_irq_handler(int gpio, void *data)
+cnxk_gpio_irq_handler(struct gpio_v2_line_event *event, void *data)
 {
-	*(int *)data = gpio;
+	int gpio = (int)(size_t)data;
+
+	if ((int)event->offset != gpio)
+		CNXK_GPIO_LOG(ERR, "event from gpio%d instead of gpio%d", event->offset, gpio);
+
+	rte_write32(1, &triggered);
 }
 
 static int
 cnxk_gpio_test_irq(uint16_t dev_id, int gpio)
 {
-	int irq_data, ret;
+	int ret;
 
 	ret = rte_pmd_gpio_set_pin_dir(dev_id, gpio, CNXK_GPIO_PIN_DIR_IN);
 	CNXK_GPIO_ERR_STR(ret, "failed to set dir to input");
 
-	irq_data = 0;
-	ret = rte_pmd_gpio_register_irq(dev_id, gpio, rte_lcore_id(),
-					cnxk_gpio_irq_handler, &irq_data);
+	ret = rte_pmd_gpio_register_irq2(dev_id, gpio, cnxk_gpio_irq_handler, (void *)(size_t)gpio);
 	CNXK_GPIO_ERR_STR(ret, "failed to register irq handler");
 
-	ret = rte_pmd_gpio_enable_interrupt(dev_id, gpio,
-					    CNXK_GPIO_PIN_EDGE_RISING);
+	ret = rte_pmd_gpio_enable_interrupt(dev_id, gpio, CNXK_GPIO_PIN_EDGE_RISING);
 	CNXK_GPIO_ERR_STR(ret, "failed to enable interrupt");
 
-	ret = cnxk_gpio_trigger_irq(gpio);
-	CNXK_GPIO_ERR_STR(ret, "failed to trigger irq");
-	rte_delay_ms(1);
-	ret = *(volatile int *)&irq_data == gpio ? 0 : -EIO;
-	CNXK_GPIO_ERR_STR(ret, "failed to test irq");
+	rte_delay_ms(2);
+	ret = rte_read32(&triggered) ? 0 : -EIO;
+	CNXK_GPIO_ERR_STR(ret, "failed to trigger irq");
 
 	ret = rte_pmd_gpio_disable_interrupt(dev_id, gpio);
 	CNXK_GPIO_ERR_STR(ret, "failed to disable interrupt");
@@ -193,24 +103,15 @@ cnxk_gpio_test_irq(uint16_t dev_id, int gpio)
 }
 
 static int
-cnxk_gpio_test_output(uint16_t dev_id, int base, int gpio)
+cnxk_gpio_test_output(uint16_t dev_id, int gpio)
 {
-	char buf[CNXK_GPIO_BUFSZ];
-	int ret, val, n;
+	int ret, val;
 
-	n = snprintf(buf, sizeof(buf), CNXK_GPIO_PATH_FMT, base + gpio);
-
-	snprintf(buf + n, sizeof(buf) - n, "/direction");
 	ret = rte_pmd_gpio_set_pin_dir(dev_id, gpio, CNXK_GPIO_PIN_DIR_OUT);
 	CNXK_GPIO_ERR_STR(ret, "failed to set dir to out");
-	ret = cnxk_gpio_validate_attr(buf, "out");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
-	snprintf(buf + n, sizeof(buf) - n, "/value");
 	ret = rte_pmd_gpio_set_pin_value(dev_id, gpio, 0);
 	CNXK_GPIO_ERR_STR(ret, "failed to set value to 0");
-	ret = cnxk_gpio_validate_attr(buf, "0");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 	ret = rte_pmd_gpio_get_pin_value(dev_id, gpio, &val);
 	CNXK_GPIO_ERR_STR(ret, "failed to read value");
 	if (val)
@@ -219,64 +120,41 @@ cnxk_gpio_test_output(uint16_t dev_id, int base, int gpio)
 
 	ret = rte_pmd_gpio_set_pin_value(dev_id, gpio, 1);
 	CNXK_GPIO_ERR_STR(ret, "failed to set value to 1");
-	ret = cnxk_gpio_validate_attr(buf, "1");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 	ret = rte_pmd_gpio_get_pin_value(dev_id, gpio, &val);
 	CNXK_GPIO_ERR_STR(ret, "failed to read value");
 	if (val != 1)
 		ret = -EIO;
 	CNXK_GPIO_ERR_STR(ret, "read %d instead of 1", val);
 
-	snprintf(buf + n, sizeof(buf) - n, "/direction");
 	ret = rte_pmd_gpio_set_pin_dir(dev_id, gpio, CNXK_GPIO_PIN_DIR_LOW);
 	CNXK_GPIO_ERR_STR(ret, "failed to set dir to low");
-	ret = cnxk_gpio_validate_attr(buf, "out");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
-	snprintf(buf + n, sizeof(buf) - n, "/value");
-	ret = cnxk_gpio_validate_attr(buf, "0");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
-	snprintf(buf + n, sizeof(buf) - n, "/direction");
 	ret = rte_pmd_gpio_set_pin_dir(dev_id, gpio, CNXK_GPIO_PIN_DIR_HIGH);
 	CNXK_GPIO_ERR_STR(ret, "failed to set dir to high");
-	ret = cnxk_gpio_validate_attr(buf, "out");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
-	snprintf(buf + n, sizeof(buf) - n, "/value");
-	ret = cnxk_gpio_validate_attr(buf, "1");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
-
-	snprintf(buf + n, sizeof(buf) - n, "/edge");
-	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio,
-					CNXK_GPIO_PIN_EDGE_FALLING);
+	ret = rte_pmd_gpio_get_pin_value(dev_id, gpio, &val);
+	CNXK_GPIO_ERR_STR(ret, "failed to read value");
+	if (val != 1)
+		ret = -EIO;
+	CNXK_GPIO_ERR_STR(ret, "read %d instead of 1", val);
+
+	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_FALLING);
 	ret = ret == 0 ? -EIO : 0;
 	CNXK_GPIO_ERR_STR(ret, "changed edge to falling");
-	ret = cnxk_gpio_validate_attr(buf, "none");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
-	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio,
-					CNXK_GPIO_PIN_EDGE_RISING);
+	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_RISING);
 	ret = ret == 0 ? -EIO : 0;
 	CNXK_GPIO_ERR_STR(ret, "changed edge to rising");
-	ret = cnxk_gpio_validate_attr(buf, "none");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_BOTH);
 	ret = ret == 0 ? -EIO : 0;
 	CNXK_GPIO_ERR_STR(ret, "changed edge to both");
-	ret = cnxk_gpio_validate_attr(buf, "none");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	/* this one should succeed */
 	ret = rte_pmd_gpio_set_pin_edge(dev_id, gpio, CNXK_GPIO_PIN_EDGE_NONE);
 	CNXK_GPIO_ERR_STR(ret, "failed to change edge to none");
-	ret = cnxk_gpio_validate_attr(buf, "none");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
-	snprintf(buf + n, sizeof(buf) - n, "/active_low");
 	ret = rte_pmd_gpio_set_pin_active_low(dev_id, gpio, 1);
 	CNXK_GPIO_ERR_STR(ret, "failed to set active_low to 1");
-	ret = cnxk_gpio_validate_attr(buf, "1");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
 
 	ret = rte_pmd_gpio_get_pin_active_low(dev_id, gpio, &val);
 	CNXK_GPIO_ERR_STR(ret, "failed to read active_low");
@@ -284,23 +162,24 @@ cnxk_gpio_test_output(uint16_t dev_id, int base, int gpio)
 		ret = -EIO;
 	CNXK_GPIO_ERR_STR(ret, "read %d instead of 1", val);
 
-	snprintf(buf + n, sizeof(buf) - n, "/value");
 	ret = rte_pmd_gpio_set_pin_value(dev_id, gpio, 1);
 	CNXK_GPIO_ERR_STR(ret, "failed to set value to 1");
-	ret = cnxk_gpio_validate_attr(buf, "1");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
+	ret = rte_pmd_gpio_get_pin_value(dev_id, gpio, &val);
+	CNXK_GPIO_ERR_STR(ret, "failed to read value");
+	if (val != 1)
+		ret = -EIO;
+	CNXK_GPIO_ERR_STR(ret, "read %d instead of 1", val);
 
 	ret = rte_pmd_gpio_set_pin_value(dev_id, gpio, 0);
 	CNXK_GPIO_ERR_STR(ret, "failed to set value to 0");
-	ret = cnxk_gpio_validate_attr(buf, "0");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
+	ret = rte_pmd_gpio_get_pin_value(dev_id, gpio, &val);
+	CNXK_GPIO_ERR_STR(ret, "failed to read value");
+	if (val != 0)
+		ret = -EIO;
+	CNXK_GPIO_ERR_STR(ret, "read %d instead of 0", val);
 
-	snprintf(buf + n, sizeof(buf) - n, "/active_low");
 	ret = rte_pmd_gpio_set_pin_active_low(dev_id, gpio, 0);
 	CNXK_GPIO_ERR_STR(ret, "failed to set active_low to 0");
-	ret = cnxk_gpio_validate_attr(buf, "0");
-	CNXK_GPIO_ERR_STR(ret, "failed to validate %s", buf);
-
 out:
 	return ret;
 }
@@ -309,17 +188,13 @@ int
 cnxk_gpio_selftest(uint16_t dev_id)
 {
 	struct cnxk_gpio_queue_conf conf;
-	struct cnxk_gpiochip *gpiochip;
-	char buf[CNXK_GPIO_BUFSZ];
 	struct rte_rawdev *rawdev;
 	unsigned int queues, i;
-	struct cnxk_gpio *gpio;
 	int ret, ret2;
 
 	rawdev = rte_rawdev_pmd_get_named_dev("cnxk_gpio");
 	if (!rawdev)
 		return -ENODEV;
-	gpiochip = rawdev->dev_private;
 
 	queues = rte_rawdev_queue_count(dev_id);
 	if (queues == 0)
@@ -329,10 +204,6 @@ cnxk_gpio_selftest(uint16_t dev_id)
 	if (ret)
 		return ret;
 
-	fd = open("/dev/otx-gpio-ctr", O_RDWR | O_SYNC);
-	if (fd < 0)
-		return -errno;
-
 	for (i = 0; i < queues; i++) {
 		ret = rte_rawdev_queue_conf_get(dev_id, i, &conf, sizeof(conf));
 		if (ret) {
@@ -355,15 +226,7 @@ cnxk_gpio_selftest(uint16_t dev_id)
 			goto out;
 		}
 
-		gpio = gpiochip->gpios[conf.gpio];
-		snprintf(buf, sizeof(buf), CNXK_GPIO_PATH_FMT, gpio->num);
-		if (!cnxk_gpio_attr_exists(buf)) {
-			CNXK_GPIO_LOG(ERR, "%s does not exist", buf);
-			ret = -ENOENT;
-			goto release;
-		}
-
-		ret = cnxk_gpio_test_input(dev_id, gpiochip->base, conf.gpio);
+		ret = cnxk_gpio_test_input(dev_id, conf.gpio);
 		if (ret)
 			goto release;
 
@@ -371,7 +234,7 @@ cnxk_gpio_selftest(uint16_t dev_id)
 		if (ret)
 			goto release;
 
-		ret = cnxk_gpio_test_output(dev_id, gpiochip->base, conf.gpio);
+		ret = cnxk_gpio_test_output(dev_id, conf.gpio);
 release:
 		ret2 = ret;
 		ret = rte_rawdev_queue_release(dev_id, i);
@@ -381,12 +244,6 @@ cnxk_gpio_selftest(uint16_t dev_id)
 			break;
 		}
 
-		if (cnxk_gpio_attr_exists(buf)) {
-			CNXK_GPIO_LOG(ERR, "%s still exists", buf);
-			ret = -EIO;
-			break;
-		}
-
 		if (ret2) {
 			ret = ret2;
 			break;
@@ -394,7 +251,6 @@ cnxk_gpio_selftest(uint16_t dev_id)
 	}
 
 out:
-	close(fd);
 	rte_rawdev_stop(dev_id);
 
 	return ret;
diff --git a/drivers/raw/cnxk_gpio/meson.build b/drivers/raw/cnxk_gpio/meson.build
index 9d9a527392..372f3d9f46 100644
--- a/drivers/raw/cnxk_gpio/meson.build
+++ b/drivers/raw/cnxk_gpio/meson.build
@@ -5,7 +5,6 @@
 deps += ['bus_vdev', 'common_cnxk', 'rawdev', 'kvargs']
 sources = files(
         'cnxk_gpio.c',
-        'cnxk_gpio_irq.c',
         'cnxk_gpio_selftest.c',
 )
 headers = files('rte_pmd_cnxk_gpio.h')
diff --git a/drivers/raw/cnxk_gpio/rte_pmd_cnxk_gpio.h b/drivers/raw/cnxk_gpio/rte_pmd_cnxk_gpio.h
index 80a37be9c7..d0b165639c 100644
--- a/drivers/raw/cnxk_gpio/rte_pmd_cnxk_gpio.h
+++ b/drivers/raw/cnxk_gpio/rte_pmd_cnxk_gpio.h
@@ -5,6 +5,8 @@
 #ifndef _RTE_PMD_CNXK_GPIO_H_
 #define _RTE_PMD_CNXK_GPIO_H_
 
+#include <linux/gpio.h>
+
 #include <rte_malloc.h>
 #include <rte_memcpy.h>
 #include <rte_rawdev.h>
@@ -48,8 +50,10 @@ enum cnxk_gpio_msg_type {
 	CNXK_GPIO_MSG_TYPE_GET_PIN_DIR,
 	/** Type used to read inverted logic state */
 	CNXK_GPIO_MSG_TYPE_GET_PIN_ACTIVE_LOW,
-	/** Type used to register interrupt handler */
+	/** Type used to register interrupt handler (deprecated) */
 	CNXK_GPIO_MSG_TYPE_REGISTER_IRQ,
+	/** Type used to register interrupt handler */
+	CNXK_GPIO_MSG_TYPE_REGISTER_IRQ2,
 	/** Type used to remove interrupt handler */
 	CNXK_GPIO_MSG_TYPE_UNREGISTER_IRQ,
 };
@@ -79,7 +83,7 @@ enum cnxk_gpio_pin_dir {
 };
 
 /**
- * GPIO interrupt handler
+ * GPIO interrupt handler (deprecated)
  *
  * @param gpio
  *   Zero-based GPIO number
@@ -97,6 +101,23 @@ struct cnxk_gpio_irq {
 	int cpu;
 };
 
+/**
+ * GPIO interrupt handler
+ *
+ * @param event
+ *   Pointer to gpio event data
+ * @param data
+ *   Cookie passed to interrupt handler
+ */
+typedef void (*cnxk_gpio_irq_handler2_t)(struct gpio_v2_line_event *event, void *data);
+
+struct cnxk_gpio_irq2 {
+	/** Interrupt handler */
+	cnxk_gpio_irq_handler2_t handler;
+	/** User data passed to irq handler */
+	void *data;
+};
+
 struct cnxk_gpio_msg {
 	/** Message type */
 	enum cnxk_gpio_msg_type type;
@@ -338,7 +359,7 @@ rte_pmd_gpio_get_pin_active_low(uint16_t dev_id, int gpio, int *val)
 }
 
 /**
- * Attach interrupt handler to GPIO
+ * Attach interrupt handler to GPIO (deprecated)
  *
  * @param dev_id
  *   The identifier of the device
@@ -371,6 +392,36 @@ rte_pmd_gpio_register_irq(uint16_t dev_id, int gpio, int cpu,
 	return __rte_pmd_gpio_enq_deq(dev_id, gpio, &msg, NULL, 0);
 }
 
+/**
+ * Attach interrupt handler to GPIO
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param gpio
+ *   Zero-based GPIO number
+ * @param handler
+ *   Interrupt handler to be executed
+ * @param data
+ *   Data to be passed to interrupt handler
+ *
+ * @return
+ *   Returns 0 on success, negative error code otherwise
+ */
+static __rte_always_inline int
+rte_pmd_gpio_register_irq2(uint16_t dev_id, int gpio, cnxk_gpio_irq_handler2_t handler, void *data)
+{
+	struct cnxk_gpio_irq2 irq = {
+		.handler = handler,
+		.data = data,
+	};
+	struct cnxk_gpio_msg msg = {
+		.type = CNXK_GPIO_MSG_TYPE_REGISTER_IRQ2,
+		.data = &irq,
+	};
+
+	return __rte_pmd_gpio_enq_deq(dev_id, gpio, &msg, NULL, 0);
+}
+
 /**
  * Detach interrupt handler from GPIO
  *
-- 
2.34.1


^ permalink raw reply	[relevance 1%]

* Community CI Meeting Minutes - January 9, 2025
@ 2025-03-13 22:54  4% Patrick Robb
  0 siblings, 0 replies; 200+ results
From: Patrick Robb @ 2025-03-13 22:54 UTC (permalink / raw)
  To: dev; +Cc: ci

[-- Attachment #1: Type: text/plain, Size: 3351 bytes --]

#####################################################################
January 9, 2025
Attendees
. Patrick Robb
. Aaron Conole
. Luca Vizzarro
. Dean Marx
. Cody Cheng
. Nicholas Pratte
. Matthew McGovern
. Manit Mahajan

#####################################################################
Minutes

=====================================================================
General Announcements
* Happy New Year!
* 25.03 release schedule is:
   * (-rc1): 7 February 2025
   * (-rc2): 28 February 2025
   * (-rc3): 7 March 2025
   * Release: 26 March 2025

=====================================================================
CI Status

---------------------------------------------------------------------
UNH-IOL Community Lab
* OvS install now requires python 3.7, which opensuse 15.6 does not ship
with. So, options are to either upgrade python on this test container, or
discontinue the test for this distro.
* Lab Dashboard updates:
   * We are adding a patch series search feature for searching by series
title or series # id. This will deploy before end of week.
   * Working on embedding the code coverage reports in the dashboard such
that the reports don’t have to be manually downloaded and opened by users
      * We should also begin running the coverage reports on a per series
basis
         * Can produce json output from gcov/lcov, and compare the
“baseline” json coverage to a new result
* ABI testing: We have been running ABI tests for 24.11 and main for the
past few weeks, but still need to backfill ABI results for patches
submitted immediately after 24.11 released. We can put these into queue
today.
* The lab will have a little downtime next week for infra maintenance,
which will be announced the day before on the mailing list.
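The gcov/lcov JSON comparison mentioned under the dashboard updates could be
sketched along these lines (the flat summary format and the file names are
invented for illustration; real gcov/lcov JSON output is richer):

```python
def coverage_regressions(baseline, current):
    """Return files whose coverage dropped between two runs.

    Both arguments map file path -> percent covered; this flat
    format is an assumption, not the actual lcov JSON schema.
    """
    drops = {}
    for path, pct in current.items():
        base = baseline.get(path)
        if base is not None and pct < base:
            drops[path] = (base, pct)
    return drops

# Hypothetical "baseline" vs per-series coverage summaries.
baseline = {"lib/eal/eal.c": 81.2, "lib/acl/acl.c": 64.0}
current = {"lib/eal/eal.c": 79.5, "lib/acl/acl.c": 66.3}
print(coverage_regressions(baseline, current))
```

A per-series report would only need to flag the non-empty result.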

---------------------------------------------------------------------
Intel Lab
* None

---------------------------------------------------------------------
Github Actions
* None

---------------------------------------------------------------------
Loongarch Lab
* None

=====================================================================
DTS Improvements & Test Development
* Series Patrick should review:
   * Dean: queue start stop
   * Nick: ethertype test suite
   * Softnic testsuite
      * V3 is submitted
* Perf testing:
   * UNH team has been investigating performance traffic generators
      * Trex
         * Have just been testing the trex api for setting traffic streams,
collecting stats etc.
      * Dperf
      * Need to decide on collecting stats via trex API vs testpmd when
writing the single core forwarding perf test
   * Luca proposes an abstraction for creating files on the DTS nodes, such
that we don’t have to manually create and copy files required for DTS
testsuites.

=====================================================================
Any other business
* Next Meeting January 23
* Azure:
   * Currently using a “Lisa” test framework for managing nodes and running
some tests
   * Aiming to leverage DTS in the future
   * Can we get 2 nodes on the same l2 networking?
      * Not really on Azure, so we would need to provide support for tests
through l3 routing.

[-- Attachment #2: Type: text/html, Size: 3569 bytes --]

^ permalink raw reply	[relevance 4%]

* Re: [RFC v3 5/8] build: generate symbol maps
  2025-03-11  9:56 18%   ` [RFC v3 5/8] build: generate symbol maps David Marchand
@ 2025-03-13 17:26  0%     ` Bruce Richardson
  2025-03-14 15:38  0%       ` David Marchand
  2025-03-14 15:27  0%     ` Andre Muezerie
  1 sibling, 1 reply; 200+ results
From: Bruce Richardson @ 2025-03-13 17:26 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, thomas, andremue

On Tue, Mar 11, 2025 at 10:56:03AM +0100, David Marchand wrote:
> Rather than maintain a file in parallel of the code, symbols to be
> exported can be marked with a token RTE_EXPORT_*SYMBOL.
> 
> From those marks, the build framework generates map files only for
> symbols actually compiled (which means that the WINDOWS_NO_EXPORT hack
> becomes unnecessary).
> 
> The build framework directly creates a map file in the format that the
> linker expects (rather than converting from GNU linker to MSVC linker).
> 
> Empty maps are allowed again as a replacement for drivers/version.map.
> 
> The symbol check is updated to only support the new format.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>

Some comments inline below.
/Bruce

> ---
> Changes since RFC v2:
> - because of MSVC limitations wrt macro passed via cmdline,
>   used an internal header for defining RTE_EXPORT_* macros,
> - updated documentation and tooling,
> 
> ---
>  MAINTAINERS                                |   2 +
>  buildtools/gen-version-map.py              | 111 ++++++++++
>  buildtools/map-list-symbol.sh              |  10 +-
>  buildtools/meson.build                     |   1 +
>  config/meson.build                         |   2 +
>  config/rte_export.h                        |  16 ++
>  devtools/check-symbol-change.py            |  90 +++++++++
>  devtools/check-symbol-maps.sh              |  14 --
>  devtools/checkpatches.sh                   |   2 +-
>  doc/guides/contributing/abi_versioning.rst | 224 ++-------------------
>  drivers/meson.build                        |  94 +++++----
>  drivers/version.map                        |   3 -
>  lib/meson.build                            |  91 ++++++---
>  13 files changed, 371 insertions(+), 289 deletions(-)
>  create mode 100755 buildtools/gen-version-map.py
>  create mode 100644 config/rte_export.h
>  create mode 100755 devtools/check-symbol-change.py
>  delete mode 100644 drivers/version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 312e6fcee5..04772951d3 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
>  F: devtools/check-forbidden-tokens.awk
>  F: devtools/check-git-log.sh
>  F: devtools/check-spdx-tag.sh
> +F: devtools/check-symbol-change.py
>  F: devtools/check-symbol-change.sh
>  F: devtools/check-symbol-maps.sh
>  F: devtools/checkpatches.sh
> @@ -127,6 +128,7 @@ F: config/
>  F: buildtools/check-symbols.sh
>  F: buildtools/chkincs/
>  F: buildtools/call-sphinx-build.py
> +F: buildtools/gen-version-map.py
>  F: buildtools/get-cpu-count.py
>  F: buildtools/get-numa-count.py
>  F: buildtools/list-dir-globs.py
> diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
> new file mode 100755
> index 0000000000..b160aa828b
> --- /dev/null
> +++ b/buildtools/gen-version-map.py
> @@ -0,0 +1,111 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2024 Red Hat, Inc.
> +
> +"""Generate a version map file used by GNU or MSVC linker."""
> +

While it's an internal build script not to be run by users directly, I
believe a short one-line usage here might be useful, since the code below
is directly referencing sys.argv[N] values. That makes it easier for the
user to know what they are.

Alternatively, assign them to proper names at the top of the script e.g.:
	scriptname, link_mode, abi_version_file, output, *input = sys.argv

Final alternative (which may be a bit overkill) is to use argparse.
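
A minimal sketch of that argparse variant (the positional names and the
choices list are assumptions, not what the patch implements):

```python
import argparse

def parse_args(argv):
    # Hypothetical argparse front-end for gen-version-map.py;
    # the posted script reads sys.argv positionally instead.
    parser = argparse.ArgumentParser(
        description="Generate a version map file used by GNU or MSVC linker.")
    parser.add_argument("link_mode", choices=["msvc", "mingw", "gnu"],
                        help="linker flavour to emit a map for (names assumed)")
    parser.add_argument("abi_version_file", help="path to the ABI_VERSION file")
    parser.add_argument("output", help="map file to write")
    parser.add_argument("sources", nargs="+",
                        help="sources to scan for RTE_EXPORT_* marks")
    return parser.parse_args(argv)

args = parse_args(["gnu", "ABI_VERSION", "out.map", "a.c", "b.c"])
```

Either way, giving the positionals real names makes the later
sys.argv[1..3] references self-documenting.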

> +import re
> +import sys
> +
> +# From rte_export.h
> +export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+.[0-9]+)\)")
> +export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> +export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
> +# From rte_function_versioning.h
> +ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> +default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +
> +with open(sys.argv[2]) as f:
> +    abi = 'DPDK_{}'.format(re.match("([0-9]+).[0-9]", f.readline()).group(1))
> +
> +symbols = {}
> +
> +for file in sys.argv[4:]:
> +    with open(file, encoding="utf-8") as f:
> +        for ln in f.readlines():
> +            node = None
> +            symbol = None
> +            comment = None
> +            if export_exp_sym_regexp.match(ln):
> +                node = 'EXPERIMENTAL'
> +                symbol = export_exp_sym_regexp.match(ln).group(1)
> +                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
> +            elif export_int_sym_regexp.match(ln):
> +                node = 'INTERNAL'
> +                symbol = export_int_sym_regexp.match(ln).group(1)
> +            elif export_sym_regexp.match(ln):
> +                node = abi
> +                symbol = export_sym_regexp.match(ln).group(1)
> +            elif ver_sym_regexp.match(ln):
> +                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
> +                symbol = ver_sym_regexp.match(ln).group(2)
> +            elif ver_exp_sym_regexp.match(ln):
> +                node = 'EXPERIMENTAL'
> +                symbol = ver_exp_sym_regexp.match(ln).group(1)
> +            elif default_sym_regexp.match(ln):
> +                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
> +                symbol = default_sym_regexp.match(ln).group(2)
> +
> +            if not symbol:
> +                continue
> +
> +            if node not in symbols:
> +                symbols[node] = {}
> +            symbols[node][symbol] = comment
> +
> +if sys.argv[1] == 'msvc':
> +    with open(sys.argv[3], "w") as outfile:
> +        outfile.writelines(f"EXPORTS\n")
> +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> +            if key not in symbols:
> +                continue
> +            for symbol in sorted(symbols[key].keys()):
> +                outfile.writelines(f"\t{symbol}\n")
> +            del symbols[key]
> +else:
> +    with open(sys.argv[3], "w") as outfile:
> +        local_token = False
> +        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
> +            if key not in symbols:
> +                continue
> +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> +            for symbol in sorted(symbols[key].keys()):
> +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> +                    prefix = '__emutls_v.'
> +                else:
> +                    prefix = ''
> +                outfile.writelines(f"\t{prefix}{symbol};")
> +                comment = symbols[key][symbol]
> +                if comment:
> +                    outfile.writelines(f"{comment}")
> +                outfile.writelines("\n")

How about using "" rather than None for the default comment so you can
always just do a print of "{prefix}{symbol};{comment}\n". The fact that
writelines doesn't output a "\n" is a little confusing here, so maybe use
"print" instead.

	print(f"\t{prefix}{symbol};{comment}", file=outfile)

> +            outfile.writelines("\n")
> +            if not local_token:
> +                outfile.writelines("\tlocal: *;\n")
> +                local_token = True
> +            outfile.writelines("};\n")
> +            del symbols[key]
> +        for key in sorted(symbols.keys()):
> +            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
> +            for symbol in sorted(symbols[key].keys()):
> +                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
> +                    prefix = '__emutls_v.'
> +                else:
> +                    prefix = ''
> +                outfile.writelines(f"\t{prefix}{symbol};")
> +                comment = symbols[key][symbol]
> +                if comment:
> +                    outfile.writelines(f"{comment}")
> +                outfile.writelines("\n")
> +            outfile.writelines(f"}} {abi};\n")
> +            if not local_token:
> +                outfile.writelines("\tlocal: *;\n")
> +                local_token = True
> +            del symbols[key]
> +        # No exported symbol, add a catch all
> +        if not local_token:
> +            outfile.writelines(f"{abi} {{\n")
> +            outfile.writelines("\tlocal: *;\n")
> +            local_token = True
> +            outfile.writelines("};\n")
> diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
> index eb98451d8e..0829df4be5 100755
> --- a/buildtools/map-list-symbol.sh
> +++ b/buildtools/map-list-symbol.sh
> @@ -62,10 +62,14 @@ for file in $@; do
>  		if (current_section == "") {
>  			next;
>  		}
> +		symbol_version = current_version
> +		if (/^[^}].*[^:*]; # added in /) {
> +			symbol_version = $5
> +		}
>  		if ("'$version'" != "") {
> -			if ("'$version'" == "unset" && current_version != "") {
> +			if ("'$version'" == "unset" && symbol_version != "") {
>  				next;
> -			} else if ("'$version'" != "unset" && "'$version'" != current_version) {
> +			} else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
>  				next;
>  			}
>  		}
> @@ -73,7 +77,7 @@ for file in $@; do
>  		if ("'$symbol'" == "all" || $1 == "'$symbol'") {
>  			ret = 0;
>  			if ("'$quiet'" == "") {
> -				print "'$file' "current_section" "$1" "current_version;
> +				print "'$file' "current_section" "$1" "symbol_version;
>  			}
>  			if ("'$symbol'" != "all") {
>  				exit 0;
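
For reference, a generated map carrying the per-symbol annotation this awk
change parses would look roughly as follows (symbol names and versions are
hypothetical):

```
DPDK_25 {
	global:

	rte_foo;

	local: *;
};

EXPERIMENTAL {
	global:

	rte_foo_new; # added in 25.03
};
```

The `# added in` comment lets the script report the introduction version for
an experimental symbol instead of the section-wide version.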
> diff --git a/buildtools/meson.build b/buildtools/meson.build
> index 4e2c1217a2..b745e9afa4 100644
> --- a/buildtools/meson.build
> +++ b/buildtools/meson.build
> @@ -16,6 +16,7 @@ else
>      py3 = ['meson', 'runpython']
>  endif
>  echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
> +gen_version_map = py3 + files('gen-version-map.py')
>  list_dir_globs = py3 + files('list-dir-globs.py')
>  map_to_win_cmd = py3 + files('map_to_win.py')
>  sphinx_wrapper = py3 + files('call-sphinx-build.py')
> diff --git a/config/meson.build b/config/meson.build
> index f31fef216c..54657055fb 100644
> --- a/config/meson.build
> +++ b/config/meson.build
> @@ -303,8 +303,10 @@ endif
>  # add -include rte_config to cflags
>  if is_ms_compiler
>      add_project_arguments('/FI', 'rte_config.h', language: 'c')
> +    add_project_arguments('/FI', 'rte_export.h', language: 'c')
>  else
>      add_project_arguments('-include', 'rte_config.h', language: 'c')
> +    add_project_arguments('-include', 'rte_export.h', language: 'c')
>  endif
>  
>  # enable extra warnings and disable any unwanted warnings
> diff --git a/config/rte_export.h b/config/rte_export.h
> new file mode 100644
> index 0000000000..83d871fe11
> --- /dev/null
> +++ b/config/rte_export.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2025 Red Hat, Inc.
> + */
> +
> +#ifndef RTE_EXPORT_H
> +#define RTE_EXPORT_H
> +
> +/* *Internal* macros for exporting symbols, used by the build system.
> + * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
> + * version this symbol was introduced in.
> + */
> +#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
> +#define RTE_EXPORT_INTERNAL_SYMBOL(a)
> +#define RTE_EXPORT_SYMBOL(a)
> +
> +#endif /* RTE_EXPORT_H */
> diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
> new file mode 100755
> index 0000000000..09709e4f06
> --- /dev/null
> +++ b/devtools/check-symbol-change.py
> @@ -0,0 +1,90 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2025 Red Hat, Inc.
> +
> +"""Check exported symbols change in a patch."""
> +
> +import re
> +import sys
> +
> +file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
> +# From rte_export.h
> +export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
> +export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
> +export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
> +# TODO, handle versioned symbols from rte_function_versioning.h
> +# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
> +# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
> +
> +symbols = {}
> +
> +for file in sys.argv[1:]:
> +    with open(file, encoding="utf-8") as f:
> +        for ln in f.readlines():
> +            if file_header_regexp.match(ln):
> +                if file_header_regexp.match(ln).group(2) == "lib":
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> +                elif file_header_regexp.match(ln).group(3) == "intel":
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
> +                else:
> +                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
> +
> +                if lib not in symbols:
> +                    symbols[lib] = {}
> +                continue
> +
> +            if export_exp_sym_regexp.match(ln):
> +                symbol = export_exp_sym_regexp.match(ln).group(1)
> +                node = 'EXPERIMENTAL'
> +            elif export_int_sym_regexp.match(ln):
> +                node = 'INTERNAL'
> +                symbol = export_int_sym_regexp.match(ln).group(1)
> +            elif export_sym_regexp.match(ln):
> +                symbol = export_sym_regexp.match(ln).group(1)
> +                node = 'stable'
> +            else:
> +                continue
> +
> +            if symbol not in symbols[lib]:
> +                symbols[lib][symbol] = {}
> +            added = ln[0] == '+'
> +            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
> +                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
> +            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
> +                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
> +            if added:
> +                symbols[lib][symbol]['added'] = node
> +            else:
> +                symbols[lib][symbol]['removed'] = node
> +
> +    for lib in sorted(symbols.keys()):
> +        error = False
> +        for symbol in sorted(symbols[lib].keys()):
> +            if 'removed' not in symbols[lib][symbol]:
> +                # Symbol addition
> +                node = symbols[lib][symbol]['added']
> +                if node == 'stable':
> +                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
> +                    error = True
> +                else:
> +                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
> +                continue
> +
> +            if 'added' not in symbols[lib][symbol]:
> +                # Symbol removal
> +                node = symbols[lib][symbol]['added']
> +                if node == 'stable':
> +                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")
> +                    print(f"Please check it has gone though the deprecation process.")
> +                continue
> +
> +            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
> +                # Symbol was moved around
> +                continue
> +
> +            # Symbol modifications
> +            added = symbols[lib][symbol]['added']
> +            removed = symbols[lib][symbol]['removed']
> +            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")
> +            print(f"Please check it has gone though the deprecation process.")
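
Run against a patch adding one experimental symbol and one symbol placed
straight into the stable section, the checker's output would look something
like this (library and symbol names made up):

```
INFO: rte_foo_new in lib/acl has been added to EXPERIMENTAL ABI.
ERROR: rte_bar in lib/eal has been added directly to stable ABI.
```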
> diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
> index 6121f78ec6..fcd3931e5d 100755
> --- a/devtools/check-symbol-maps.sh
> +++ b/devtools/check-symbol-maps.sh
> @@ -60,20 +60,6 @@ if [ -n "$local_miss_maps" ] ; then
>      ret=1
>  fi
>  
> -find_empty_maps ()
> -{
> -    for map in $@ ; do
> -        [ $(buildtools/map-list-symbol.sh $map | wc -l) != '0' ] || echo $map
> -    done
> -}
> -
> -empty_maps=$(find_empty_maps $@)
> -if [ -n "$empty_maps" ] ; then
> -    echo "Found empty maps:"
> -    echo "$empty_maps"
> -    ret=1
> -fi
> -
>  find_bad_format_maps ()
>  {
>      abi_version=$(cut -d'.' -f 1 ABI_VERSION)
> diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> index 003bb49e04..7dcac7c8c9 100755
> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> @@ -33,7 +33,7 @@ VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,STRLCPY,\
>  PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
>  SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
>  LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
> -NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG"
> +NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL"
>  options="$options $DPDK_CHECKPATCH_OPTIONS"
>  
>  print_usage () {
> diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
> index 88dd776b4c..addbb24b9e 100644
> --- a/doc/guides/contributing/abi_versioning.rst
> +++ b/doc/guides/contributing/abi_versioning.rst
> @@ -58,12 +58,12 @@ persists over multiple releases.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map

I must admit I'm not a fan of these long filenames. How about just
"acl_exports.map"?

>   DPDK_21 {
>          global:
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -77,7 +77,7 @@ that library.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -88,7 +88,7 @@ that library.
>   } DPDK_21;
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_21 {
>          global:
>   ...
> @@ -100,12 +100,12 @@ how this may be done.
>  
>  .. code-block:: none
>  
> - $ head ./lib/acl/version.map
> + $ head ./build/lib/librte_acl_exports.map
>   DPDK_22 {
>          global:
>   ...
>  
> - $ head ./lib/eal/version.map
> + $ head ./build/lib/librte_eal_exports.map
>   DPDK_22 {
>          global:
>   ...
> @@ -134,8 +134,7 @@ linked to the DPDK.
>  
>  To support backward compatibility the ``rte_function_versioning.h``
>  header file provides macros to use when updating exported functions. These
> -macros are used in conjunction with the ``version.map`` file for
> -a given library to allow multiple versions of a symbol to exist in a shared
> +macros allow multiple versions of a symbol to exist in a shared
>  library so that older binaries need not be immediately recompiled.
>  
>  The macros are:
> @@ -169,6 +168,7 @@ Assume we have a function as follows
>    * Create an acl context object for apps to
>    * manipulate
>    */
> + RTE_EXPORT_SYMBOL(rte_acl_create)
>   struct rte_acl_ctx *
>   rte_acl_create(const struct rte_acl_param *param)
>   {
> @@ -187,6 +187,7 @@ private, is safe), but it also requires modifying the code as follows
>    * Create an acl context object for apps to
>    * manipulate
>    */
> + RTE_EXPORT_SYMBOL(rte_acl_create)
>   struct rte_acl_ctx *
>   rte_acl_create(const struct rte_acl_param *param, int debug)
>   {
> @@ -203,78 +204,16 @@ The addition of a parameter to the function is ABI breaking as the function is
>  public, and existing application may use it in its current form. However, the
>  compatibility macros in DPDK allow a developer to use symbol versioning so that
>  multiple functions can be mapped to the same public symbol based on when an
> -application was linked to it. To see how this is done, we start with the
> -requisite libraries version map file. Initially the version map file for the acl
> -library looks like this
> +application was linked to it.
>  
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_create;
> -        rte_acl_dump;
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> -   };
> -
> -This file needs to be modified as follows
> -
> -.. code-block:: none
> -
> -   DPDK_21 {
> -        global:
> -
> -        rte_acl_add_rules;
> -        rte_acl_build;
> -        rte_acl_classify;
> -        rte_acl_classify_alg;
> -        rte_acl_classify_scalar;
> -        rte_acl_create;
> -        rte_acl_dump;
> -        rte_acl_find_existing;
> -        rte_acl_free;
> -        rte_acl_ipv4vlan_add_rules;
> -        rte_acl_ipv4vlan_build;
> -        rte_acl_list_dump;
> -        rte_acl_reset;
> -        rte_acl_reset_rules;
> -        rte_acl_set_ctx_classify;
> -
> -        local: *;
> -   };
> -
> -   DPDK_22 {
> -        global:
> -        rte_acl_create;
> -
> -   } DPDK_21;
> -
> -The addition of the new block tells the linker that a new version node
> -``DPDK_22`` is available, which contains the symbol rte_acl_create, and inherits
> -the symbols from the DPDK_21 node. This list is directly translated into a
> -list of exported symbols when DPDK is compiled as a shared library.
> -
> -Next, we need to specify in the code which function maps to the rte_acl_create
> +We need to specify in the code which function maps to the rte_acl_create
>  symbol at which versions.  First, at the site of the initial symbol definition,
>  we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
> -the function return type, and the function name and its arguments.
> +the function return type, the function name and its arguments.

Good fix, though technically not relevant to this patch.

>  
>  .. code-block:: c
>  
> + -RTE_EXPORT_SYMBOL(rte_acl_create)
>   -struct rte_acl_ctx *
>   -rte_acl_create(const struct rte_acl_param *param)
>   +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
> @@ -293,6 +232,7 @@ We have now mapped the original rte_acl_create symbol to the original function
>  
>  Please see the section :ref:`Enabling versioning macros
>  <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
> +

Ditto.

>  Next, we need to create the new version of the symbol. We create a new
>  function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
>  
> @@ -312,9 +252,9 @@ The macro instructs the linker to create the new default symbol
>  ``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
>  (declared by the macro).
>  
> -And that's it, on the next shared library rebuild, there will be two versions of
> -rte_acl_create, an old DPDK_21 version, used by previously built applications,
> -and a new DPDK_22 version, used by future built applications.
> +And that's it. On the next shared library rebuild, there will be two versions of rte_acl_create,
> +an old DPDK_21 version, used by previously built applications, and a new DPDK_22 version,
> +used by future built applications.

nit: not sure what others think but "future built" sounds strange to me?
How about "later built" or "newly built"?

>  
>  .. note::
>  
> @@ -364,6 +304,7 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
>      * Create an acl context object for apps to
>      * manipulate
>      */
<snip>

^ permalink raw reply	[relevance 0%]

* Re: [RFC v3 3/8] eal: rework function versioning macros
  2025-03-13 16:53  0%     ` Bruce Richardson
@ 2025-03-13 17:09  0%       ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-13 17:09 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, thomas, andremue, Tyler Retzlaff, Jasvinder Singh

On Thu, Mar 13, 2025 at 5:54 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
> > diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
> > index 7afd1c1886..88dd776b4c 100644
> > --- a/doc/guides/contributing/abi_versioning.rst
> > +++ b/doc/guides/contributing/abi_versioning.rst
> > @@ -138,27 +138,20 @@ macros are used in conjunction with the ``version.map`` file for
> >  a given library to allow multiple versions of a symbol to exist in a shared
> >  library so that older binaries need not be immediately recompiled.
> >
> > -The macros exported are:
> > +The macros are:
> >
> > -* ``VERSION_SYMBOL(b, e, n)``: Creates a symbol version table entry binding
> > -  versioned symbol ``b@DPDK_n`` to the internal function ``be``.
> > +* ``RTE_VERSION_SYMBOL(ver, type, name, args``: Creates a symbol version table
>
> Missing closing brace .........................^ here
>
> > +  entry binding symbol ``<name>@DPDK_<ver>`` to the internal function name
> > +  ``<name>_v<ver>``.
> >
> > -* ``BIND_DEFAULT_SYMBOL(b, e, n)``: Creates a symbol version entry instructing
> > -  the linker to bind references to symbol ``b`` to the internal symbol
> > -  ``be``.
> > +* ``RTE_DEFAULT_SYMBO(ver, type, name, args)``: Creates a symbol version entry
>
> s/SYMBO/SYMBOL/

Good catch, thanks...

>
> > +  instructing the linker to bind references to symbol ``<name>`` to the internal
> > +  symbol ``<name>_v<ver>``.
> >
> > -* ``MAP_STATIC_SYMBOL(f, p)``: Declare the prototype ``f``, and map it to the
> > -  fully qualified function ``p``, so that if a symbol becomes versioned, it
> > -  can still be mapped back to the public symbol name.
> > -
> > -* ``__vsym``:  Annotation to be used in a declaration of the internal symbol
> > -  ``be`` to signal that it is being used as an implementation of a particular
> > -  version of symbol ``b``.
> > -
> > -* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> > -  binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> > -  The macro is used when a symbol matures to become part of the stable ABI, to
> > -  provide an alias to experimental until the next major ABI version.
> > +* ``RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args)``:  Similar to RTE_VERSION_SYMBOL
> > +  but for experimental API symbols. The macro is used when a symbol matures
> > +  to become part of the stable ABI, to provide an alias to experimental
> > +  until the next major ABI version.
>
> Just to clarify - this is where we create two names/aliases for the one
> function, so it can be found either as an experimental version or a stable
> one, right? In that way it's actually quite different from
RTE_VERSION_SYMBOL which is used to define a *new* version of an existing
function, i.e. two functions rather than one function with two names.

I did not change the behavior; it still maps a symbol to an actual
implementation (there is an example later in this doc).
The previous description about an alias was probably inaccurate.


>
> >
> >  .. _example_abi_macro_usage:
> >
> > @@ -277,49 +270,36 @@ list of exported symbols when DPDK is compiled as a shared library.
> >
> >  Next, we need to specify in the code which function maps to the rte_acl_create
> >  symbol at which versions.  First, at the site of the initial symbol definition,
> > -we need to update the function so that it is uniquely named, and not in conflict
> > -with the public symbol name
> > +we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
> > +the function return type, and the function name and its arguments.
> >
> >  .. code-block:: c
> >
> >   -struct rte_acl_ctx *
> >   -rte_acl_create(const struct rte_acl_param *param)
> > - +struct rte_acl_ctx * __vsym
> > - +rte_acl_create_v21(const struct rte_acl_param *param)
> > + +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
> >   {
> >          size_t sz;
> >          struct rte_acl_ctx *ctx;
> >          ...
> > -
> > -Note that the base name of the symbol was kept intact, as this is conducive to
> > -the macros used for versioning symbols and we have annotated the function as
> > -``__vsym``, an implementation of a versioned symbol . That is our next step,
> > -mapping this new symbol name to the initial symbol name at version node 21.
> > -Immediately after the function, we add the VERSION_SYMBOL macro.
> > -
> > -.. code-block:: c
> > -
> > -   #include <rte_function_versioning.h>
> > -
> > -   ...
> > -   VERSION_SYMBOL(rte_acl_create, _v21, 21);
> > + }
> >
> >  Remembering to also add the rte_function_versioning.h header to the requisite c
> >  file where these changes are being made. The macro instructs the linker to
> >  create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
> > -in older builds, but now points to the above newly named function. We have now
> > -mapped the original rte_acl_create symbol to the original function (but with a
> > -new name).
> > +in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
> > +We have now mapped the original rte_acl_create symbol to the original function
> > +(but with a new name).
> >
> >  Please see the section :ref:`Enabling versioning macros
> >  <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
> > -Next, we need to create the new ``v22`` version of the symbol. We create a new
> > -function name, with the ``v22`` suffix, and implement it appropriately.
> > +Next, we need to create the new version of the symbol. We create a new
> > +function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
> >
> >  .. code-block:: c
> >
> > -   struct rte_acl_ctx * __vsym
> > -   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
> > +   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
> > +        int debug))
>
> Not directly relevant to the changes in this patch, but since this is
> documentation which doesn't actually need to be based on a real-life
> example, we should maybe come up with example functions which are short and
> don't need line wrapping. This example would be just as
> effective/instructive with a return value of "int" and parameter type
> without a "const" qualifier. :-)

Yep, I'll simplify the example.


>
> >     {
> >          struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
> >
> > @@ -328,35 +308,9 @@ function name, with the ``v22`` suffix, and implement it appropriately.
> >          return ctx;
> >     }
> >
> > -This code serves as our new API call. Its the same as our old call, but adds the
> > -new parameter in place. Next we need to map this function to the new default
> > -symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
> > -we add the BIND_DEFAULT_SYMBOL macro.
> > -
> > -.. code-block:: c
> > -
> > -   #include <rte_function_versioning.h>
> > -
> > -   ...
> > -   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
> > -
> >  The macro instructs the linker to create the new default symbol
> > -``rte_acl_create@DPDK_22``, which points to the above newly named function.
> > -
> > -We finally modify the prototype of the call in the public header file,
> > -such that it contains both versions of the symbol and the public API.
> > -
> > -.. code-block:: c
> > -
> > -   struct rte_acl_ctx *
> > -   rte_acl_create(const struct rte_acl_param *param);
> > -
> > -   struct rte_acl_ctx * __vsym
> > -   rte_acl_create_v21(const struct rte_acl_param *param);
> > -
> > -   struct rte_acl_ctx * __vsym
> > -   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
> > -
> > +``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
> > +(declared by the macro).
> >
> >  And that's it, on the next shared library rebuild, there will be two versions of
> >  rte_acl_create, an old DPDK_21 version, used by previously built applications,
> > @@ -365,43 +319,10 @@ and a new DPDK_22 version, used by future built applications.
> >  .. note::
> >
> >     **Before you leave**, please take care reviewing the sections on
> > -   :ref:`mapping static symbols <mapping_static_symbols>`,
> >     :ref:`enabling versioning macros <enabling_versioning_macros>`,
> >     and :ref:`ABI deprecation <abi_deprecation>`.
> >
> >
> > -.. _mapping_static_symbols:
> > -
> > -Mapping static symbols
> > -______________________
> > -
> > -Now we've taken what was a public symbol, and duplicated it into two uniquely
> > -and differently named symbols. We've then mapped each of those back to the
> > -public symbol ``rte_acl_create`` with different version tags. This only applies
> > -to dynamic linking, as static linking has no notion of versioning. That leaves
> > -this code in a position of no longer having a symbol simply named
> > -``rte_acl_create`` and a static build will fail on that missing symbol.
> > -
> > -To correct this, we can simply map a function of our choosing back to the public
> > -symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
> > -assumption is that the most recent version of the symbol is the one you want to
> > -map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
> > -defined, we add this
> > -
> > -
> > -.. code-block:: c
> > -
> > -   struct rte_acl_ctx * __vsym
> > -   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
> > -   {
> > -        ...
> > -   }
> > -   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
> > -
> > -That tells the compiler that, when building a static library, any calls to the
> > -symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
> > -
> > -
> >  .. _enabling_versioning_macros:
> >
> >  Enabling versioning macros
> > @@ -519,26 +440,17 @@ and ``DPDK_22`` version nodes.
> >      * Create an acl context object for apps to
> >      * manipulate
> >      */
> > -   struct rte_acl_ctx *
> > -   rte_acl_create(const struct rte_acl_param *param)
> > +   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create,
> > +        (const struct rte_acl_param *param))
> >     {
> >     ...
> >     }
> >
> > -   __rte_experimental
> > -   struct rte_acl_ctx *
> > -   rte_acl_create_e(const struct rte_acl_param *param)
> > -   {
> > -      return rte_acl_create(param);
> > -   }
> > -   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
> > -
> > -   struct rte_acl_ctx *
> > -   rte_acl_create_v22(const struct rte_acl_param *param)
> > +   RTE_VERSION_EXPERIMENTAL_SYMBOL(struct rte_acl_ctx *, rte_acl_create,
> > +        (const struct rte_acl_param *param))
> >     {
> >        return rte_acl_create(param);
> >     }
> > -   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
> >
> >  In the map file, we map the symbol to both the ``EXPERIMENTAL``
> >  and ``DPDK_22`` version nodes.
> > @@ -564,13 +476,6 @@ and ``DPDK_22`` version nodes.
> >          rte_acl_create;
> >     };
> >
> > -.. note::
> > -
> > -   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
> > -   when aliasing to experimental you will also need to take care of
> > -   :ref:`mapping static symbols <mapping_static_symbols>`.
> > -
> > -
> >  .. _abi_deprecation:
> >
> >  Deprecating part of a public API
> > @@ -616,10 +521,10 @@ Next remove the corresponding versioned export.
> >
> >  .. code-block:: c
> >
> > - -VERSION_SYMBOL(rte_acl_create, _v21, 21);
> > + -RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
> >
> >
> > -Note that the internal function definition could also be removed, but its used
> > +Note that the internal function definition must also be removed, but its used
>
> its -> it's (or "it is" if you want the longer version).

Ok, I'll fix this too.

>
> >  in our example by the newer version ``v22``, so we leave it in place and declare
> >  it as static. This is a coding style choice.
> >

[snip]

> > diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
> > index eb6dd2bc17..0020ce4885 100644
> > --- a/lib/eal/include/rte_function_versioning.h
> > +++ b/lib/eal/include/rte_function_versioning.h
> > @@ -11,8 +11,6 @@
> >  #error Use of function versioning disabled, is "use_function_versioning=true" in meson.build?
> >  #endif
> >
> > -#ifdef RTE_BUILD_SHARED_LIB
> > -
> >  /*
> >   * Provides backwards compatibility when updating exported functions.
> >   * When a symbol is exported from a library to provide an API, it also provides a
> > @@ -20,80 +18,54 @@
> >   * arguments, etc.  On occasion that function may need to change to accommodate
> >   * new functionality, behavior, etc.  When that occurs, it is desirable to
> >   * allow for backwards compatibility for a time with older binaries that are
> > - * dynamically linked to the dpdk.  To support that, the __vsym and
> > - * VERSION_SYMBOL macros are created.  They, in conjunction with the
> > - * version.map file for a given library allow for multiple versions of
> > - * a symbol to exist in a shared library so that older binaries need not be
> > - * immediately recompiled.
> > - *
> > - * Refer to the guidelines document in the docs subdirectory for details on the
> > - * use of these macros
> > + * dynamically linked to the dpdk.
> >   */
> >
> > -/*
> > - * Macro Parameters:
> > - * b - function base name
> > - * e - function version extension, to be concatenated with base name
> > - * n - function symbol version string to be applied
> > - * f - function prototype
> > - * p - full function symbol name
> > - */
> > +#ifdef RTE_BUILD_SHARED_LIB
> >
> >  /*
> > - * VERSION_SYMBOL
> > - * Creates a symbol version table entry binding symbol <b>@DPDK_<n> to the internal
> > - * function name <b><e>
> > + * RTE_VERSION_SYMBOL
> > + * Creates a symbol version table entry binding symbol <name>@DPDK_<ver> to the internal
> > + * function name <name>_v<ver>.
> >   */
> > -#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
> > +#define RTE_VERSION_SYMBOL(ver, type, name, args) \
> > +__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver)); \
> > +__rte_used type name ## _v ## ver args; \
> > +type name ## _v ## ver args
> >
> >  /*
> > - * VERSION_SYMBOL_EXPERIMENTAL
> > - * Creates a symbol version table entry binding the symbol <b>@EXPERIMENTAL to the internal
> > - * function name <b><e>. The macro is used when a symbol matures to become part of the stable ABI,
> > - * to provide an alias to experimental for some time.
> > + * RTE_VERSION_EXPERIMENTAL_SYMBOL
> > + * Similar to RTE_VERSION_SYMBOL but for experimental API symbols.
> > + * This is mainly used for keeping compatibility for symbols that get promoted to stable ABI.
> >   */
> > -#define VERSION_SYMBOL_EXPERIMENTAL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@EXPERIMENTAL")
> > +#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
> > +__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL") \
> > +__rte_used type name ## _exp args; \
> > +type name ## _exp args
> >
> >  /*
> > - * BIND_DEFAULT_SYMBOL
> > + * RTE_DEFAULT_SYMBOL
> >   * Creates a symbol version entry instructing the linker to bind references to
> > - * symbol <b> to the internal symbol <b><e>
> > + * symbol <name> to the internal symbol <name>_v<ver>.
> >   */
> > -#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
> > +#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
> > +__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver)); \
> > +__rte_used type name ## _v ## ver args; \
> > +type name ## _v ## ver args
> >
> > -/*
> > - * __vsym
> > - * Annotation to be used in declaration of the internal symbol <b><e> to signal
> > - * that it is being used as an implementation of a particular version of symbol
> > - * <b>.
> > - */
> > -#define __vsym __rte_used
> > +#else /* !RTE_BUILD_SHARED_LIB */
> >
> > -/*
> > - * MAP_STATIC_SYMBOL
> > - * If a function has been bifurcated into multiple versions, none of which
> > - * are defined as the exported symbol name in the map file, this macro can be
> > - * used to alias a specific version of the symbol to its exported name.  For
> > - * example, if you have 2 versions of a function foo_v1 and foo_v2, where the
> > - * former is mapped to foo@DPDK_1 and the latter is mapped to foo@DPDK_2 when
> > - * building a shared library, this macro can be used to map either foo_v1 or
> > - * foo_v2 to the symbol foo when building a static library, e.g.:
> > - * MAP_STATIC_SYMBOL(void foo(), foo_v2);
> > - */
> > -#define MAP_STATIC_SYMBOL(f, p)
> > +#define RTE_VERSION_SYMBOL(ver, type, name, args) \
> > +type name ## _v ## ver args; \
> > +type name ## _v ## ver args
> >
> > -#else
> > -/*
> > - * No symbol versioning in use
> > - */
> > -#define VERSION_SYMBOL(b, e, n)
> > -#define VERSION_SYMBOL_EXPERIMENTAL(b, e)
> > -#define __vsym
> > -#define BIND_DEFAULT_SYMBOL(b, e, n)
> > -#define MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(RTE_STR(p))))
> > -/*
> > - * RTE_BUILD_SHARED_LIB=n
> > - */
> > -#endif
> > +#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
> > +type name ## _exp args; \
> > +type name ## _exp args
> > +
> > +#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
> > +type name args
> > +
> > +#endif /* RTE_BUILD_SHARED_LIB */
> >
> >  #endif /* _RTE_FUNCTION_VERSIONING_H_ */
>
> Changes to this file look ok to me.

Thank you Bruce.


-- 
David Marchand



* Re: [RFC v3 3/8] eal: rework function versioning macros
  2025-03-11  9:56 13%   ` [RFC v3 3/8] eal: rework function versioning macros David Marchand
@ 2025-03-13 16:53  0%     ` Bruce Richardson
  2025-03-13 17:09  0%       ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2025-03-13 16:53 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, thomas, andremue, Tyler Retzlaff, Jasvinder Singh

On Tue, Mar 11, 2025 at 10:56:01AM +0100, David Marchand wrote:
> For versioning symbols:
> - MSVC uses pragmas on the symbol,
> - GNU linker uses special asm directives,
> 
> To accommodate both GNU linker and MSVC linker, introduce new macros for
> exporting and versioning symbols that will surround the whole function.
> 
> This has the advantage of hiding all the ugly details in the macros.
> Now versioning a symbol is just a call to a single macro:
> - RTE_VERSION_SYMBOL (resp. RTE_VERSION_EXPERIMENTAL_SYMBOL), for
>   keeping an old implementation code under a versioned function (resp.
>   experimental function),
> - RTE_DEFAULT_SYMBOL, for declaring the new default versioned function,
>   and handling the static link special case, instead of
>   BIND_DEFAULT_SYMBOL + MAP_STATIC_SYMBOL,
> 
> Update lib/net accordingly.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

A few review comments on the docs inline below. I see nothing wrong from
an initial review of the code changes.

/Bruce

> Changes since RFC v2:
> 
> Changes since RFC v1:
> - renamed and prefixed macros,
> - reindented in prevision of second patch,
> 
> ---
>  doc/guides/contributing/abi_versioning.rst | 165 +++++----------------
>  lib/eal/include/rte_function_versioning.h  |  96 +++++-------
>  lib/net/net_crc.h                          |  15 --
>  lib/net/rte_net_crc.c                      |  28 +---
>  4 files changed, 77 insertions(+), 227 deletions(-)
> 
> diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
> index 7afd1c1886..88dd776b4c 100644
> --- a/doc/guides/contributing/abi_versioning.rst
> +++ b/doc/guides/contributing/abi_versioning.rst
> @@ -138,27 +138,20 @@ macros are used in conjunction with the ``version.map`` file for
>  a given library to allow multiple versions of a symbol to exist in a shared
>  library so that older binaries need not be immediately recompiled.
>  
> -The macros exported are:
> +The macros are:
>  
> -* ``VERSION_SYMBOL(b, e, n)``: Creates a symbol version table entry binding
> -  versioned symbol ``b@DPDK_n`` to the internal function ``be``.
> +* ``RTE_VERSION_SYMBOL(ver, type, name, args``: Creates a symbol version table

Missing closing brace .........................^ here

> +  entry binding symbol ``<name>@DPDK_<ver>`` to the internal function name
> +  ``<name>_v<ver>``.
>  
> -* ``BIND_DEFAULT_SYMBOL(b, e, n)``: Creates a symbol version entry instructing
> -  the linker to bind references to symbol ``b`` to the internal symbol
> -  ``be``.
> +* ``RTE_DEFAULT_SYMBO(ver, type, name, args)``: Creates a symbol version entry

s/SYMBO/SYMBOL/

> +  instructing the linker to bind references to symbol ``<name>`` to the internal
> +  symbol ``<name>_v<ver>``.
>  
> -* ``MAP_STATIC_SYMBOL(f, p)``: Declare the prototype ``f``, and map it to the
> -  fully qualified function ``p``, so that if a symbol becomes versioned, it
> -  can still be mapped back to the public symbol name.
> -
> -* ``__vsym``:  Annotation to be used in a declaration of the internal symbol
> -  ``be`` to signal that it is being used as an implementation of a particular
> -  version of symbol ``b``.
> -
> -* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> -  binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> -  The macro is used when a symbol matures to become part of the stable ABI, to
> -  provide an alias to experimental until the next major ABI version.
> +* ``RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args)``:  Similar to RTE_VERSION_SYMBOL
> +  but for experimental API symbols. The macro is used when a symbol matures
> +  to become part of the stable ABI, to provide an alias to experimental
> +  until the next major ABI version.

Just to clarify - this is where we create two names/aliases for the one
function, so it can be found either as an experimental version or a stable
one, right? In that way it's actually quite different from
RTE_VERSION_SYMBOL which is used to define a *new* version of an existing
function, i.e. two functions rather than one function with two names.

>  
>  .. _example_abi_macro_usage:
>  
> @@ -277,49 +270,36 @@ list of exported symbols when DPDK is compiled as a shared library.
>  
>  Next, we need to specify in the code which function maps to the rte_acl_create
>  symbol at which versions.  First, at the site of the initial symbol definition,
> -we need to update the function so that it is uniquely named, and not in conflict
> -with the public symbol name
> +we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
> +the function return type, and the function name and its arguments.
>  
>  .. code-block:: c
>  
>   -struct rte_acl_ctx *
>   -rte_acl_create(const struct rte_acl_param *param)
> - +struct rte_acl_ctx * __vsym
> - +rte_acl_create_v21(const struct rte_acl_param *param)
> + +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
>   {
>          size_t sz;
>          struct rte_acl_ctx *ctx;
>          ...
> -
> -Note that the base name of the symbol was kept intact, as this is conducive to
> -the macros used for versioning symbols and we have annotated the function as
> -``__vsym``, an implementation of a versioned symbol . That is our next step,
> -mapping this new symbol name to the initial symbol name at version node 21.
> -Immediately after the function, we add the VERSION_SYMBOL macro.
> -
> -.. code-block:: c
> -
> -   #include <rte_function_versioning.h>
> -
> -   ...
> -   VERSION_SYMBOL(rte_acl_create, _v21, 21);
> + }
>  
>  Remembering to also add the rte_function_versioning.h header to the requisite c
>  file where these changes are being made. The macro instructs the linker to
>  create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
> -in older builds, but now points to the above newly named function. We have now
> -mapped the original rte_acl_create symbol to the original function (but with a
> -new name).
> +in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
> +We have now mapped the original rte_acl_create symbol to the original function
> +(but with a new name).
>  
>  Please see the section :ref:`Enabling versioning macros
>  <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
> -Next, we need to create the new ``v22`` version of the symbol. We create a new
> -function name, with the ``v22`` suffix, and implement it appropriately.
> +Next, we need to create the new version of the symbol. We create a new
> +function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
>  
>  .. code-block:: c
>  
> -   struct rte_acl_ctx * __vsym
> -   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
> +   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
> +        int debug))

Not directly relevant to the changes in this patch, but since this is
documentation which doesn't actually need to be based on a real-life
example, we should maybe come up with example functions which are short and
don't need line wrapping. This example would be just as
effective/instructive with a return value of "int" and parameter type
without a "const" qualifier. :-)

>     {
>          struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
>  
> @@ -328,35 +308,9 @@ function name, with the ``v22`` suffix, and implement it appropriately.
>          return ctx;
>     }
>  
> -This code serves as our new API call. Its the same as our old call, but adds the
> -new parameter in place. Next we need to map this function to the new default
> -symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
> -we add the BIND_DEFAULT_SYMBOL macro.
> -
> -.. code-block:: c
> -
> -   #include <rte_function_versioning.h>
> -
> -   ...
> -   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
> -
>  The macro instructs the linker to create the new default symbol
> -``rte_acl_create@DPDK_22``, which points to the above newly named function.
> -
> -We finally modify the prototype of the call in the public header file,
> -such that it contains both versions of the symbol and the public API.
> -
> -.. code-block:: c
> -
> -   struct rte_acl_ctx *
> -   rte_acl_create(const struct rte_acl_param *param);
> -
> -   struct rte_acl_ctx * __vsym
> -   rte_acl_create_v21(const struct rte_acl_param *param);
> -
> -   struct rte_acl_ctx * __vsym
> -   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
> -
> +``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
> +(declared by the macro).
>  
>  And that's it, on the next shared library rebuild, there will be two versions of
>  rte_acl_create, an old DPDK_21 version, used by previously built applications,
> @@ -365,43 +319,10 @@ and a new DPDK_22 version, used by future built applications.
>  .. note::
>  
>     **Before you leave**, please take care reviewing the sections on
> -   :ref:`mapping static symbols <mapping_static_symbols>`,
>     :ref:`enabling versioning macros <enabling_versioning_macros>`,
>     and :ref:`ABI deprecation <abi_deprecation>`.
>  
>  
> -.. _mapping_static_symbols:
> -
> -Mapping static symbols
> -______________________
> -
> -Now we've taken what was a public symbol, and duplicated it into two uniquely
> -and differently named symbols. We've then mapped each of those back to the
> -public symbol ``rte_acl_create`` with different version tags. This only applies
> -to dynamic linking, as static linking has no notion of versioning. That leaves
> -this code in a position of no longer having a symbol simply named
> -``rte_acl_create`` and a static build will fail on that missing symbol.
> -
> -To correct this, we can simply map a function of our choosing back to the public
> -symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
> -assumption is that the most recent version of the symbol is the one you want to
> -map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
> -defined, we add this
> -
> -
> -.. code-block:: c
> -
> -   struct rte_acl_ctx * __vsym
> -   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
> -   {
> -        ...
> -   }
> -   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
> -
> -That tells the compiler that, when building a static library, any calls to the
> -symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
> -
> -
>  .. _enabling_versioning_macros:
>  
>  Enabling versioning macros
> @@ -519,26 +440,17 @@ and ``DPDK_22`` version nodes.
>      * Create an acl context object for apps to
>      * manipulate
>      */
> -   struct rte_acl_ctx *
> -   rte_acl_create(const struct rte_acl_param *param)
> +   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create,
> +        (const struct rte_acl_param *param))
>     {
>     ...
>     }
>  
> -   __rte_experimental
> -   struct rte_acl_ctx *
> -   rte_acl_create_e(const struct rte_acl_param *param)
> -   {
> -      return rte_acl_create(param);
> -   }
> -   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
> -
> -   struct rte_acl_ctx *
> -   rte_acl_create_v22(const struct rte_acl_param *param)
> +   RTE_VERSION_EXPERIMENTAL_SYMBOL(struct rte_acl_ctx *, rte_acl_create,
> +        (const struct rte_acl_param *param))
>     {
>        return rte_acl_create(param);
>     }
> -   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
>  
>  In the map file, we map the symbol to both the ``EXPERIMENTAL``
>  and ``DPDK_22`` version nodes.
> @@ -564,13 +476,6 @@ and ``DPDK_22`` version nodes.
>          rte_acl_create;
>     };
>  
> -.. note::
> -
> -   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
> -   when aliasing to experimental you will also need to take care of
> -   :ref:`mapping static symbols <mapping_static_symbols>`.
> -
> -
>  .. _abi_deprecation:
>  
>  Deprecating part of a public API
> @@ -616,10 +521,10 @@ Next remove the corresponding versioned export.
>  
>  .. code-block:: c
>  
> - -VERSION_SYMBOL(rte_acl_create, _v21, 21);
> + -RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
>  
>  
> -Note that the internal function definition could also be removed, but its used
> +Note that the internal function definition must also be removed, but its used

its -> it's (or "it is" if you want the longer version).

>  in our example by the newer version ``v22``, so we leave it in place and declare
>  it as static. This is a coding style choice.
>  
> @@ -663,16 +568,18 @@ In the case of our map above, it would transform to look as follows
>          local: *;
>   };
>  
> -Then any uses of BIND_DEFAULT_SYMBOL that pointed to the old node should be
> +Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
>  updated to point to the new version node in any header files for all affected
>  symbols.
>  
>  .. code-block:: c
>  
> - -BIND_DEFAULT_SYMBOL(rte_acl_create, _v21, 21);
> - +BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
> + -RTE_DEFAULT_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
> +        int debug))
> + -RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
> +        int debug))
>  
> -Lastly, any VERSION_SYMBOL macros that point to the old version nodes
> +Lastly, any RTE_VERSION_SYMBOL macros that point to the old version nodes
>  should be removed, taking care to preserve any code that is shared
>  with the new version node.
>  
> diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
> index eb6dd2bc17..0020ce4885 100644
> --- a/lib/eal/include/rte_function_versioning.h
> +++ b/lib/eal/include/rte_function_versioning.h
> @@ -11,8 +11,6 @@
>  #error Use of function versioning disabled, is "use_function_versioning=true" in meson.build?
>  #endif
>  
> -#ifdef RTE_BUILD_SHARED_LIB
> -
>  /*
>   * Provides backwards compatibility when updating exported functions.
>   * When a symbol is exported from a library to provide an API, it also provides a
> @@ -20,80 +18,54 @@
>   * arguments, etc.  On occasion that function may need to change to accommodate
>   * new functionality, behavior, etc.  When that occurs, it is desirable to
>   * allow for backwards compatibility for a time with older binaries that are
> - * dynamically linked to the dpdk.  To support that, the __vsym and
> - * VERSION_SYMBOL macros are created.  They, in conjunction with the
> - * version.map file for a given library allow for multiple versions of
> - * a symbol to exist in a shared library so that older binaries need not be
> - * immediately recompiled.
> - *
> - * Refer to the guidelines document in the docs subdirectory for details on the
> - * use of these macros
> + * dynamically linked to the dpdk.
>   */
>  
> -/*
> - * Macro Parameters:
> - * b - function base name
> - * e - function version extension, to be concatenated with base name
> - * n - function symbol version string to be applied
> - * f - function prototype
> - * p - full function symbol name
> - */
> +#ifdef RTE_BUILD_SHARED_LIB
>  
>  /*
> - * VERSION_SYMBOL
> - * Creates a symbol version table entry binding symbol <b>@DPDK_<n> to the internal
> - * function name <b><e>
> + * RTE_VERSION_SYMBOL
> + * Creates a symbol version table entry binding symbol <name>@DPDK_<ver> to the internal
> + * function name <name>_v<ver>.
>   */
> -#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
> +#define RTE_VERSION_SYMBOL(ver, type, name, args) \
> +__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver)); \
> +__rte_used type name ## _v ## ver args; \
> +type name ## _v ## ver args
>  
>  /*
> - * VERSION_SYMBOL_EXPERIMENTAL
> - * Creates a symbol version table entry binding the symbol <b>@EXPERIMENTAL to the internal
> - * function name <b><e>. The macro is used when a symbol matures to become part of the stable ABI,
> - * to provide an alias to experimental for some time.
> + * RTE_VERSION_EXPERIMENTAL_SYMBOL
> + * Similar to RTE_VERSION_SYMBOL but for experimental API symbols.
> + * This is mainly used for keeping compatibility for symbols that get promoted to stable ABI.
>   */
> -#define VERSION_SYMBOL_EXPERIMENTAL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@EXPERIMENTAL")
> +#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
> +__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL") \
> +__rte_used type name ## _exp args; \
> +type name ## _exp args
>  
>  /*
> - * BIND_DEFAULT_SYMBOL
> + * RTE_DEFAULT_SYMBOL
>   * Creates a symbol version entry instructing the linker to bind references to
> - * symbol <b> to the internal symbol <b><e>
> + * symbol <name> to the internal symbol <name>_v<ver>.
>   */
> -#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
> +#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
> +__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver)); \
> +__rte_used type name ## _v ## ver args; \
> +type name ## _v ## ver args
>  
> -/*
> - * __vsym
> - * Annotation to be used in declaration of the internal symbol <b><e> to signal
> - * that it is being used as an implementation of a particular version of symbol
> - * <b>.
> - */
> -#define __vsym __rte_used
> +#else /* !RTE_BUILD_SHARED_LIB */
>  
> -/*
> - * MAP_STATIC_SYMBOL
> - * If a function has been bifurcated into multiple versions, none of which
> - * are defined as the exported symbol name in the map file, this macro can be
> - * used to alias a specific version of the symbol to its exported name.  For
> - * example, if you have 2 versions of a function foo_v1 and foo_v2, where the
> - * former is mapped to foo@DPDK_1 and the latter is mapped to foo@DPDK_2 when
> - * building a shared library, this macro can be used to map either foo_v1 or
> - * foo_v2 to the symbol foo when building a static library, e.g.:
> - * MAP_STATIC_SYMBOL(void foo(), foo_v2);
> - */
> -#define MAP_STATIC_SYMBOL(f, p)
> +#define RTE_VERSION_SYMBOL(ver, type, name, args) \
> +type name ## _v ## ver args; \
> +type name ## _v ## ver args
>  
> -#else
> -/*
> - * No symbol versioning in use
> - */
> -#define VERSION_SYMBOL(b, e, n)
> -#define VERSION_SYMBOL_EXPERIMENTAL(b, e)
> -#define __vsym
> -#define BIND_DEFAULT_SYMBOL(b, e, n)
> -#define MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(RTE_STR(p))))
> -/*
> - * RTE_BUILD_SHARED_LIB=n
> - */
> -#endif
> +#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
> +type name ## _exp args; \
> +type name ## _exp args
> +
> +#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
> +type name args
> +
> +#endif /* RTE_BUILD_SHARED_LIB */
>  
>  #endif /* _RTE_FUNCTION_VERSIONING_H_ */

Changes to this file look ok to me.

<snip>

^ permalink raw reply	[relevance 0%]

* Re: [EXTERNAL] Re: [patch v2 0/6] Support VMBUS channels without monitoring enabled
  2025-03-12  0:33  4%   ` [EXTERNAL] " Long Li
@ 2025-03-12 15:36  0%     ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2025-03-12 15:36 UTC (permalink / raw)
  To: Long Li; +Cc: longli, Wei Hu, dev

On Wed, 12 Mar 2025 00:33:52 +0000
Long Li <longli@microsoft.com> wrote:

> > Subject: [EXTERNAL] Re: [patch v2 0/6] Support VMBUS channels without
> > monitoring enabled
> > 
> > On Mon, 10 Mar 2025 14:42:51 -0700
> > longli@linuxonhyperv.com wrote:
> >   
> > > From: Long Li <longli@microsoft.com>
> > >
> > > Hyperv may expose VMBUS channels without monitoring enabled. In this
> > > case, it programs almost all the data traffic to VF.
> > >
> > > This patchset enabled vmbus/netvsc to use channels without monitoring
> > > enabled.  
> > 
> > 
> > CI still reports a build issue  
> 
> There are ABI changes to rte_vmbus_* calls. This patch added rte_vmbus_device* as the 1st parameter to those calls.
> 
> This will be a breaking change, and it only affects hn_netvsc as it's the only PMD using the vmbus.
> 
> Reading ./doc/guides/contributing/abi_policy.rst, I think the best option is to use RTE_NEXT_ABI. But I can't find its definition in the code base.
> 
> Please advise on how to proceed with making those breaking ABI changes.
> 
> Thanks,
> Long

Can't take it as is, here are some options:

1. Version the API even though it should only be used internally. Use API versioning
   as a transition until 25.11.
2. Wait for 25.11 and fix it then, doing the deprecation notice now.

3. Mark the APIs as internal (in 25.11) and do deprecation notice now.

4. Make new functions with different names, and mark old ones as deprecated, then remove in 25.11


^ permalink raw reply	[relevance 0%]

* RE: [EXTERNAL] Re: [patch v2 0/6] Support VMBUS channels without monitoring enabled
  @ 2025-03-12  0:33  4%   ` Long Li
  2025-03-12 15:36  0%     ` Stephen Hemminger
  0 siblings, 1 reply; 200+ results
From: Long Li @ 2025-03-12  0:33 UTC (permalink / raw)
  To: Stephen Hemminger, longli; +Cc: Wei Hu, dev

> Subject: [EXTERNAL] Re: [patch v2 0/6] Support VMBUS channels without
> monitoring enabled
> 
> On Mon, 10 Mar 2025 14:42:51 -0700
> longli@linuxonhyperv.com wrote:
> 
> > From: Long Li <longli@microsoft.com>
> >
> > Hyperv may expose VMBUS channels without monitoring enabled. In this
> > case, it programs almost all the data traffic to VF.
> >
> > This patchset enabled vmbus/netvsc to use channels without monitoring
> > enabled.
> 
> 
> CI still reports a build issue

There are ABI changes to rte_vmbus_* calls. This patch added rte_vmbus_device* as the 1st parameter to those calls.

This will be a breaking change, and it only affects hn_netvsc as it's the only PMD using the vmbus.

Reading ./doc/guides/contributing/abi_policy.rst, I think the best option is to use RTE_NEXT_ABI. But I can't find its definition in the code base.

Please advise on how to proceed with making those breaking ABI changes.

Thanks,
Long

^ permalink raw reply	[relevance 4%]

* Re: [RFC v3 0/8] Symbol versioning and export rework
  2025-03-11 10:18  3%   ` [RFC v3 0/8] Symbol versioning and export rework Morten Brørup
@ 2025-03-11 13:43  0%     ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-11 13:43 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, thomas, bruce.richardson, andremue

On Tue, Mar 11, 2025 at 11:18 AM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: David Marchand [mailto:david.marchand@redhat.com]
> > Sent: Tuesday, 11 March 2025 10.56
> >
> > So far, each DPDK library (or driver) exposing symbols in an ABI had to
> > maintain a version.map and use some macros for symbol versioning,
> > specially crafted with the GNU linker in mind.
> >
> > This series proposes to rework the whole principle, and instead rely on
> > marking the symbol exports in the source code itself, then leave it to the
> > build framework to produce a version script adapted to the linker in use
> > (think GNU linker vs MSVC linker).
> >
> > This greatly simplifies versioning symbols: a developer does not need to
> > know anything about version.map, or that a versioned symbol must be
> > renamed with _v26, annotated with __vsym, exported in a header etc...
> >
> > Checking symbol maps becomes unnecessary since generated by the build
> > framework.
> >
> > Updating to a new ABI is just a matter of bumping the value in
> > ABI_VERSION.
> >
> >
> > Comments please.
>
> Excellent. I'm all for automating this!
>
> Feature creep:
>
> Have you thought about how this (or related automation) can possibly also benefit the CI, e.g. for ABI breakage testing?
>
> Or possible benefits to (automated) documentation of versioned functions?
> Or possible benefits to remembering all versioned ABIs when writing the release notes?

Not really, this series is already touching enough code and needs detailed
reviews.
A simple ack is pointless.

I prefer focusing on just making this part right.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* RE: [RFC v3 0/8] Symbol versioning and export rework
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
                     ` (2 preceding siblings ...)
  2025-03-11  9:56 16%   ` [RFC v3 7/8] build: use dynamically generated version maps David Marchand
@ 2025-03-11 10:18  3%   ` Morten Brørup
  2025-03-11 13:43  0%     ` David Marchand
  3 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2025-03-11 10:18 UTC (permalink / raw)
  To: David Marchand, dev; +Cc: thomas, bruce.richardson, andremue

> From: David Marchand [mailto:david.marchand@redhat.com]
> Sent: Tuesday, 11 March 2025 10.56
> 
> So far, each DPDK library (or driver) exposing symbols in an ABI had to
> maintain a version.map and use some macros for symbol versioning,
> specially crafted with the GNU linker in mind.
> 
> This series proposes to rework the whole principle, and instead rely on
> marking the symbol exports in the source code itself, then leave it to the
> build framework to produce a version script adapted to the linker in use
> (think GNU linker vs MSVC linker).
> 
> This greatly simplifies versioning symbols: a developer does not need to
> know anything about version.map, or that a versioned symbol must be
> renamed with _v26, annotated with __vsym, exported in a header etc...
> 
> Checking symbol maps becomes unnecessary since generated by the build
> framework.
> 
> Updating to a new ABI is just a matter of bumping the value in
> ABI_VERSION.
> 
> 
> Comments please.

Excellent. I'm all for automating this!

Feature creep:

Have you thought about how this (or related automation) can possibly also benefit the CI, e.g. for ABI breakage testing?

Or possible benefits to (automated) documentation of versioned functions?
Or possible benefits to remembering all versioned ABIs when writing the release notes?


^ permalink raw reply	[relevance 3%]

* [RFC v3 7/8] build: use dynamically generated version maps
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
  2025-03-11  9:56 13%   ` [RFC v3 3/8] eal: rework function versioning macros David Marchand
  2025-03-11  9:56 18%   ` [RFC v3 5/8] build: generate symbol maps David Marchand
@ 2025-03-11  9:56 16%   ` David Marchand
  2025-03-11 10:18  3%   ` [RFC v3 0/8] Symbol versioning and export rework Morten Brørup
  3 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-11  9:56 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue, Aaron Conole, Michael Santana

Switch to always dynamically generate version maps.

As the map files get generated, tooling around checking, converting,
updating etc.. static version maps can be removed.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 .github/workflows/build.yml                   |   1 -
 MAINTAINERS                                   |   7 -
 buildtools/check-symbols.sh                   |  33 +-
 buildtools/map-list-symbol.sh                 |   7 +-
 buildtools/map_to_win.py                      |  41 ---
 buildtools/meson.build                        |   1 -
 devtools/check-symbol-change.sh               | 186 -----------
 devtools/check-symbol-maps.sh                 | 101 ------
 devtools/checkpatches.sh                      |   2 +-
 devtools/update-abi.sh                        |  46 ---
 devtools/update_version_map_abi.py            | 210 ------------
 doc/guides/contributing/abi_policy.rst        |  21 +-
 doc/guides/contributing/coding_style.rst      |   7 -
 .../contributing/img/patch_cheatsheet.svg     | 303 ++++++++----------
 doc/guides/contributing/patches.rst           |   6 +-
 drivers/meson.build                           |  74 ++---
 lib/meson.build                               |  73 ++---
 17 files changed, 188 insertions(+), 931 deletions(-)
 delete mode 100644 buildtools/map_to_win.py
 delete mode 100755 devtools/check-symbol-change.sh
 delete mode 100755 devtools/check-symbol-maps.sh
 delete mode 100755 devtools/update-abi.sh
 delete mode 100755 devtools/update_version_map_abi.py

diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index fba46b920f..e97b5cdb8b 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -31,7 +31,6 @@ jobs:
         failed=
         devtools/check-doc-vs-code.sh upstream/${{ env.REF_GIT_BRANCH }} || failed=true
         devtools/check-meson.py || failed=true
-        devtools/check-symbol-maps.sh || failed=true
         [ -z "$failed" ]
   ubuntu-vm-builds:
     name: ${{ join(matrix.config.*, '-') }}
diff --git a/MAINTAINERS b/MAINTAINERS
index 04772951d3..9474189035 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -88,7 +88,6 @@ M: Thomas Monjalon <thomas@monjalon.net>
 F: MAINTAINERS
 F: devtools/build-dict.sh
 F: devtools/check-abi.sh
-F: devtools/check-abi-version.sh
 F: devtools/check-doc-vs-code.sh
 F: devtools/check-dup-includes.sh
 F: devtools/check-maintainers.sh
@@ -96,17 +95,13 @@ F: devtools/check-forbidden-tokens.awk
 F: devtools/check-git-log.sh
 F: devtools/check-spdx-tag.sh
 F: devtools/check-symbol-change.py
-F: devtools/check-symbol-change.sh
-F: devtools/check-symbol-maps.sh
 F: devtools/checkpatches.sh
 F: devtools/get-maintainer.sh
 F: devtools/git-log-fixes.sh
 F: devtools/load-devel-config
 F: devtools/parse-flow-support.sh
 F: devtools/process-iwyu.py
-F: devtools/update-abi.sh
 F: devtools/update-patches.py
-F: devtools/update_version_map_abi.py
 F: devtools/libabigail.abignore
 F: devtools/words-case.txt
 F: license/
@@ -166,7 +161,6 @@ M: Tyler Retzlaff <roretzla@linux.microsoft.com>
 F: lib/eal/common/
 F: lib/eal/unix/
 F: lib/eal/include/
-F: lib/eal/version.map
 F: doc/guides/prog_guide/env_abstraction_layer.rst
 F: app/test/test_alarm.c
 F: app/test/test_atomic.c
@@ -396,7 +390,6 @@ Windows support
 M: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
 M: Tyler Retzlaff <roretzla@linux.microsoft.com>
 F: lib/eal/windows/
-F: buildtools/map_to_win.py
 F: doc/guides/windows_gsg/
 
 Windows memory allocation
diff --git a/buildtools/check-symbols.sh b/buildtools/check-symbols.sh
index b8ac24391e..0d6745ec14 100755
--- a/buildtools/check-symbols.sh
+++ b/buildtools/check-symbols.sh
@@ -7,29 +7,12 @@ OBJFILE=$2
 
 ROOTDIR=$(readlink -f $(dirname $(readlink -f $0))/..)
 LIST_SYMBOL=$ROOTDIR/buildtools/map-list-symbol.sh
-CHECK_SYMBOL_MAPS=$ROOTDIR/devtools/check-symbol-maps.sh
-
-# added check for "make -C test/" usage
-if [ ! -e $MAPFILE ] || [ ! -f $OBJFILE ]
-then
-	exit 0
-fi
-
-if [ -d $MAPFILE ]
-then
-	exit 0
-fi
-
 DUMPFILE=$(mktemp -t dpdk.${0##*/}.objdump.XXXXXX)
 trap 'rm -f "$DUMPFILE"' EXIT
 objdump -t $OBJFILE >$DUMPFILE
 
 ret=0
 
-if ! $CHECK_SYMBOL_MAPS $MAPFILE; then
-	ret=1
-fi
-
 for SYM in `$LIST_SYMBOL -S EXPERIMENTAL $MAPFILE |cut -d ' ' -f 3`
 do
 	if grep -q "\.text.*[[:space:]]$SYM$" $DUMPFILE &&
@@ -37,8 +20,7 @@ do
 		$LIST_SYMBOL -s $SYM $MAPFILE | grep -q EXPERIMENTAL
 	then
 		cat >&2 <<- END_OF_MESSAGE
-		$SYM is not flagged as experimental
-		but is listed in version map
+		$SYM is not flagged as experimental but is exported as an experimental symbol
 		Please add __rte_experimental to the definition of $SYM
 		END_OF_MESSAGE
 		ret=1
@@ -53,9 +35,8 @@ for SYM in `awk '{
 do
 	$LIST_SYMBOL -S EXPERIMENTAL -s $SYM -q $MAPFILE || {
 		cat >&2 <<- END_OF_MESSAGE
-		$SYM is flagged as experimental
-		but is not listed in version map
-		Please add $SYM to the version map
+		$SYM is flagged as experimental but is not exported as an experimental symbol
+		Please add RTE_EXPORT_EXPERIMENTAL_SYMBOL to the definition of $SYM
 		END_OF_MESSAGE
 		ret=1
 	}
@@ -67,8 +48,7 @@ do
 		! grep -q "\.text\.internal.*[[:space:]]$SYM$" $DUMPFILE
 	then
 		cat >&2 <<- END_OF_MESSAGE
-		$SYM is not flagged as internal
-		but is listed in version map
+		$SYM is not flagged as internal but is exported as an internal symbol
 		Please add __rte_internal to the definition of $SYM
 		END_OF_MESSAGE
 		ret=1
@@ -83,9 +63,8 @@ for SYM in `awk '{
 do
 	$LIST_SYMBOL -S INTERNAL -s $SYM -q $MAPFILE || {
 		cat >&2 <<- END_OF_MESSAGE
-		$SYM is flagged as internal
-		but is not listed in version map
-		Please add $SYM to the version map
+		$SYM is flagged as internal but is not exported as an internal symbol
+		Please add RTE_EXPORT_INTERNAL_SYMBOL to the definition of $SYM
 		END_OF_MESSAGE
 		ret=1
 	}
diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
index 0829df4be5..962d5f3271 100755
--- a/buildtools/map-list-symbol.sh
+++ b/buildtools/map-list-symbol.sh
@@ -42,7 +42,6 @@ for file in $@; do
 	cat "$file" |awk '
 	BEGIN {
 		current_section = "";
-		current_version = "";
 		if ("'$section'" == "all" && "'$symbol'" == "all" && "'$version'" == "") {
 			ret = 0;
 		} else {
@@ -54,15 +53,11 @@ for file in $@; do
 			current_section = $1;
 		}
 	}
-	/.*}/ { current_section = ""; current_version = ""; }
-	/^\t# added in / {
-		current_version=$4;
-	}
+	/.*}/ { current_section = ""; }
 	/^[^}].*[^:*];/ {
 		if (current_section == "") {
 			next;
 		}
-		symbol_version = current_version
 		if (/^[^}].*[^:*]; # added in /) {
 			symbol_version = $5
 		}
diff --git a/buildtools/map_to_win.py b/buildtools/map_to_win.py
deleted file mode 100644
index aa1752cacd..0000000000
--- a/buildtools/map_to_win.py
+++ /dev/null
@@ -1,41 +0,0 @@
-#!/usr/bin/env python3
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2019 Intel Corporation
-
-import sys
-
-
-def is_function_line(ln):
-    return ln.startswith('\t') and ln.endswith(';\n') and ":" not in ln and "# WINDOWS_NO_EXPORT" not in ln
-
-# MinGW keeps the original .map file but replaces per_lcore* to __emutls_v.per_lcore*
-def create_mingw_map_file(input_map, output_map):
-    with open(input_map) as f_in, open(output_map, 'w') as f_out:
-        f_out.writelines([lines.replace('per_lcore', '__emutls_v.per_lcore') for lines in f_in.readlines()])
-
-def main(args):
-    if not args[1].endswith('version.map') or \
-            not args[2].endswith('exports.def') and \
-            not args[2].endswith('mingw.map'):
-        return 1
-
-    if args[2].endswith('mingw.map'):
-        create_mingw_map_file(args[1], args[2])
-        return 0
-
-# generate def file from map file.
-# This works taking indented lines only which end with a ";" and which don't
-# have a colon in them, i.e. the lines defining functions only.
-    else:
-        with open(args[1]) as f_in:
-            functions = [ln[:-2] + '\n' for ln in sorted(f_in.readlines())
-                         if is_function_line(ln)]
-            functions = ["EXPORTS\n"] + functions
-
-    with open(args[2], 'w') as f_out:
-        f_out.writelines(functions)
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(main(sys.argv))
diff --git a/buildtools/meson.build b/buildtools/meson.build
index b745e9afa4..1cd1ce02fd 100644
--- a/buildtools/meson.build
+++ b/buildtools/meson.build
@@ -18,7 +18,6 @@ endif
 echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
 gen_version_map = py3 + files('gen-version-map.py')
 list_dir_globs = py3 + files('list-dir-globs.py')
-map_to_win_cmd = py3 + files('map_to_win.py')
 sphinx_wrapper = py3 + files('call-sphinx-build.py')
 get_cpu_count_cmd = py3 + files('get-cpu-count.py')
 get_numa_count_cmd = py3 + files('get-numa-count.py')
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
deleted file mode 100755
index 8992214ac8..0000000000
--- a/devtools/check-symbol-change.sh
+++ /dev/null
@@ -1,186 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 Neil Horman <nhorman@tuxdriver.com>
-
-build_map_changes()
-{
-	local fname="$1"
-	local mapdb="$2"
-
-	cat "$fname" | awk '
-		# Initialize our variables
-		BEGIN {map="";sym="";ar="";sec=""; in_sec=0; in_map=0}
-
-		# Anything that starts with + or -, followed by an a
-		# and ends in the string .map is the name of our map file
-		# This may appear multiple times in a patch if multiple
-		# map files are altered, and all section/symbol names
-		# appearing between a triggering of this rule and the
-		# next trigger of this rule are associated with this file
-		/[-+] [ab]\/.*\.map/ {map=$2; in_map=1; next}
-
-		# The previous rule catches all .map files, anything else
-		# indicates we left the map chunk.
-		/[-+] [ab]\// {in_map=0}
-
-		# Triggering this rule, which starts a line and ends it
-		# with a { identifies a versioned section.  The section name is
-		# the rest of the line with the + and { symbols removed.
-		# Triggering this rule sets in_sec to 1, which actives the
-		# symbol rule below
-		/^.*{/ {
-			gsub("+", "");
-			if (in_map == 1) {
-				sec=$(NF-1); in_sec=1;
-			}
-		}
-
-		# This rule identifies the end of a section, and disables the
-		# symbol rule
-		/.*}/ {in_sec=0}
-
-		# This rule matches on a + followed by any characters except a :
-		# (which denotes a global vs local segment), and ends with a ;.
-		# The semicolon is removed and the symbol is printed with its
-		# association file name and version section, along with an
-		# indicator that the symbol is a new addition.  Note this rule
-		# only works if we have found a version section in the rule
-		# above (hence the in_sec check) And found a map file (the
-		# in_map check).  If we are not in a map chunk, do nothing.  If
-		# we are in a map chunk but not a section chunk, record it as
-		# unknown.
-		/^+[^}].*[^:*];/ {gsub(";","");sym=$2;
-			if (in_map == 1) {
-				if (in_sec == 1) {
-					print map " " sym " " sec " add"
-				} else {
-					print map " " sym " unknown add"
-				}
-			}
-		}
-
-		# This is the same rule as above, but the rule matches on a
-		# leading - rather than a +, denoting that the symbol is being
-		# removed.
-		/^-[^}].*[^:*];/ {gsub(";","");sym=$2;
-			if (in_map == 1) {
-				if (in_sec == 1) {
-					print map " " sym " " sec " del"
-				} else {
-					print map " " sym " unknown del"
-				}
-			}
-		}' > "$mapdb"
-
-		sort -u "$mapdb" > "$mapdb.2"
-		mv -f "$mapdb.2" "$mapdb"
-
-}
-
-is_stable_section() {
-	[ "$1" != 'EXPERIMENTAL' ] && [ "$1" != 'INTERNAL' ]
-}
-
-check_for_rule_violations()
-{
-	local mapdb="$1"
-	local mname
-	local symname
-	local secname
-	local ar
-	local ret=0
-
-	while read mname symname secname ar
-	do
-		if [ "$ar" = "add" ]
-		then
-
-			if [ "$secname" = "unknown" ]
-			then
-				# Just inform the user of this occurrence, but
-				# don't flag it as an error
-				echo -n "INFO: symbol $symname is added but "
-				echo -n "patch has insufficient context "
-				echo -n "to determine the section name "
-				echo -n "please ensure the version is "
-				echo "EXPERIMENTAL"
-				continue
-			fi
-
-			oldsecname=$(sed -n \
-			"s#$mname $symname \(.*\) del#\1#p" "$mapdb")
-
-			# A symbol can not enter a stable section directly
-			if [ -z "$oldsecname" ]
-			then
-				if ! is_stable_section $secname
-				then
-					echo -n "INFO: symbol $symname has "
-					echo -n "been added to the "
-					echo -n "$secname section of the "
-					echo "version map"
-					continue
-				else
-					echo -n "ERROR: symbol $symname "
-					echo -n "is added in the $secname "
-					echo -n "section, but is expected to "
-					echo -n "be added in the EXPERIMENTAL "
-					echo "section of the version map"
-					ret=1
-					continue
-				fi
-			fi
-
-			# This symbol is moving inside a section, nothing to do
-			if [ "$oldsecname" = "$secname" ]
-			then
-				continue
-			fi
-
-			# This symbol is moving between two sections (the
-			# original section is a stable section).
-			# This can be legit, just warn.
-			if is_stable_section $oldsecname
-			then
-				echo -n "INFO: symbol $symname is being "
-				echo -n "moved from $oldsecname to $secname. "
-				echo -n "Ensure that it has gone through the "
-				echo "deprecation process"
-				continue
-			fi
-		else
-
-			if ! grep -q "$mname $symname .* add" "$mapdb" && \
-			   is_stable_section $secname
-			then
-				# Just inform users that stable
-				# symbols need to go through a deprecation
-				# process
-				echo -n "INFO: symbol $symname is being "
-				echo -n "removed, ensure that it has "
-				echo "gone through the deprecation process"
-			fi
-		fi
-	done < "$mapdb"
-
-	return $ret
-}
-
-trap clean_and_exit_on_sig EXIT
-
-mapfile=`mktemp -t dpdk.mapdb.XXXXXX`
-patch=$1
-exit_code=1
-
-clean_and_exit_on_sig()
-{
-	rm -f "$mapfile"
-	exit $exit_code
-}
-
-build_map_changes "$patch" "$mapfile"
-check_for_rule_violations "$mapfile"
-exit_code=$?
-rm -f "$mapfile"
-
-exit $exit_code
diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
deleted file mode 100755
index fcd3931e5d..0000000000
--- a/devtools/check-symbol-maps.sh
+++ /dev/null
@@ -1,101 +0,0 @@
-#! /bin/sh -e
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright 2018 Mellanox Technologies, Ltd
-
-cd $(dirname $0)/..
-
-# speed up by ignoring Unicode details
-export LC_ALL=C
-
-if [ $# = 0 ] ; then
-    set -- $(find lib drivers -name '*.map' -a ! -path drivers/version.map)
-fi
-
-ret=0
-
-find_orphan_symbols ()
-{
-    for map in $@ ; do
-        for sym in $(sed -rn 's,^([^}]*_.*);.*$,\1,p' $map) ; do
-            if echo $sym | grep -q '^per_lcore_' ; then
-                symsrc=${sym#per_lcore_}
-            elif echo $sym | grep -q '^__rte_.*_trace_' ; then
-                symsrc=${sym#__}
-            else
-                symsrc=$sym
-            fi
-            if [ -z "$(grep -rlw $symsrc $(dirname $map) | grep -v $map)" ] ; then
-                echo "$map: $sym"
-            fi
-        done
-    done
-}
-
-orphan_symbols=$(find_orphan_symbols $@)
-if [ -n "$orphan_symbols" ] ; then
-    echo "Found only in symbol map file:"
-    echo "$orphan_symbols" | sed 's,^,\t,'
-    ret=1
-fi
-
-find_duplicate_symbols ()
-{
-    for map in $@ ; do
-        buildtools/map-list-symbol.sh $map | \
-            sort | uniq -c | grep -v " 1 $map" || true
-    done
-}
-
-duplicate_symbols=$(find_duplicate_symbols $@)
-if [ -n "$duplicate_symbols" ] ; then
-    echo "Found duplicates in symbol map file:"
-    echo "$duplicate_symbols"
-    ret=1
-fi
-
-local_miss_maps=$(grep -L 'local: \*;' $@ || true)
-if [ -n "$local_miss_maps" ] ; then
-    echo "Found maps without local catch-all:"
-    echo "$local_miss_maps"
-    ret=1
-fi
-
-find_bad_format_maps ()
-{
-    abi_version=$(cut -d'.' -f 1 ABI_VERSION)
-    next_abi_version=$((abi_version + 1))
-    for map in $@ ; do
-        cat $map | awk '
-            /^(DPDK_('$abi_version'|'$next_abi_version')|EXPERIMENTAL|INTERNAL) \{$/ { next; } # start of a section
-            /^}( DPDK_'$abi_version')?;$/ { next; } # end of a section
-            /^$/ { next; } # empty line
-            /^\t(global:|local: \*;)$/ { next; } # qualifiers
-            /^\t[a-zA-Z_0-9]*;( # WINDOWS_NO_EXPORT)?$/ { next; } # symbols
-            /^\t# added in [0-9]*\.[0-9]*$/ { next; } # version comments
-            { print $0; }' || echo $map
-    done
-}
-
-bad_format_maps=$(find_bad_format_maps $@)
-if [ -n "$bad_format_maps" ] ; then
-    echo "Found badly formatted maps:"
-    echo "$bad_format_maps"
-    ret=1
-fi
-
-find_non_versioned_maps ()
-{
-    for map in $@ ; do
-        [ $(buildtools/map-list-symbol.sh -S EXPERIMENTAL -V unset $map | wc -l) = '0' ] ||
-            echo $map
-    done
-}
-
-non_versioned_maps=$(find_non_versioned_maps $@)
-if [ -n "$non_versioned_maps" ] ; then
-    echo "Found non versioned maps:"
-    echo "$non_versioned_maps"
-    ret=1
-fi
-
-exit $ret
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7dcac7c8c9..1f3c551b31 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -9,7 +9,7 @@
 # - DPDK_CHECKPATCH_OPTIONS
 . $(dirname $(readlink -f $0))/load-devel-config
 
-VALIDATE_NEW_API=$(dirname $(readlink -f $0))/check-symbol-change.sh
+VALIDATE_NEW_API=$(dirname $(readlink -f $0))/check-symbol-change.py
 
 # Enable codespell by default. This can be overwritten from a config file.
 # Codespell can also be enabled by setting DPDK_CHECKPATCH_CODESPELL to a valid path
diff --git a/devtools/update-abi.sh b/devtools/update-abi.sh
deleted file mode 100755
index 45437f3c3b..0000000000
--- a/devtools/update-abi.sh
+++ /dev/null
@@ -1,46 +0,0 @@
-#!/bin/sh -e
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2019 Intel Corporation
-
-abi_version=$1
-abi_version_file="./ABI_VERSION"
-update_path="lib drivers"
-
-# check ABI version format string
-check_abi_version() {
-      echo $1 | grep -q -e "^[[:digit:]]\{1,2\}\.[[:digit:]]\{1,2\}$"
-}
-
-if [ -z "$1" ]; then
-      # output to stderr
-      >&2 echo "Please provide ABI version"
-      exit 1
-fi
-
-# check version string format
-if ! check_abi_version $abi_version ; then
-      # output to stderr
-      >&2 echo "ABI version must be formatted as MAJOR.MINOR version"
-      exit 1
-fi
-
-if [ -n "$2" ]; then
-      abi_version_file=$2
-fi
-
-if [ -n "$3" ]; then
-      # drop $1 and $2
-      shift 2
-      # assign all other arguments as update paths
-      update_path=$@
-fi
-
-echo "New ABI version:" $abi_version
-echo "ABI_VERSION path:" $abi_version_file
-echo "Path to update:" $update_path
-
-echo $abi_version > $abi_version_file
-
-find $update_path -name version.map -exec \
-      devtools/update_version_map_abi.py {} \
-      $abi_version \; -print
diff --git a/devtools/update_version_map_abi.py b/devtools/update_version_map_abi.py
deleted file mode 100755
index d17b02a327..0000000000
--- a/devtools/update_version_map_abi.py
+++ /dev/null
@@ -1,210 +0,0 @@
-#!/usr/bin/env python3
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2019 Intel Corporation
-
-"""
-A Python program that updates and merges all available stable ABI versions into
-one ABI version, while leaving experimental ABI exactly as it is. The intended
-ABI version is supplied via command-line parameter. This script is to be called
-from the devtools/update-abi.sh utility.
-"""
-
-import argparse
-import sys
-import re
-
-
-def __parse_map_file(f_in):
-    # match function name, followed by semicolon, followed by EOL or comments,
-    # optionally with whitespace in between each item
-    func_line_regex = re.compile(r"\s*"
-                                 r"(?P<line>"
-                                 r"(?P<func>[a-zA-Z_0-9]+)"
-                                 r"\s*"
-                                 r";"
-                                 r"\s*"
-                                 r"(?P<comment>#.+)?"
-                                 r")"
-                                 r"\s*"
-                                 r"$")
-    # match section name, followed by opening bracked, followed by EOL,
-    # optionally with whitespace in between each item
-    section_begin_regex = re.compile(r"\s*"
-                                     r"(?P<version>[a-zA-Z0-9_\.]+)"
-                                     r"\s*"
-                                     r"{"
-                                     r"\s*"
-                                     r"$")
-    # match closing bracket, optionally followed by section name (for when we
-    # inherit from another ABI version), followed by semicolon, followed by
-    # EOL, optionally with whitespace in between each item
-    section_end_regex = re.compile(r"\s*"
-                                   r"}"
-                                   r"\s*"
-                                   r"(?P<parent>[a-zA-Z0-9_\.]+)?"
-                                   r"\s*"
-                                   r";"
-                                   r"\s*"
-                                   r"$")
-
-    # for stable ABI, we don't care about which version introduced which
-    # function, we just flatten the list. there are dupes in certain files, so
-    # use a set instead of a list
-    stable_lines = set()
-    # copy experimental section as is
-    experimental_lines = []
-    # copy internal section as is
-    internal_lines = []
-    in_experimental = False
-    in_internal = False
-    has_stable = False
-
-    # gather all functions
-    for line in f_in:
-        # clean up the line
-        line = line.strip('\n').strip()
-
-        # is this an end of section?
-        match = section_end_regex.match(line)
-        if match:
-            # whatever section this was, it's not active any more
-            in_experimental = False
-            in_internal = False
-            continue
-
-        # if we're in the middle of experimental section, we need to copy
-        # the section verbatim, so just add the line
-        if in_experimental:
-            experimental_lines += [line]
-            continue
-
-        # if we're in the middle of internal section, we need to copy
-        # the section verbatim, so just add the line
-        if in_internal:
-            internal_lines += [line]
-            continue
-
-        # skip empty lines
-        if not line:
-            continue
-
-        # is this a beginning of a new section?
-        match = section_begin_regex.match(line)
-        if match:
-            cur_section = match.group("version")
-            # is it experimental?
-            in_experimental = cur_section == "EXPERIMENTAL"
-            # is it internal?
-            in_internal = cur_section == "INTERNAL"
-            if not in_experimental and not in_internal:
-                has_stable = True
-            continue
-
-        # is this a function?
-        match = func_line_regex.match(line)
-        if match:
-            stable_lines.add(match.group("line"))
-
-    return has_stable, stable_lines, experimental_lines, internal_lines
-
-
-def __generate_stable_abi(f_out, abi_major, lines):
-    # print ABI version header
-    print("DPDK_{} {{".format(abi_major), file=f_out)
-
-    # print global section if it exists
-    if lines:
-        print("\tglobal:", file=f_out)
-        # blank line
-        print(file=f_out)
-
-        # print all stable lines, alphabetically sorted
-        for line in sorted(lines):
-            print("\t{}".format(line), file=f_out)
-
-        # another blank line
-        print(file=f_out)
-
-    # print local section
-    print("\tlocal: *;", file=f_out)
-
-    # end stable version
-    print("};", file=f_out)
-
-
-def __generate_experimental_abi(f_out, lines):
-    # start experimental section
-    print("EXPERIMENTAL {", file=f_out)
-
-    # print all experimental lines as they were
-    for line in lines:
-        # don't print empty whitespace
-        if not line:
-            print("", file=f_out)
-        else:
-            print("\t{}".format(line), file=f_out)
-
-    # end section
-    print("};", file=f_out)
-
-def __generate_internal_abi(f_out, lines):
-    # start internal section
-    print("INTERNAL {", file=f_out)
-
-    # print all internal lines as they were
-    for line in lines:
-        # don't print empty whitespace
-        if not line:
-            print("", file=f_out)
-        else:
-            print("\t{}".format(line), file=f_out)
-
-    # end section
-    print("};", file=f_out)
-
-def __main():
-    arg_parser = argparse.ArgumentParser(
-        description='Merge versions in linker version script.')
-
-    arg_parser.add_argument("map_file", type=str,
-                            help='path to linker version script file '
-                                 '(pattern: version.map)')
-    arg_parser.add_argument("abi_version", type=str,
-                            help='target ABI version (pattern: MAJOR.MINOR)')
-
-    parsed = arg_parser.parse_args()
-
-    if not parsed.map_file.endswith('version.map'):
-        print("Invalid input file: {}".format(parsed.map_file),
-              file=sys.stderr)
-        arg_parser.print_help()
-        sys.exit(1)
-
-    if not re.match(r"\d{1,2}\.\d{1,2}", parsed.abi_version):
-        print("Invalid ABI version: {}".format(parsed.abi_version),
-              file=sys.stderr)
-        arg_parser.print_help()
-        sys.exit(1)
-    abi_major = parsed.abi_version.split('.')[0]
-
-    with open(parsed.map_file) as f_in:
-        has_stable, stable_lines, experimental_lines, internal_lines = __parse_map_file(f_in)
-
-    with open(parsed.map_file, 'w') as f_out:
-        need_newline = has_stable and experimental_lines
-        if has_stable:
-            __generate_stable_abi(f_out, abi_major, stable_lines)
-        if need_newline:
-            # separate sections with a newline
-            print(file=f_out)
-        if experimental_lines:
-            __generate_experimental_abi(f_out, experimental_lines)
-        if internal_lines:
-            if has_stable or experimental_lines:
-              # separate sections with a newline
-              print(file=f_out)
-            __generate_internal_abi(f_out, internal_lines)
-
-
-if __name__ == "__main__":
-    __main()
diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index d96153c6b2..f03a7467ac 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -330,31 +330,14 @@ become part of a tracked ABI version.
 
 Note that marking an API as experimental is a multi step process.
 To mark an API as experimental, the symbols which are desired to be exported
-must be placed in an EXPERIMENTAL version block in the corresponding libraries'
-version map script.
+must be annotated with an RTE_EXPORT_EXPERIMENTAL_SYMBOL call in the
+corresponding library's sources.
 Experimental symbols must be commented so that it is clear in which DPDK
 version they were introduced.
 
-.. code-block:: none
-
-   EXPERIMENTAL {
-           global:
-
-           # added in 20.11
-           rte_foo_init;
-           rte_foo_configure;
-
-           # added in 21.02
-           rte_foo_cleanup;
-   ...
-
 Secondly, the corresponding prototypes of those exported functions (in the
 development header files), must be marked with the ``__rte_experimental`` tag
 (see ``rte_compat.h``).
-The DPDK build makefiles perform a check to ensure that the map file and the
-C code reflect the same list of symbols.
-This check can be circumvented by defining ``ALLOW_EXPERIMENTAL_API``
-during compilation in the corresponding library Makefile.
 
 In addition to tagging the code with ``__rte_experimental``,
 the doxygen markup must also contain the EXPERIMENTAL string,
diff --git a/doc/guides/contributing/coding_style.rst b/doc/guides/contributing/coding_style.rst
index 1ebc79ca3c..43e27bbd0a 100644
--- a/doc/guides/contributing/coding_style.rst
+++ b/doc/guides/contributing/coding_style.rst
@@ -1018,13 +1018,6 @@ name
 	sources are stored in a directory ``lib/xyz``, this value should
 	never be needed for new libraries.
 
-.. note::
-
-	The name value also provides the name used to find the function version
-	map file, as part of the build process, so if the directory name and
-	library names differ, the ``version.map`` file should be named
-	consistently with the library, not the directory
-
 objs
 	**Default Value = []**.
 	This variable can be used to pass to the library build some pre-built
diff --git a/doc/guides/contributing/img/patch_cheatsheet.svg b/doc/guides/contributing/img/patch_cheatsheet.svg
index 4debb07b98..a06d8a2a3b 100644
--- a/doc/guides/contributing/img/patch_cheatsheet.svg
+++ b/doc/guides/contributing/img/patch_cheatsheet.svg
@@ -1,18 +1,18 @@
 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 <svg
-   xmlns:dc="http://purl.org/dc/elements/1.1/"
-   xmlns:cc="http://creativecommons.org/ns#"
-   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
-   xmlns:svg="http://www.w3.org/2000/svg"
-   xmlns="http://www.w3.org/2000/svg"
-   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
-   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
    version="1.1"
    width="210mm"
    height="297mm"
    id="svg2985"
-   inkscape:version="1.0.1 (3bc2e813f5, 2020-09-07)"
-   sodipodi:docname="patch_cheatsheet.svg">
+   inkscape:version="1.4 (e7c3feb100, 2024-10-09)"
+   sodipodi:docname="patch_cheatsheet.svg"
+   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
+   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
+   xmlns="http://www.w3.org/2000/svg"
+   xmlns:svg="http://www.w3.org/2000/svg"
+   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+   xmlns:cc="http://creativecommons.org/ns#"
+   xmlns:dc="http://purl.org/dc/elements/1.1/">
   <sodipodi:namedview
      pagecolor="#ffffff"
      bordercolor="#666666"
@@ -23,18 +23,22 @@
      inkscape:pageopacity="0"
      inkscape:pageshadow="2"
      inkscape:window-width="1920"
-     inkscape:window-height="1017"
+     inkscape:window-height="975"
      id="namedview274"
      showgrid="false"
      inkscape:zoom="0.89702958"
-     inkscape:cx="246.07409"
-     inkscape:cy="416.76022"
-     inkscape:window-x="1072"
-     inkscape:window-y="-8"
+     inkscape:cx="546.24732"
+     inkscape:cy="385.71749"
+     inkscape:window-x="0"
+     inkscape:window-y="0"
      inkscape:window-maximized="1"
      inkscape:current-layer="layer1"
      inkscape:document-rotation="0"
-     inkscape:snap-grids="false" />
+     inkscape:snap-grids="false"
+     inkscape:showpageshadow="2"
+     inkscape:pagecheckerboard="0"
+     inkscape:deskcolor="#d1d1d1"
+     inkscape:document-units="mm" />
   <defs
      id="defs3">
     <linearGradient
@@ -906,7 +910,7 @@
              style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:11.5613px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start"
              id="tspan4092-8-7-6-9-7"
              y="855.79816"
-             x="460.18405">****</tspan></text>
+             x="460.18405">***</tspan></text>
       </g>
     </g>
     <text
@@ -1132,161 +1136,126 @@
            id="tspan4092-8-6-3-1-8-4-4-55-7"
            style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">*</tspan></text>
     </g>
+    <text
+       x="424.10629"
+       y="363.21423"
+       id="text4090-8"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="424.10629"
+         y="363.21423"
+         id="tspan4092-8"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ Rebase to git  </tspan></text>
+    <text
+       x="424.10629"
+       y="393.60123"
+       id="text4090-8-5"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="424.10629"
+         y="393.60123"
+         id="tspan4092-8-5"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ Checkpatch </tspan></text>
+    <text
+       x="424.10629"
+       y="424.20575"
+       id="text4090-8-5-6"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="424.10629"
+         y="424.20575"
+         id="tspan4092-8-5-5"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ ABI breakage </tspan></text>
+    <text
+       x="424.10629"
+       y="453.10339"
+       id="text4090-8-5-6-9-4"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="424.10629"
+         y="453.10339"
+         id="tspan4092-8-5-5-3-4"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ Maintainers file</tspan></text>
+    <text
+       x="424.10629"
+       y="514.09497"
+       id="text4090-8-5-6-9-4-6"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="424.10629"
+         y="514.09497"
+         id="tspan4092-8-5-5-3-4-0"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ Release notes</tspan></text>
+    <text
+       x="425.12708"
+       y="544.91718"
+       id="text4090-8-5-6-9-4-6-6"
+       xml:space="preserve"
+       style="font-style:normal;font-weight:normal;font-size:40.4213px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1.01053"
+       transform="scale(1.0105317,0.98957807)"><tspan
+         x="425.12708"
+         y="544.91718"
+         id="tspan4092-8-5-5-3-4-0-6"
+         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21.2212px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:1.01053">+ Documentation</tspan></text>
     <g
-       transform="translate(1.0962334,-2.7492248)"
-       id="g3605">
-      <text
-         x="42.176418"
-         y="1020.4383"
-         id="text4090-8-7-8-7-6-3-8-4"
-         xml:space="preserve"
-         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:0%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none"><tspan
-           x="42.176418"
-           y="1020.4383"
-           id="tspan4092-8-6-3-1-8-4-4-55"
-           style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:11px;line-height:125%;font-family:monospace;-inkscape-font-specification:Monospace;text-align:start;writing-mode:lr-tb;text-anchor:start">The version.map function names must be in alphabetical order.</tspan></text>
-      <text
-         x="30.942892"
-         y="1024.2014"
-         id="text4090-8-7-8-7-6-3-8-4-1-5"
-         xml:space="preserve"
-         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:0%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none"><tspan
-           x="30.942892"
-           y="1024.2014"
-           id="tspan4092-8-6-3-1-8-4-4-55-7-2"
-           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">*</tspan></text>
-      <text
-         x="25.247679"
-         y="1024.2014"
-         id="text4090-8-7-8-7-6-3-8-4-1-5-6"
-         xml:space="preserve"
-         style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:0%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none"><tspan
-           x="25.247679"
-           y="1024.2014"
-           id="tspan4092-8-6-3-1-8-4-4-55-7-2-8"
-           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">*</tspan></text>
-    </g>
-    <g
-       transform="matrix(1.0211743,0,0,1,25.427515,-30.749225)"
-       id="g3275">
+       transform="matrix(1.0211743,0,0,1,25.427515,-31.583927)"
+       id="g3334">
       <g
-         id="g3341">
+         id="g3267"
+         transform="translate(-13.517932,3.1531035)">
         <text
-           x="394.78601"
-           y="390.17807"
-           id="text4090-8"
-           xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="394.78601"
-             y="390.17807"
-             id="tspan4092-8"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Rebase to git  </tspan></text>
-        <text
-           x="394.78601"
-           y="420.24835"
-           id="text4090-8-5"
+           x="660.46729"
+           y="468.01297"
+           id="text4090-8-1-8-9-1-4-1"
            xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="394.78601"
-             y="420.24835"
-             id="tspan4092-8-5"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Checkpatch </tspan></text>
-        <text
-           x="394.78601"
-           y="450.53394"
-           id="text4090-8-5-6"
-           xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="394.78601"
-             y="450.53394"
-             id="tspan4092-8-5-5"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ ABI breakage </tspan></text>
-        <text
+           style="font-style:normal;font-weight:normal;font-size:25.6917px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
+             x="660.46729"
+             y="468.01297"
+             id="tspan4092-8-7-6-9-7-0-7"
+             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:11.5613px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start" /></text>
+      </g>
+      <text
+         x="394.78601"
+         y="483.59955"
+         id="text4090-8-5-6-9"
+         xml:space="preserve"
+         style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
            x="394.78601"
-           y="513.13031"
-           id="text4090-8-5-6-9-4"
-           xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="394.78601"
-             y="513.13031"
-             id="tspan4092-8-5-5-3-4"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Maintainers file</tspan></text>
-        <text
+           y="483.59955"
+           id="tspan4092-8-5-5-3"
+           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start" /></text>
+    </g>
+    <g
+       id="g3428"
+       transform="matrix(1.0211743,0,0,1,25.427515,-63.867847)">
+      <text
+         x="394.78601"
+         y="541.38928"
+         id="text4090-8-5-6-9-4-6-1"
+         xml:space="preserve"
+         style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
            x="394.78601"
-           y="573.48621"
-           id="text4090-8-5-6-9-4-6"
-           xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="394.78601"
-             y="573.48621"
-             id="tspan4092-8-5-5-3-4-0"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Release notes</tspan></text>
+           y="541.38928"
+           id="tspan4092-8-5-5-3-4-0-7"
+           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Doxygen</tspan></text>
+      <g
+         transform="translate(-119.92979,57.949844)"
+         id="g3267-9">
         <text
-           x="395.79617"
-           y="603.98718"
-           id="text4090-8-5-6-9-4-6-6"
+           x="628.93628"
+           y="473.13675"
+           id="text4090-8-1-8-9-1-4-1-4"
            xml:space="preserve"
-           style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-             x="395.79617"
-             y="603.98718"
-             id="tspan4092-8-5-5-3-4-0-6"
-             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Documentation</tspan></text>
-        <g
-           transform="translate(0,-0.83470152)"
-           id="g3334">
-          <g
-             id="g3267"
-             transform="translate(-13.517932,3.1531035)">
-            <text
-               x="660.46729"
-               y="468.01297"
-               id="text4090-8-1-8-9-1-4-1"
-               xml:space="preserve"
-               style="font-style:normal;font-weight:normal;font-size:25.6917px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-                 x="660.46729"
-                 y="468.01297"
-                 id="tspan4092-8-7-6-9-7-0-7"
-                 style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:11.5613px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">**</tspan></text>
-          </g>
-          <text
-             x="394.78601"
-             y="483.59955"
-             id="text4090-8-5-6-9"
-             xml:space="preserve"
-             style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-               x="394.78601"
-               y="483.59955"
-               id="tspan4092-8-5-5-3"
-               style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Update version.map</tspan></text>
-        </g>
-        <g
-           id="g3428"
-           transform="translate(0,0.88137813)">
-          <text
-             x="394.78601"
-             y="541.38928"
-             id="text4090-8-5-6-9-4-6-1"
-             xml:space="preserve"
-             style="font-style:normal;font-weight:normal;font-size:40px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-               x="394.78601"
-               y="541.38928"
-               id="tspan4092-8-5-5-3-4-0-7"
-               style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+ Doxygen</tspan></text>
-          <g
-             transform="translate(-119.92979,57.949844)"
-             id="g3267-9">
-            <text
-               x="628.93628"
-               y="473.13675"
-               id="text4090-8-1-8-9-1-4-1-4"
-               xml:space="preserve"
-               style="font-style:normal;font-weight:normal;font-size:25.6917px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
-                 x="628.93628"
-                 y="473.13675"
-                 id="tspan4092-8-7-6-9-7-0-7-8"
-                 style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:11.5613px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">***</tspan></text>
-          </g>
-        </g>
+           style="font-style:normal;font-weight:normal;font-size:25.6917px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none"><tspan
+             x="628.93628"
+             y="473.13675"
+             id="tspan4092-8-7-6-9-7-0-7-8"
+             style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:11.5613px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">**</tspan></text>
       </g>
     </g>
     <text
@@ -1301,7 +1270,7 @@
          id="tspan4092-8-5-5-3-4-0-6-2-11-0"
          style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:21px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">+</tspan></text>
     <g
-       transform="translate(1.0962334,-2.7492248)"
+       transform="translate(1.0962334,-14.749225)"
        id="g3595">
       <text
          x="30.942892"
@@ -1332,7 +1301,7 @@
            x="19.552465"
            y="1037.0271"
            id="tspan4092-8-6-3-1-8-4-4-55-7-3-9"
-           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">*</tspan></text>
+           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start" /></text>
       <text
          x="42.830166"
          y="1033.2393"
@@ -1345,7 +1314,7 @@
            style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:11px;line-height:125%;font-family:monospace;-inkscape-font-specification:Monospace;text-align:start;writing-mode:lr-tb;text-anchor:start">New header files must get a new page in the API docs.</tspan></text>
     </g>
     <g
-       transform="translate(1.0962334,-2.7492248)"
+       transform="translate(1.0962334,-14.749225)"
        id="g3619">
       <text
          x="42.212418"
@@ -1396,7 +1365,7 @@
            x="14.016749"
            y="1049.8527"
            id="tspan4092-8-6-3-1-8-4-4-55-7-3-9-6-5"
-           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start">*</tspan></text>
+           style="font-style:normal;font-variant:normal;font-weight:300;font-stretch:normal;font-size:13px;line-height:125%;font-family:monospace;-inkscape-font-specification:'Monospace Bold';text-align:start;writing-mode:lr-tb;text-anchor:start" /></text>
     </g>
     <rect
        width="196.44218"
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index d21ee288b2..8ad6b6e715 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -160,9 +160,9 @@ Make your planned changes in the cloned ``dpdk`` repo. Here are some guidelines
 
   * For other PMDs and more info, refer to the ``MAINTAINERS`` file.
 
-* New external functions should be added to the local ``version.map`` file. See
-  the :doc:`ABI policy <abi_policy>` and :ref:`ABI versioning <abi_versioning>`
-  guides. New external functions should also be added in alphabetical order.
+* New external functions should be exported.
+  See the :doc:`ABI policy <abi_policy>` and :ref:`ABI versioning <abi_versioning>`
+  guides.
 
 * Any new API function should be used in ``/app`` test directory.
 
diff --git a/drivers/meson.build b/drivers/meson.build
index c8bc556f1a..6904d34eee 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -5,8 +5,6 @@ if is_ms_compiler
     subdir_done()
 endif
 
-fs = import('fs')
-
 # Defines the order of dependencies evaluation
 subdirs = [
         'common',
@@ -260,59 +258,27 @@ foreach subpath:subdirs
                 install: true)
 
         # now build the shared driver
-        version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), drv_path)
-
-        if not fs.is_file(version_map)
-            if is_ms_linker
-                link_mode = 'msvc'
-            elif is_windows
-                link_mode = 'mingw'
-            else
-                link_mode = 'gnu'
-            endif
-            version_map = custom_target(lib_name + '_map',
-                    command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
-                    input: sources,
-                    output: 'lib@0@_exports.map'.format(lib_name))
-            version_map_path = version_map.full_path()
-            version_map_dep = [version_map]
-            lk_deps = [version_map]
-
-            if is_ms_linker
-                if is_ms_compiler
-                    lk_args = ['/def:' + version_map.full_path()]
-                else
-                    lk_args = ['-Wl,/def:' + version_map.full_path()]
-                endif
-            else
-                lk_args = ['-Wl,--version-script=' + version_map.full_path()]
-            endif
+        if is_ms_linker
+            link_mode = 'msvc'
+        elif is_windows
+            link_mode = 'mingw'
         else
-            version_map_path = version_map
-            version_map_dep = []
-            lk_deps = [version_map]
-
-            if is_windows
-                if is_ms_linker
-                    def_file = custom_target(lib_name + '_def',
-                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                            input: version_map,
-                            output: '@0@_exports.def'.format(lib_name))
-                    lk_deps += [def_file]
-
-                    lk_args = ['-Wl,/def:' + def_file.full_path()]
-                else
-                    mingw_map = custom_target(lib_name + '_mingw',
-                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                            input: version_map,
-                            output: '@0@_mingw.map'.format(lib_name))
-                    lk_deps += [mingw_map]
-
-                    lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
-                endif
+            link_mode = 'gnu'
+        endif
+        version_map = custom_target(lib_name + '_map',
+                command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+                input: sources,
+                output: 'lib@0@_exports.map'.format(lib_name))
+        lk_deps = [version_map]
+
+        if is_ms_linker
+            if is_ms_compiler
+                lk_args = ['/def:' + version_map.full_path()]
             else
-                lk_args = ['-Wl,--version-script=' + version_map]
+                lk_args = ['-Wl,/def:' + version_map.full_path()]
             endif
+        else
+            lk_args = ['-Wl,--version-script=' + version_map.full_path()]
         endif
 
         if not is_windows and developer_mode
@@ -320,11 +286,11 @@ foreach subpath:subdirs
             # check-symbols.sh script, using it as a
             # dependency of the .so build
             lk_deps += custom_target(lib_name + '.sym_chk',
-                    command: [check_symbols, version_map_path, '@INPUT@'],
+                    command: [check_symbols, version_map.full_path(), '@INPUT@'],
                     capture: true,
                     input: static_lib,
                     output: lib_name + '.sym_chk',
-                    depends: version_map_dep)
+                    depends: [version_map])
         endif
 
         shared_lib = shared_library(lib_name, sources_pmd_info,
diff --git a/lib/meson.build b/lib/meson.build
index b6bac02b48..f143bc202b 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -1,7 +1,6 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2019 Intel Corporation
 
-fs = import('fs')
 
 # process all libraries equally, as far as possible
 # "core" libs first, then others alphabetically as far as possible
@@ -255,61 +254,27 @@ foreach l:libraries
             include_directories: includes,
             dependencies: static_deps)
 
-    if not fs.is_file('@0@/@1@/version.map'.format(meson.current_source_dir(), l))
-        if is_ms_linker
-            link_mode = 'msvc'
-        elif is_windows
-            link_mode = 'mingw'
-        else
-            link_mode = 'gnu'
-        endif
-        version_map = custom_target(libname + '_map',
-                command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
-                input: sources,
-                output: 'lib@0@_exports.map'.format(libname))
-        version_map_path = version_map.full_path()
-        version_map_dep = [version_map]
-        lk_deps = [version_map]
-
-        if is_ms_linker
-            if is_ms_compiler
-                lk_args = ['/def:' + version_map.full_path()]
-            else
-                lk_args = ['-Wl,/def:' + version_map.full_path()]
-            endif
-        else
-            lk_args = ['-Wl,--version-script=' + version_map.full_path()]
-        endif
+    if is_ms_linker
+        link_mode = 'msvc'
+    elif is_windows
+        link_mode = 'mingw'
     else
-        version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
-        version_map_path = version_map
-        version_map_dep = []
-        lk_deps = [version_map]
-        if is_ms_linker
-            def_file = custom_target(libname + '_def',
-                    command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                    input: version_map,
-                    output: '@0@_exports.def'.format(libname))
-            lk_deps += [def_file]
+        link_mode = 'gnu'
+    endif
+    version_map = custom_target(libname + '_map',
+            command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+            input: sources,
+            output: 'lib@0@_exports.map'.format(libname))
+    lk_deps = [version_map]
 
-            if is_ms_compiler
-                lk_args = ['/def:' + def_file.full_path()]
-            else
-                lk_args = ['-Wl,/def:' + def_file.full_path()]
-            endif
+    if is_ms_linker
+        if is_ms_compiler
+            lk_args = ['/def:' + version_map.full_path()]
         else
-            if is_windows
-                mingw_map = custom_target(libname + '_mingw',
-                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                        input: version_map,
-                        output: '@0@_mingw.map'.format(libname))
-                lk_deps += [mingw_map]
-
-                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
-            else
-                lk_args = ['-Wl,--version-script=' + version_map]
-            endif
+            lk_args = ['-Wl,/def:' + version_map.full_path()]
         endif
+    else
+        lk_args = ['-Wl,--version-script=' + version_map.full_path()]
     endif
 
     if developer_mode and not is_windows
@@ -317,11 +282,11 @@ foreach l:libraries
         # check-symbols.sh script, using it as a
         # dependency of the .so build
         lk_deps += custom_target(name + '.sym_chk',
-                command: [check_symbols, version_map_path, '@INPUT@'],
+                command: [check_symbols, version_map.full_path(), '@INPUT@'],
                 capture: true,
                 input: static_lib,
                 output: name + '.sym_chk',
-                depends: version_map_dep)
+                depends: [version_map])
     endif
 
     if not use_function_versioning or is_windows
-- 
2.48.1


^ permalink raw reply	[relevance 16%]

* [RFC v3 5/8] build: generate symbol maps
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
  2025-03-11  9:56 13%   ` [RFC v3 3/8] eal: rework function versioning macros David Marchand
@ 2025-03-11  9:56 18%   ` David Marchand
  2025-03-13 17:26  0%     ` Bruce Richardson
  2025-03-14 15:27  0%     ` Andre Muezerie
  2025-03-11  9:56 16%   ` [RFC v3 7/8] build: use dynamically generated version maps David Marchand
  2025-03-11 10:18  3%   ` [RFC v3 0/8] Symbol versioning and export rework Morten Brørup
  3 siblings, 2 replies; 200+ results
From: David Marchand @ 2025-03-11  9:56 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue

Rather than maintain a file in parallel with the code, symbols to be
exported can be marked with a token RTE_EXPORT_*SYMBOL.

From those marks, the build framework generates map files only for
symbols actually compiled (which means that the WINDOWS_NO_EXPORT hack
becomes unnecessary).

The build framework directly creates a map file in the format that the
linker expects (rather than converting from GNU linker to MSVC linker).

Empty maps are allowed again as a replacement for drivers/version.map.

The symbol check is updated to only support the new format.
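
For a sense of how the scanning works, here is a minimal sketch: the
regexes are condensed from the gen-version-map.py script in this patch,
while the C source lines, symbol names and version numbers are made up
for illustration.

```python
import re

# Condensed from buildtools/gen-version-map.py: each exported symbol is
# recognised from a one-line RTE_EXPORT_* annotation in the C sources.
export_exp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+\.[0-9]+)\)")
export_int = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
export_stable = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")

# Hypothetical C source lines carrying the marks.
source = [
    "RTE_EXPORT_SYMBOL(rte_acl_create)",
    "RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_acl_probe, 25.03)",
    "RTE_EXPORT_INTERNAL_SYMBOL(rte_acl_internal_reset)",
]

# Group symbols per version node, keeping the "added in" comment for
# experimental symbols, as the generated map file does.
nodes = {}
for ln in source:
    if m := export_exp.match(ln):
        nodes.setdefault("EXPERIMENTAL", {})[m.group(1)] = f" # added in {m.group(2)}"
    elif m := export_int.match(ln):
        nodes.setdefault("INTERNAL", {})[m.group(1)] = None
    elif m := export_stable.match(ln):
        nodes.setdefault("DPDK_25", {})[m.group(1)] = None

print(nodes)
```

Because the macros expand to nothing at compile time (see
config/rte_export.h below), the annotation carries no runtime cost: only
the build framework consumes it.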

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since RFC v2:
- because of MSVC limitations wrt macro passed via cmdline,
  used an internal header for defining RTE_EXPORT_* macros,
- updated documentation and tooling,

---
 MAINTAINERS                                |   2 +
 buildtools/gen-version-map.py              | 111 ++++++++++
 buildtools/map-list-symbol.sh              |  10 +-
 buildtools/meson.build                     |   1 +
 config/meson.build                         |   2 +
 config/rte_export.h                        |  16 ++
 devtools/check-symbol-change.py            |  90 +++++++++
 devtools/check-symbol-maps.sh              |  14 --
 devtools/checkpatches.sh                   |   2 +-
 doc/guides/contributing/abi_versioning.rst | 224 ++-------------------
 drivers/meson.build                        |  94 +++++----
 drivers/version.map                        |   3 -
 lib/meson.build                            |  91 ++++++---
 13 files changed, 371 insertions(+), 289 deletions(-)
 create mode 100755 buildtools/gen-version-map.py
 create mode 100644 config/rte_export.h
 create mode 100755 devtools/check-symbol-change.py
 delete mode 100644 drivers/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 312e6fcee5..04772951d3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -95,6 +95,7 @@ F: devtools/check-maintainers.sh
 F: devtools/check-forbidden-tokens.awk
 F: devtools/check-git-log.sh
 F: devtools/check-spdx-tag.sh
+F: devtools/check-symbol-change.py
 F: devtools/check-symbol-change.sh
 F: devtools/check-symbol-maps.sh
 F: devtools/checkpatches.sh
@@ -127,6 +128,7 @@ F: config/
 F: buildtools/check-symbols.sh
 F: buildtools/chkincs/
 F: buildtools/call-sphinx-build.py
+F: buildtools/gen-version-map.py
 F: buildtools/get-cpu-count.py
 F: buildtools/get-numa-count.py
 F: buildtools/list-dir-globs.py
diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
new file mode 100755
index 0000000000..b160aa828b
--- /dev/null
+++ b/buildtools/gen-version-map.py
@@ -0,0 +1,111 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2024 Red Hat, Inc.
+
+"""Generate a version map file used by GNU or MSVC linker."""
+
+import re
+import sys
+
+# From rte_export.h
+export_exp_sym_regexp = re.compile(r"^RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+), ([0-9]+\.[0-9]+)\)")
+export_int_sym_regexp = re.compile(r"^RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
+export_sym_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^)]+)\)")
+# From rte_function_versioning.h
+ver_sym_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+ver_exp_sym_regexp = re.compile(r"^RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
+default_sym_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+
+with open(sys.argv[2]) as f:
+    abi = 'DPDK_{}'.format(re.match(r"([0-9]+)\.[0-9]", f.readline()).group(1))
+
+symbols = {}
+
+for file in sys.argv[4:]:
+    with open(file, encoding="utf-8") as f:
+        for ln in f.readlines():
+            node = None
+            symbol = None
+            comment = None
+            if export_exp_sym_regexp.match(ln):
+                node = 'EXPERIMENTAL'
+                symbol = export_exp_sym_regexp.match(ln).group(1)
+                comment = ' # added in {}'.format(export_exp_sym_regexp.match(ln).group(2))
+            elif export_int_sym_regexp.match(ln):
+                node = 'INTERNAL'
+                symbol = export_int_sym_regexp.match(ln).group(1)
+            elif export_sym_regexp.match(ln):
+                node = abi
+                symbol = export_sym_regexp.match(ln).group(1)
+            elif ver_sym_regexp.match(ln):
+                node = 'DPDK_{}'.format(ver_sym_regexp.match(ln).group(1))
+                symbol = ver_sym_regexp.match(ln).group(2)
+            elif ver_exp_sym_regexp.match(ln):
+                node = 'EXPERIMENTAL'
+                symbol = ver_exp_sym_regexp.match(ln).group(1)
+            elif default_sym_regexp.match(ln):
+                node = 'DPDK_{}'.format(default_sym_regexp.match(ln).group(1))
+                symbol = default_sym_regexp.match(ln).group(2)
+
+            if not symbol:
+                continue
+
+            if node not in symbols:
+                symbols[node] = {}
+            symbols[node][symbol] = comment
+
+if sys.argv[1] == 'msvc':
+    with open(sys.argv[3], "w") as outfile:
+        outfile.writelines(f"EXPORTS\n")
+        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
+            if key not in symbols:
+                continue
+            for symbol in sorted(symbols[key].keys()):
+                outfile.writelines(f"\t{symbol}\n")
+            del symbols[key]
+else:
+    with open(sys.argv[3], "w") as outfile:
+        local_token = False
+        for key in (abi, 'EXPERIMENTAL', 'INTERNAL'):
+            if key not in symbols:
+                continue
+            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
+            for symbol in sorted(symbols[key].keys()):
+                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
+                    prefix = '__emutls_v.'
+                else:
+                    prefix = ''
+                outfile.writelines(f"\t{prefix}{symbol};")
+                comment = symbols[key][symbol]
+                if comment:
+                    outfile.writelines(f"{comment}")
+                outfile.writelines("\n")
+            outfile.writelines("\n")
+            if not local_token:
+                outfile.writelines("\tlocal: *;\n")
+                local_token = True
+            outfile.writelines("};\n")
+            del symbols[key]
+        for key in sorted(symbols.keys()):
+            outfile.writelines(f"{key} {{\n\tglobal:\n\n")
+            for symbol in sorted(symbols[key].keys()):
+                if sys.argv[1] == 'mingw' and symbol.startswith('per_lcore'):
+                    prefix = '__emutls_v.'
+                else:
+                    prefix = ''
+                outfile.writelines(f"\t{prefix}{symbol};")
+                comment = symbols[key][symbol]
+                if comment:
+                    outfile.writelines(f"{comment}")
+                outfile.writelines("\n")
+            outfile.writelines(f"}} {abi};\n")
+            if not local_token:
+                outfile.writelines("\tlocal: *;\n")
+                local_token = True
+            del symbols[key]
+        # No exported symbol, add a catch all
+        if not local_token:
+            outfile.writelines(f"{abi} {{\n")
+            outfile.writelines("\tlocal: *;\n")
+            local_token = True
+            outfile.writelines("};\n")
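
As a rough illustration of the GNU branch above: the generation reduces
to emitting one version node per ABI level, with a single `local: *;`
catch-all placed in the first node written. The following is a
simplified sketch, not the script itself (symbol names hypothetical):

```python
def format_gnu_map(abi, symbols):
    """Render a GNU linker version script from {node: {symbol: comment}}."""
    out = []
    local_token = False
    for key in (abi, "EXPERIMENTAL", "INTERNAL"):
        if key not in symbols:
            continue
        out.append(f"{key} {{\n\tglobal:\n")
        for sym, comment in sorted(symbols[key].items()):
            out.append(f"\t{sym};{comment or ''}")
        # The catch-all hiding every unlisted symbol goes in the first
        # node only; later nodes inherit the behaviour.
        if not local_token:
            out.append("\n\tlocal: *;")
            local_token = True
        out.append("};\n")
    return "\n".join(out)

symbols = {
    "DPDK_25": {"rte_acl_create": None},
    "EXPERIMENTAL": {"rte_acl_probe": " # added in 25.03"},
}
print(format_gnu_map("DPDK_25", symbols))
```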
diff --git a/buildtools/map-list-symbol.sh b/buildtools/map-list-symbol.sh
index eb98451d8e..0829df4be5 100755
--- a/buildtools/map-list-symbol.sh
+++ b/buildtools/map-list-symbol.sh
@@ -62,10 +62,14 @@ for file in $@; do
 		if (current_section == "") {
 			next;
 		}
+		symbol_version = current_version
+		if (/^[^}].*[^:*]; # added in /) {
+			symbol_version = $5
+		}
 		if ("'$version'" != "") {
-			if ("'$version'" == "unset" && current_version != "") {
+			if ("'$version'" == "unset" && symbol_version != "") {
 				next;
-			} else if ("'$version'" != "unset" && "'$version'" != current_version) {
+			} else if ("'$version'" != "unset" && "'$version'" != symbol_version) {
 				next;
 			}
 		}
@@ -73,7 +77,7 @@ for file in $@; do
 		if ("'$symbol'" == "all" || $1 == "'$symbol'") {
 			ret = 0;
 			if ("'$quiet'" == "") {
-				print "'$file' "current_section" "$1" "current_version;
+				print "'$file' "current_section" "$1" "symbol_version;
 			}
 			if ("'$symbol'" != "all") {
 				exit 0;
diff --git a/buildtools/meson.build b/buildtools/meson.build
index 4e2c1217a2..b745e9afa4 100644
--- a/buildtools/meson.build
+++ b/buildtools/meson.build
@@ -16,6 +16,7 @@ else
     py3 = ['meson', 'runpython']
 endif
 echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
+gen_version_map = py3 + files('gen-version-map.py')
 list_dir_globs = py3 + files('list-dir-globs.py')
 map_to_win_cmd = py3 + files('map_to_win.py')
 sphinx_wrapper = py3 + files('call-sphinx-build.py')
diff --git a/config/meson.build b/config/meson.build
index f31fef216c..54657055fb 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -303,8 +303,10 @@ endif
 # add -include rte_config to cflags
 if is_ms_compiler
     add_project_arguments('/FI', 'rte_config.h', language: 'c')
+    add_project_arguments('/FI', 'rte_export.h', language: 'c')
 else
     add_project_arguments('-include', 'rte_config.h', language: 'c')
+    add_project_arguments('-include', 'rte_export.h', language: 'c')
 endif
 
 # enable extra warnings and disable any unwanted warnings
diff --git a/config/rte_export.h b/config/rte_export.h
new file mode 100644
index 0000000000..83d871fe11
--- /dev/null
+++ b/config/rte_export.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2025 Red Hat, Inc.
+ */
+
+#ifndef RTE_EXPORT_H
+#define RTE_EXPORT_H
+
+/* *Internal* macros for exporting symbols, used by the build system.
+ * For RTE_EXPORT_EXPERIMENTAL_SYMBOL, ver indicates the
+ * version this symbol was introduced in.
+ */
+#define RTE_EXPORT_EXPERIMENTAL_SYMBOL(a, ver)
+#define RTE_EXPORT_INTERNAL_SYMBOL(a)
+#define RTE_EXPORT_SYMBOL(a)
+
+#endif /* RTE_EXPORT_H */
diff --git a/devtools/check-symbol-change.py b/devtools/check-symbol-change.py
new file mode 100755
index 0000000000..09709e4f06
--- /dev/null
+++ b/devtools/check-symbol-change.py
@@ -0,0 +1,90 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2025 Red Hat, Inc.
+
+"""Check exported symbols change in a patch."""
+
+import re
+import sys
+
+file_header_regexp = re.compile(r"^(\-\-\-|\+\+\+) [ab]/(lib|drivers)/([^/]+)/([^/]+)")
+# From rte_export.h
+export_exp_sym_regexp = re.compile(r"^.RTE_EXPORT_EXPERIMENTAL_SYMBOL\(([^,]+),")
+export_int_sym_regexp = re.compile(r"^.RTE_EXPORT_INTERNAL_SYMBOL\(([^)]+)\)")
+export_sym_regexp = re.compile(r"^.RTE_EXPORT_SYMBOL\(([^)]+)\)")
+# TODO, handle versioned symbols from rte_function_versioning.h
+# ver_sym_regexp = re.compile(r"^.RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+# ver_exp_sym_regexp = re.compile(r"^.RTE_VERSION_EXPERIMENTAL_SYMBOL\([^,]+, ([^,]+),")
+# default_sym_regexp = re.compile(r"^.RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+
+symbols = {}
+
+for file in sys.argv[1:]:
+    with open(file, encoding="utf-8") as f:
+        for ln in f.readlines():
+            if file_header_regexp.match(ln):
+                if file_header_regexp.match(ln).group(2) == "lib":
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
+                elif file_header_regexp.match(ln).group(4) == "intel":
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3, 4))
+                else:
+                    lib = '/'.join(file_header_regexp.match(ln).group(2, 3))
+
+                if lib not in symbols:
+                    symbols[lib] = {}
+                continue
+
+            if export_exp_sym_regexp.match(ln):
+                symbol = export_exp_sym_regexp.match(ln).group(1)
+                node = 'EXPERIMENTAL'
+            elif export_int_sym_regexp.match(ln):
+                node = 'INTERNAL'
+                symbol = export_int_sym_regexp.match(ln).group(1)
+            elif export_sym_regexp.match(ln):
+                symbol = export_sym_regexp.match(ln).group(1)
+                node = 'stable'
+            else:
+                continue
+
+            if symbol not in symbols[lib]:
+                symbols[lib][symbol] = {}
+            added = ln[0] == '+'
+            if added and 'added' in symbols[lib][symbol] and node != symbols[lib][symbol]['added']:
+                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
+            if not added and 'removed' in symbols[lib][symbol] and node != symbols[lib][symbol]['removed']:
+                print(f"{symbol} in {lib} was found in multiple ABI, please check.")
+            if added:
+                symbols[lib][symbol]['added'] = node
+            else:
+                symbols[lib][symbol]['removed'] = node
+
+    for lib in sorted(symbols.keys()):
+        error = False
+        for symbol in sorted(symbols[lib].keys()):
+            if 'removed' not in symbols[lib][symbol]:
+                # Symbol addition
+                node = symbols[lib][symbol]['added']
+                if node == 'stable':
+                    print(f"ERROR: {symbol} in {lib} has been added directly to stable ABI.")
+                    error = True
+                else:
+                    print(f"INFO: {symbol} in {lib} has been added to {node} ABI.")
+                continue
+
+            if 'added' not in symbols[lib][symbol]:
+                # Symbol removal
+                node = symbols[lib][symbol]['removed']
+                if node == 'stable':
+                    print(f"INFO: {symbol} in {lib} has been removed from stable ABI.")
+                    print("Please check it has gone through the deprecation process.")
+                continue
+
+            if symbols[lib][symbol]['added'] == symbols[lib][symbol]['removed']:
+                # Symbol was moved around
+                continue
+
+            # Symbol modifications
+            added = symbols[lib][symbol]['added']
+            removed = symbols[lib][symbol]['removed']
+            print(f"INFO: {symbol} in {lib} is moving from {removed} to {added}")
+            print("Please check it has gone through the deprecation process.")
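
The per-symbol bookkeeping above boils down to pairing `+` and `-`
occurrences of a mark and classifying the transition; a condensed sketch
of that decision table (inputs hypothetical, messages abbreviated):

```python
def classify(changes):
    """Classify one symbol's export changes in a patch.

    changes: iterable of (added, node) pairs, where added is True for a
    '+' diff line and False for a '-' line, and node is the ABI node the
    RTE_EXPORT_* mark maps to ('stable', 'EXPERIMENTAL' or 'INTERNAL').
    """
    state = {}
    for added, node in changes:
        state["added" if added else "removed"] = node
    if "removed" not in state:
        return f"added to {state['added']} ABI"
    if "added" not in state:
        return f"removed from {state['removed']} ABI"
    if state["added"] == state["removed"]:
        # Same node on both sides: the mark merely moved between files.
        return "moved around"
    return f"moving from {state['removed']} to {state['added']}"

# A symbol promoted from EXPERIMENTAL to stable in the same patch.
print(classify([(False, "EXPERIMENTAL"), (True, "stable")]))
```

Promotions and removals are only reported for review here; enforcing the
deprecation process remains a human check.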
diff --git a/devtools/check-symbol-maps.sh b/devtools/check-symbol-maps.sh
index 6121f78ec6..fcd3931e5d 100755
--- a/devtools/check-symbol-maps.sh
+++ b/devtools/check-symbol-maps.sh
@@ -60,20 +60,6 @@ if [ -n "$local_miss_maps" ] ; then
     ret=1
 fi
 
-find_empty_maps ()
-{
-    for map in $@ ; do
-        [ $(buildtools/map-list-symbol.sh $map | wc -l) != '0' ] || echo $map
-    done
-}
-
-empty_maps=$(find_empty_maps $@)
-if [ -n "$empty_maps" ] ; then
-    echo "Found empty maps:"
-    echo "$empty_maps"
-    ret=1
-fi
-
 find_bad_format_maps ()
 {
     abi_version=$(cut -d'.' -f 1 ABI_VERSION)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 003bb49e04..7dcac7c8c9 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -33,7 +33,7 @@ VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,STRLCPY,\
 PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
 SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
 LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
-NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG"
+NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL"
 options="$options $DPDK_CHECKPATCH_OPTIONS"
 
 print_usage () {
diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 88dd776b4c..addbb24b9e 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -58,12 +58,12 @@ persists over multiple releases.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/librte_acl_exports.map
  DPDK_21 {
         global:
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/librte_eal_exports.map
  DPDK_21 {
         global:
  ...
@@ -77,7 +77,7 @@ that library.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/librte_acl_exports.map
  DPDK_21 {
         global:
  ...
@@ -88,7 +88,7 @@ that library.
  } DPDK_21;
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/librte_eal_exports.map
  DPDK_21 {
         global:
  ...
@@ -100,12 +100,12 @@ how this may be done.
 
 .. code-block:: none
 
- $ head ./lib/acl/version.map
+ $ head ./build/lib/librte_acl_exports.map
  DPDK_22 {
         global:
  ...
 
- $ head ./lib/eal/version.map
+ $ head ./build/lib/librte_eal_exports.map
  DPDK_22 {
         global:
  ...
@@ -134,8 +134,7 @@ linked to the DPDK.
 
 To support backward compatibility the ``rte_function_versioning.h``
 header file provides macros to use when updating exported functions. These
-macros are used in conjunction with the ``version.map`` file for
-a given library to allow multiple versions of a symbol to exist in a shared
+macros allow multiple versions of a symbol to exist in a shared
 library so that older binaries need not be immediately recompiled.
 
 The macros are:
@@ -169,6 +168,7 @@ Assume we have a function as follows
   * Create an acl context object for apps to
   * manipulate
   */
+ RTE_EXPORT_SYMBOL(rte_acl_create)
  struct rte_acl_ctx *
  rte_acl_create(const struct rte_acl_param *param)
  {
@@ -187,6 +187,7 @@ private, is safe), but it also requires modifying the code as follows
   * Create an acl context object for apps to
   * manipulate
   */
+ RTE_EXPORT_SYMBOL(rte_acl_create)
  struct rte_acl_ctx *
  rte_acl_create(const struct rte_acl_param *param, int debug)
  {
@@ -203,78 +204,16 @@ The addition of a parameter to the function is ABI breaking as the function is
 public, and existing application may use it in its current form. However, the
 compatibility macros in DPDK allow a developer to use symbol versioning so that
 multiple functions can be mapped to the same public symbol based on when an
-application was linked to it. To see how this is done, we start with the
-requisite libraries version map file. Initially the version map file for the acl
-library looks like this
+application was linked to it.
 
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_create;
-        rte_acl_dump;
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-This file needs to be modified as follows
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_create;
-        rte_acl_dump;
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-        rte_acl_create;
-
-   } DPDK_21;
-
-The addition of the new block tells the linker that a new version node
-``DPDK_22`` is available, which contains the symbol rte_acl_create, and inherits
-the symbols from the DPDK_21 node. This list is directly translated into a
-list of exported symbols when DPDK is compiled as a shared library.
-
-Next, we need to specify in the code which function maps to the rte_acl_create
+We need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
 we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
-the function return type, and the function name and its arguments.
+the function return type, the function name and its arguments.
 
 .. code-block:: c
 
+ -RTE_EXPORT_SYMBOL(rte_acl_create)
  -struct rte_acl_ctx *
  -rte_acl_create(const struct rte_acl_param *param)
  +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
@@ -293,6 +232,7 @@ We have now mapped the original rte_acl_create symbol to the original function
 
 Please see the section :ref:`Enabling versioning macros
 <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
+
 Next, we need to create the new version of the symbol. We create a new
 function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
 
@@ -312,9 +252,9 @@ The macro instructs the linker to create the new default symbol
 ``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
 (declared by the macro).
 
-And that's it, on the next shared library rebuild, there will be two versions of
-rte_acl_create, an old DPDK_21 version, used by previously built applications,
-and a new DPDK_22 version, used by future built applications.
+And that's it. On the next shared library rebuild, there will be two versions of rte_acl_create,
+an old DPDK_21 version, used by previously built applications, and a new DPDK_22 version,
+used by future built applications.
 
 .. note::
 
@@ -364,6 +304,7 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
     * Create an acl context object for apps to
     * manipulate
     */
+   RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_acl_create, 21.02)
    __rte_experimental
    struct rte_acl_ctx *
    rte_acl_create(const struct rte_acl_param *param)
@@ -371,27 +312,8 @@ Assume we have an experimental function ``rte_acl_create`` as follows:
    ...
    }
 
-In the map file, experimental symbols are listed as part of the ``EXPERIMENTAL``
-version node.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   EXPERIMENTAL {
-        global:
-
-        rte_acl_create;
-   };
-
 When we promote the symbol to the stable ABI, we simply strip the
-``__rte_experimental`` annotation from the function and move the symbol from the
-``EXPERIMENTAL`` node, to the node of the next major ABI version as follow.
+``__rte_experimental`` annotation from the function and export it with ``RTE_EXPORT_SYMBOL``.
 
 .. code-block:: c
 
@@ -399,31 +321,13 @@ When we promote the symbol to the stable ABI, we simply strip the
     * Create an acl context object for apps to
     * manipulate
     */
+   RTE_EXPORT_SYMBOL(rte_acl_create)
    struct rte_acl_ctx *
    rte_acl_create(const struct rte_acl_param *param)
    {
           ...
    }
 
-We then update the map file, adding the symbol ``rte_acl_create``
-to the ``DPDK_22`` version node.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-
-        rte_acl_create;
-   } DPDK_21;
-
-
 Although there are strictly no guarantees or commitments associated with
 :ref:`experimental symbols <experimental_apis>`, a maintainer may wish to offer
 an alias to experimental. The process to add an alias to experimental,
@@ -452,30 +356,6 @@ and ``DPDK_22`` version nodes.
       return rte_acl_create(param);
    }
 
-In the map file, we map the symbol to both the ``EXPERIMENTAL``
-and ``DPDK_22`` version nodes.
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-        ...
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-
-        rte_acl_create;
-   } DPDK_21;
-
-   EXPERIMENTAL {
-        global:
-
-        rte_acl_create;
-   };
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
@@ -484,38 +364,7 @@ ________________________________
 Lets assume that you've done the above updates, and in preparation for the next
 major ABI version you decide you would like to retire the old version of the
 function. After having gone through the ABI deprecation announcement process,
-removal is easy. Start by removing the symbol from the requisite version map
-file:
-
-.. code-block:: none
-
-   DPDK_21 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_dump;
- -      rte_acl_create
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
-   };
-
-   DPDK_22 {
-        global:
-        rte_acl_create;
-   } DPDK_21;
-
+removal is easy.
 
 Next remove the corresponding versioned export.
 
@@ -539,36 +388,7 @@ of a major ABI version. If a version node completely specifies an API, then
 removing part of it, typically makes it incomplete. In those cases it is better
 to remove the entire node.
 
-To do this, start by modifying the version map file, such that all symbols from
-the node to be removed are merged into the next node in the map.
-
-In the case of our map above, it would transform to look as follows
-
-.. code-block:: none
-
-   DPDK_22 {
-        global:
-
-        rte_acl_add_rules;
-        rte_acl_build;
-        rte_acl_classify;
-        rte_acl_classify_alg;
-        rte_acl_classify_scalar;
-        rte_acl_dump;
-        rte_acl_create
-        rte_acl_find_existing;
-        rte_acl_free;
-        rte_acl_ipv4vlan_add_rules;
-        rte_acl_ipv4vlan_build;
-        rte_acl_list_dump;
-        rte_acl_reset;
-        rte_acl_reset_rules;
-        rte_acl_set_ctx_classify;
-
-        local: *;
- };
-
-Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
+Any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
 updated to point to the new version node in any header files for all affected
 symbols.
 
diff --git a/drivers/meson.build b/drivers/meson.build
index 05391a575d..c8bc556f1a 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -245,14 +245,14 @@ foreach subpath:subdirs
                 dependencies: static_deps,
                 c_args: cflags)
         objs += tmp_lib.extract_all_objects(recursive: true)
-        sources = custom_target(out_filename,
+        sources_pmd_info = custom_target(out_filename,
                 command: [pmdinfo, tmp_lib.full_path(), '@OUTPUT@', pmdinfogen],
                 output: out_filename,
                 depends: [tmp_lib])
 
         # now build the static driver
         static_lib = static_library(lib_name,
-                sources,
+                sources_pmd_info,
                 objects: objs,
                 include_directories: includes,
                 dependencies: static_deps,
@@ -262,48 +262,72 @@ foreach subpath:subdirs
         # now build the shared driver
         version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), drv_path)
 
-        lk_deps = []
-        lk_args = []
         if not fs.is_file(version_map)
-            version_map = '@0@/version.map'.format(meson.current_source_dir())
-            lk_deps += [version_map]
-        else
-            lk_deps += [version_map]
-            if not is_windows and developer_mode
-                # on unix systems check the output of the
-                # check-symbols.sh script, using it as a
-                # dependency of the .so build
-                lk_deps += custom_target(lib_name + '.sym_chk',
-                        command: [check_symbols, version_map, '@INPUT@'],
-                        capture: true,
-                        input: static_lib,
-                        output: lib_name + '.sym_chk')
+            if is_ms_linker
+                link_mode = 'msvc'
+            elif is_windows
+                link_mode = 'mingw'
+            else
+                link_mode = 'gnu'
             endif
-        endif
+            version_map = custom_target(lib_name + '_map',
+                    command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+                    input: sources,
+                    output: 'lib@0@_exports.map'.format(lib_name))
+            version_map_path = version_map.full_path()
+            version_map_dep = [version_map]
+            lk_deps = [version_map]
 
-        if is_windows
             if is_ms_linker
-                def_file = custom_target(lib_name + '_def',
-                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                        input: version_map,
-                        output: '@0@_exports.def'.format(lib_name))
-                lk_deps += [def_file]
-
-                lk_args = ['-Wl,/def:' + def_file.full_path()]
+                if is_ms_compiler
+                    lk_args = ['/def:' + version_map.full_path()]
+                else
+                    lk_args = ['-Wl,/def:' + version_map.full_path()]
+                endif
             else
-                mingw_map = custom_target(lib_name + '_mingw',
-                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                        input: version_map,
-                        output: '@0@_mingw.map'.format(lib_name))
-                lk_deps += [mingw_map]
-
-                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+                lk_args = ['-Wl,--version-script=' + version_map.full_path()]
             endif
         else
-            lk_args = ['-Wl,--version-script=' + version_map]
+            version_map_path = version_map
+            version_map_dep = []
+            lk_deps = [version_map]
+
+            if is_windows
+                if is_ms_linker
+                    def_file = custom_target(lib_name + '_def',
+                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                            input: version_map,
+                            output: '@0@_exports.def'.format(lib_name))
+                    lk_deps += [def_file]
+
+                    lk_args = ['-Wl,/def:' + def_file.full_path()]
+                else
+                    mingw_map = custom_target(lib_name + '_mingw',
+                            command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                            input: version_map,
+                            output: '@0@_mingw.map'.format(lib_name))
+                    lk_deps += [mingw_map]
+
+                    lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+                endif
+            else
+                lk_args = ['-Wl,--version-script=' + version_map]
+            endif
+        endif
+
+        if not is_windows and developer_mode
+            # on unix systems check the output of the
+            # check-symbols.sh script, using it as a
+            # dependency of the .so build
+            lk_deps += custom_target(lib_name + '.sym_chk',
+                    command: [check_symbols, version_map_path, '@INPUT@'],
+                    capture: true,
+                    input: static_lib,
+                    output: lib_name + '.sym_chk',
+                    depends: version_map_dep)
         endif
 
-        shared_lib = shared_library(lib_name, sources,
+        shared_lib = shared_library(lib_name, sources_pmd_info,
                 objects: objs,
                 include_directories: includes,
                 dependencies: shared_deps,
diff --git a/drivers/version.map b/drivers/version.map
deleted file mode 100644
index 17cc97bda6..0000000000
--- a/drivers/version.map
+++ /dev/null
@@ -1,3 +0,0 @@
-DPDK_25 {
-	local: *;
-};
diff --git a/lib/meson.build b/lib/meson.build
index ce92cb5537..b6bac02b48 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2019 Intel Corporation
 
+fs = import('fs')
 
 # process all libraries equally, as far as possible
 # "core" libs first, then others alphabetically as far as possible
@@ -254,42 +255,60 @@ foreach l:libraries
             include_directories: includes,
             dependencies: static_deps)
 
-    if not use_function_versioning or is_windows
-        # use pre-build objects to build shared lib
-        sources = []
-        objs += static_lib.extract_all_objects(recursive: false)
-    else
-        # for compat we need to rebuild with
-        # RTE_BUILD_SHARED_LIB defined
-        cflags += '-DRTE_BUILD_SHARED_LIB'
-    endif
-
-    version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
-    lk_deps = [version_map]
-
-    if is_ms_linker
-        def_file = custom_target(libname + '_def',
-                command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
-                input: version_map,
-                output: '@0@_exports.def'.format(libname))
-        lk_deps += [def_file]
+    if not fs.is_file('@0@/@1@/version.map'.format(meson.current_source_dir(), l))
+        if is_ms_linker
+            link_mode = 'msvc'
+        elif is_windows
+            link_mode = 'mingw'
+        else
+            link_mode = 'gnu'
+        endif
+        version_map = custom_target(libname + '_map',
+                command: [gen_version_map, link_mode, abi_version_file, '@OUTPUT@', '@INPUT@'],
+                input: sources,
+                output: 'lib@0@_exports.map'.format(libname))
+        version_map_path = version_map.full_path()
+        version_map_dep = [version_map]
+        lk_deps = [version_map]
 
-        if is_ms_compiler
-            lk_args = ['/def:' + def_file.full_path()]
+        if is_ms_linker
+            if is_ms_compiler
+                lk_args = ['/def:' + version_map.full_path()]
+            else
+                lk_args = ['-Wl,/def:' + version_map.full_path()]
+            endif
         else
-            lk_args = ['-Wl,/def:' + def_file.full_path()]
+            lk_args = ['-Wl,--version-script=' + version_map.full_path()]
         endif
     else
-        if is_windows
-            mingw_map = custom_target(libname + '_mingw',
+        version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
+        version_map_path = version_map
+        version_map_dep = []
+        lk_deps = [version_map]
+        if is_ms_linker
+            def_file = custom_target(libname + '_def',
                     command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
                     input: version_map,
-                    output: '@0@_mingw.map'.format(libname))
-            lk_deps += [mingw_map]
+                    output: '@0@_exports.def'.format(libname))
+            lk_deps += [def_file]
 
-            lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+            if is_ms_compiler
+                lk_args = ['/def:' + def_file.full_path()]
+            else
+                lk_args = ['-Wl,/def:' + def_file.full_path()]
+            endif
         else
-            lk_args = ['-Wl,--version-script=' + version_map]
+            if is_windows
+                mingw_map = custom_target(libname + '_mingw',
+                        command: [map_to_win_cmd, '@INPUT@', '@OUTPUT@'],
+                        input: version_map,
+                        output: '@0@_mingw.map'.format(libname))
+                lk_deps += [mingw_map]
+
+                lk_args = ['-Wl,--version-script=' + mingw_map.full_path()]
+            else
+                lk_args = ['-Wl,--version-script=' + version_map]
+            endif
         endif
     endif
 
@@ -298,11 +317,21 @@ foreach l:libraries
         # check-symbols.sh script, using it as a
         # dependency of the .so build
         lk_deps += custom_target(name + '.sym_chk',
-                command: [check_symbols,
-                    version_map, '@INPUT@'],
+                command: [check_symbols, version_map_path, '@INPUT@'],
                 capture: true,
                 input: static_lib,
-                output: name + '.sym_chk')
+                output: name + '.sym_chk',
+                depends: version_map_dep)
+    endif
+
+    if not use_function_versioning or is_windows
+        # use pre-build objects to build shared lib
+        sources = []
+        objs += static_lib.extract_all_objects(recursive: false)
+    else
+        # for compat we need to rebuild with
+        # RTE_BUILD_SHARED_LIB defined
+        cflags += '-DRTE_BUILD_SHARED_LIB'
     endif
 
     shared_lib = shared_library(libname,
-- 
2.48.1


^ permalink raw reply	[relevance 18%]

* [RFC v3 3/8] eal: rework function versioning macros
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
@ 2025-03-11  9:56 13%   ` David Marchand
  2025-03-13 16:53  0%     ` Bruce Richardson
  2025-03-11  9:56 18%   ` [RFC v3 5/8] build: generate symbol maps David Marchand
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ results
From: David Marchand @ 2025-03-11  9:56 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue, Tyler Retzlaff, Jasvinder Singh

For versioning symbols:
- MSVC uses pragmas on the symbol,
- GNU linker uses special asm directives,

To accommodate both GNU linker and MSVC linker, introduce new macros for
exporting and versioning symbols that will surround the whole function.

This has the advantage of hiding all the ugly details in the macros.
Now versioning a symbol is just a call to a single macro:
- RTE_VERSION_SYMBOL (resp. RTE_VERSION_EXPERIMENTAL_SYMBOL), for
  keeping an old implementation code under a versioned function (resp.
  experimental function),
- RTE_DEFAULT_SYMBOL, for declaring the new default versioned function,
  and handling the static link special case, instead of
  BIND_DEFAULT_SYMBOL + MAP_STATIC_SYMBOL,

Update lib/net accordingly.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since RFC v2:

Changes since RFC v1:
- renamed and prefixed macros,
- reindented in prevision of second patch,

---
 doc/guides/contributing/abi_versioning.rst | 165 +++++----------------
 lib/eal/include/rte_function_versioning.h  |  96 +++++-------
 lib/net/net_crc.h                          |  15 --
 lib/net/rte_net_crc.c                      |  28 +---
 4 files changed, 77 insertions(+), 227 deletions(-)

diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 7afd1c1886..88dd776b4c 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -138,27 +138,20 @@ macros are used in conjunction with the ``version.map`` file for
 a given library to allow multiple versions of a symbol to exist in a shared
 library so that older binaries need not be immediately recompiled.
 
-The macros exported are:
+The macros are:
 
-* ``VERSION_SYMBOL(b, e, n)``: Creates a symbol version table entry binding
-  versioned symbol ``b@DPDK_n`` to the internal function ``be``.
+* ``RTE_VERSION_SYMBOL(ver, type, name, args)``: Creates a symbol version table
+  entry binding symbol ``<name>@DPDK_<ver>`` to the internal function name
+  ``<name>_v<ver>``.
 
-* ``BIND_DEFAULT_SYMBOL(b, e, n)``: Creates a symbol version entry instructing
-  the linker to bind references to symbol ``b`` to the internal symbol
-  ``be``.
+* ``RTE_DEFAULT_SYMBOL(ver, type, name, args)``: Creates a symbol version entry
+  instructing the linker to bind references to symbol ``<name>`` to the internal
+  symbol ``<name>_v<ver>``.
 
-* ``MAP_STATIC_SYMBOL(f, p)``: Declare the prototype ``f``, and map it to the
-  fully qualified function ``p``, so that if a symbol becomes versioned, it
-  can still be mapped back to the public symbol name.
-
-* ``__vsym``:  Annotation to be used in a declaration of the internal symbol
-  ``be`` to signal that it is being used as an implementation of a particular
-  version of symbol ``b``.
-
-* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
-  binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
-  The macro is used when a symbol matures to become part of the stable ABI, to
-  provide an alias to experimental until the next major ABI version.
+* ``RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args)``: Similar to ``RTE_VERSION_SYMBOL``,
+  but for experimental API symbols. The macro is used when a symbol matures
+  to become part of the stable ABI, to provide an alias to experimental
+  until the next major ABI version.
 
 .. _example_abi_macro_usage:
 
@@ -277,49 +270,36 @@ list of exported symbols when DPDK is compiled as a shared library.
 
 Next, we need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
-we need to update the function so that it is uniquely named, and not in conflict
-with the public symbol name
+we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
+the function return type, and the function name and its arguments.
 
 .. code-block:: c
 
  -struct rte_acl_ctx *
  -rte_acl_create(const struct rte_acl_param *param)
- +struct rte_acl_ctx * __vsym
- +rte_acl_create_v21(const struct rte_acl_param *param)
+ +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
  {
         size_t sz;
         struct rte_acl_ctx *ctx;
         ...
-
-Note that the base name of the symbol was kept intact, as this is conducive to
-the macros used for versioning symbols and we have annotated the function as
-``__vsym``, an implementation of a versioned symbol . That is our next step,
-mapping this new symbol name to the initial symbol name at version node 21.
-Immediately after the function, we add the VERSION_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ }
 
 Remembering to also add the rte_function_versioning.h header to the requisite c
 file where these changes are being made. The macro instructs the linker to
 create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
-in older builds, but now points to the above newly named function. We have now
-mapped the original rte_acl_create symbol to the original function (but with a
-new name).
+in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
+We have now mapped the original rte_acl_create symbol to the original function
+(but with a new name).
 
 Please see the section :ref:`Enabling versioning macros
 <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
-Next, we need to create the new ``v22`` version of the symbol. We create a new
-function name, with the ``v22`` suffix, and implement it appropriately.
+Next, we need to create the new version of the symbol. We create a new
+function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
 
 .. code-block:: c
 
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
+   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
+        int debug))
    {
         struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
 
@@ -328,35 +308,9 @@ function name, with the ``v22`` suffix, and implement it appropriately.
         return ctx;
    }
 
-This code serves as our new API call. Its the same as our old call, but adds the
-new parameter in place. Next we need to map this function to the new default
-symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
-we add the BIND_DEFAULT_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
-
 The macro instructs the linker to create the new default symbol
-``rte_acl_create@DPDK_22``, which points to the above newly named function.
-
-We finally modify the prototype of the call in the public header file,
-such that it contains both versions of the symbol and the public API.
-
-.. code-block:: c
-
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v21(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
-
+``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
+(declared by the macro).
 
 And that's it, on the next shared library rebuild, there will be two versions of
 rte_acl_create, an old DPDK_21 version, used by previously built applications,
@@ -365,43 +319,10 @@ and a new DPDK_22 version, used by future built applications.
 .. note::
 
    **Before you leave**, please take care reviewing the sections on
-   :ref:`mapping static symbols <mapping_static_symbols>`,
    :ref:`enabling versioning macros <enabling_versioning_macros>`,
    and :ref:`ABI deprecation <abi_deprecation>`.
 
 
-.. _mapping_static_symbols:
-
-Mapping static symbols
-______________________
-
-Now we've taken what was a public symbol, and duplicated it into two uniquely
-and differently named symbols. We've then mapped each of those back to the
-public symbol ``rte_acl_create`` with different version tags. This only applies
-to dynamic linking, as static linking has no notion of versioning. That leaves
-this code in a position of no longer having a symbol simply named
-``rte_acl_create`` and a static build will fail on that missing symbol.
-
-To correct this, we can simply map a function of our choosing back to the public
-symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
-assumption is that the most recent version of the symbol is the one you want to
-map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
-defined, we add this
-
-
-.. code-block:: c
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
-   {
-        ...
-   }
-   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
-
-That tells the compiler that, when building a static library, any calls to the
-symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
-
-
 .. _enabling_versioning_macros:
 
 Enabling versioning macros
@@ -519,26 +440,17 @@ and ``DPDK_22`` version nodes.
     * Create an acl context object for apps to
     * manipulate
     */
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create,
+        (const struct rte_acl_param *param))
    {
    ...
    }
 
-   __rte_experimental
-   struct rte_acl_ctx *
-   rte_acl_create_e(const struct rte_acl_param *param)
-   {
-      return rte_acl_create(param);
-   }
-   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
-
-   struct rte_acl_ctx *
-   rte_acl_create_v22(const struct rte_acl_param *param)
+   RTE_VERSION_EXPERIMENTAL_SYMBOL(struct rte_acl_ctx *, rte_acl_create,
+        (const struct rte_acl_param *param))
    {
       return rte_acl_create(param);
    }
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
 
 In the map file, we map the symbol to both the ``EXPERIMENTAL``
 and ``DPDK_22`` version nodes.
@@ -564,13 +476,6 @@ and ``DPDK_22`` version nodes.
         rte_acl_create;
    };
 
-.. note::
-
-   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
-   when aliasing to experimental you will also need to take care of
-   :ref:`mapping static symbols <mapping_static_symbols>`.
-
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
@@ -616,10 +521,10 @@ Next remove the corresponding versioned export.
 
 .. code-block:: c
 
- -VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ -RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param))
 
 
-Note that the internal function definition could also be removed, but its used
+Note that the internal function definition must also be handled; it's used
 in our example by the newer version ``v22``, so we leave it in place and declare
 it as static. This is a coding style choice.
 
@@ -663,16 +568,18 @@ In the case of our map above, it would transform to look as follows
         local: *;
  };
 
-Then any uses of BIND_DEFAULT_SYMBOL that pointed to the old node should be
+Then any uses of RTE_DEFAULT_SYMBOL that pointed to the old node should be
 updated to point to the new version node in any header files for all affected
 symbols.
 
 .. code-block:: c
 
- -BIND_DEFAULT_SYMBOL(rte_acl_create, _v21, 21);
- +BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+ -RTE_DEFAULT_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
+        int debug))
+ +RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
+        int debug))
 
-Lastly, any VERSION_SYMBOL macros that point to the old version nodes
+Lastly, any RTE_VERSION_SYMBOL macros that point to the old version nodes
 should be removed, taking care to preserve any code that is shared
 with the new version node.
 
diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
index eb6dd2bc17..0020ce4885 100644
--- a/lib/eal/include/rte_function_versioning.h
+++ b/lib/eal/include/rte_function_versioning.h
@@ -11,8 +11,6 @@
 #error Use of function versioning disabled, is "use_function_versioning=true" in meson.build?
 #endif
 
-#ifdef RTE_BUILD_SHARED_LIB
-
 /*
  * Provides backwards compatibility when updating exported functions.
  * When a symbol is exported from a library to provide an API, it also provides a
@@ -20,80 +18,54 @@
  * arguments, etc.  On occasion that function may need to change to accommodate
  * new functionality, behavior, etc.  When that occurs, it is desirable to
  * allow for backwards compatibility for a time with older binaries that are
- * dynamically linked to the dpdk.  To support that, the __vsym and
- * VERSION_SYMBOL macros are created.  They, in conjunction with the
- * version.map file for a given library allow for multiple versions of
- * a symbol to exist in a shared library so that older binaries need not be
- * immediately recompiled.
- *
- * Refer to the guidelines document in the docs subdirectory for details on the
- * use of these macros
+ * dynamically linked to the dpdk.
  */
 
-/*
- * Macro Parameters:
- * b - function base name
- * e - function version extension, to be concatenated with base name
- * n - function symbol version string to be applied
- * f - function prototype
- * p - full function symbol name
- */
+#ifdef RTE_BUILD_SHARED_LIB
 
 /*
- * VERSION_SYMBOL
- * Creates a symbol version table entry binding symbol <b>@DPDK_<n> to the internal
- * function name <b><e>
+ * RTE_VERSION_SYMBOL
+ * Creates a symbol version table entry binding symbol <name>@DPDK_<ver> to the internal
+ * function name <name>_v<ver>.
  */
-#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
+#define RTE_VERSION_SYMBOL(ver, type, name, args) \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver)); \
+__rte_used type name ## _v ## ver args; \
+type name ## _v ## ver args
 
 /*
- * VERSION_SYMBOL_EXPERIMENTAL
- * Creates a symbol version table entry binding the symbol <b>@EXPERIMENTAL to the internal
- * function name <b><e>. The macro is used when a symbol matures to become part of the stable ABI,
- * to provide an alias to experimental for some time.
+ * RTE_VERSION_EXPERIMENTAL_SYMBOL
+ * Similar to RTE_VERSION_SYMBOL but for experimental API symbols.
+ * This is mainly used for keeping compatibility for symbols that get promoted to stable ABI.
  */
-#define VERSION_SYMBOL_EXPERIMENTAL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@EXPERIMENTAL")
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
+__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL"); \
+__rte_used type name ## _exp args; \
+type name ## _exp args
 
 /*
- * BIND_DEFAULT_SYMBOL
+ * RTE_DEFAULT_SYMBOL
  * Creates a symbol version entry instructing the linker to bind references to
- * symbol <b> to the internal symbol <b><e>
+ * symbol <name> to the internal symbol <name>_v<ver>.
  */
-#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
+#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver)); \
+__rte_used type name ## _v ## ver args; \
+type name ## _v ## ver args
 
-/*
- * __vsym
- * Annotation to be used in declaration of the internal symbol <b><e> to signal
- * that it is being used as an implementation of a particular version of symbol
- * <b>.
- */
-#define __vsym __rte_used
+#else /* !RTE_BUILD_SHARED_LIB */
 
-/*
- * MAP_STATIC_SYMBOL
- * If a function has been bifurcated into multiple versions, none of which
- * are defined as the exported symbol name in the map file, this macro can be
- * used to alias a specific version of the symbol to its exported name.  For
- * example, if you have 2 versions of a function foo_v1 and foo_v2, where the
- * former is mapped to foo@DPDK_1 and the latter is mapped to foo@DPDK_2 when
- * building a shared library, this macro can be used to map either foo_v1 or
- * foo_v2 to the symbol foo when building a static library, e.g.:
- * MAP_STATIC_SYMBOL(void foo(), foo_v2);
- */
-#define MAP_STATIC_SYMBOL(f, p)
+#define RTE_VERSION_SYMBOL(ver, type, name, args) \
+type name ## _v ## ver args; \
+type name ## _v ## ver args
 
-#else
-/*
- * No symbol versioning in use
- */
-#define VERSION_SYMBOL(b, e, n)
-#define VERSION_SYMBOL_EXPERIMENTAL(b, e)
-#define __vsym
-#define BIND_DEFAULT_SYMBOL(b, e, n)
-#define MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(RTE_STR(p))))
-/*
- * RTE_BUILD_SHARED_LIB=n
- */
-#endif
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) \
+type name ## _exp args; \
+type name ## _exp args
+
+#define RTE_DEFAULT_SYMBOL(ver, type, name, args) \
+type name args
+
+#endif /* RTE_BUILD_SHARED_LIB */
 
 #endif /* _RTE_FUNCTION_VERSIONING_H_ */
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 4930e2f0b3..320b0edca8 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -7,21 +7,6 @@
 
 #include "rte_net_crc.h"
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
-
-struct rte_net_crc *
-rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
-	enum rte_net_crc_type type);
-
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len, enum rte_net_crc_type type);
-
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len);
-
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 2fb3eec231..1943d46295 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -345,8 +345,7 @@ handlers_init(enum rte_net_crc_alg alg)
 
 /* Public API */
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
+RTE_VERSION_SYMBOL(25, void, rte_net_crc_set_alg, (enum rte_net_crc_alg alg))
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -373,10 +372,9 @@ rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
-VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
 
-struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
-	enum rte_net_crc_type type)
+RTE_DEFAULT_SYMBOL(26, struct rte_net_crc *, rte_net_crc_set_alg, (enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type))
 {
 	uint16_t max_simd_bitwidth;
 	struct rte_net_crc *crc;
@@ -414,20 +412,14 @@ struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
 	}
 	return crc;
 }
-BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
-MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
-	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
-	rte_net_crc_set_alg_v26);
 
 void rte_net_crc_free(struct rte_net_crc *crc)
 {
 	rte_free(crc);
 }
 
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type)
+RTE_VERSION_SYMBOL(25, uint32_t, rte_net_crc_calc, (const void *data, uint32_t data_len,
+	enum rte_net_crc_type type))
 {
 	uint32_t ret;
 	rte_net_crc_handler f_handle;
@@ -437,18 +429,12 @@ rte_net_crc_calc_v25(const void *data,
 
 	return ret;
 }
-VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
 
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len)
+RTE_DEFAULT_SYMBOL(26, uint32_t, rte_net_crc_calc, (const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len))
 {
 	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
 }
-BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
-MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len),
-	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
-- 
2.48.1


^ permalink raw reply	[relevance 13%]

* [RFC v3 0/8] Symbol versioning and export rework
  2025-03-05 21:23  6% [RFC] eal: add new function versioning macros David Marchand
  2025-03-06 12:50  6% ` [RFC v2 1/2] " David Marchand
@ 2025-03-11  9:55  3% ` David Marchand
  2025-03-11  9:56 13%   ` [RFC v3 3/8] eal: rework function versioning macros David Marchand
                     ` (3 more replies)
  2025-03-17 15:42  3% ` [RFC v4 " David Marchand
  2 siblings, 4 replies; 200+ results
From: David Marchand @ 2025-03-11  9:55 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue

So far, each DPDK library (or driver) exposing symbols in an ABI had to
maintain a version.map and use some macros for symbol versioning,
specially crafted with the GNU linker in mind.

This series proposes to rework the whole principle: instead, symbol
exports are marked in the source code itself, and the build framework
produces a version script adapted to the linker in use (think GNU
linker vs MSVC linker).
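The idea of generating the version script from source markers can be sketched as follows. This is a minimal illustration only, not the actual buildtools/gen-version-map.py from the series; the marker name RTE_EXPORT_SYMBOL and the output layout follow the GNU linker version-script format:

```python
import re

# Match lines like: RTE_EXPORT_SYMBOL(rte_net_crc_calc)
EXPORT_RE = re.compile(r'^RTE_EXPORT_SYMBOL\((\w+)\)', re.MULTILINE)

def gen_version_map(sources, abi_version):
    """Collect exported symbols from source text and emit a GNU version script."""
    symbols = sorted(sym for src in sources for sym in EXPORT_RE.findall(src))
    body = "\n".join(f"\t{sym};" for sym in symbols)
    return f"DPDK_{abi_version} {{\n\tglobal:\n{body}\n\tlocal: *;\n}};\n"

src = "RTE_EXPORT_SYMBOL(rte_net_crc_calc)\nuint32_t rte_net_crc_calc(void);\n"
print(gen_version_map([src], 26))
```

Supporting the MSVC linker would only mean emitting a different output format (a .def-style export list) from the same collected symbol names.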

This greatly simplifies versioning symbols: a developer does not need to
know anything about version.map, or that a versioned symbol must be
renamed with a _v26 suffix, annotated with __vsym, exported in a header,
etc.

Checking symbol maps becomes unnecessary, since they are generated by
the build framework.

Updating to a new ABI is just a matter of bumping the value in
ABI_VERSION.


Comments please.


-- 
David Marchand

Changes since RFC v2:
- updated RTE_VERSION_SYMBOL() (and friends) so that only the function
  signature is enclosed in the macro,
- dropped invalid exports for some dead symbols or inline helpers,
- updated documentation and tooling,
- converted the whole tree (via a local script of mine),

David Marchand (8):
  lib: remove incorrect exported symbols
  drivers: remove incorrect exported symbols
  eal: rework function versioning macros
  buildtools: display version when listing symbols
  build: generate symbol maps
  build: mark exported symbols
  build: use dynamically generated version maps
  build: remove static version maps

 .github/workflows/build.yml                   |   1 -
 MAINTAINERS                                   |   9 +-
 buildtools/check-symbols.sh                   |  33 +-
 buildtools/gen-version-map.py                 | 111 ++++
 buildtools/map-list-symbol.sh                 |  15 +-
 buildtools/map_to_win.py                      |  41 --
 buildtools/meson.build                        |   2 +-
 config/meson.build                            |   2 +
 config/rte_export.h                           |  16 +
 devtools/check-symbol-change.py               |  90 +++
 devtools/check-symbol-change.sh               | 186 ------
 devtools/check-symbol-maps.sh                 | 115 ----
 devtools/checkpatches.sh                      |   4 +-
 devtools/update-abi.sh                        |  46 --
 devtools/update_version_map_abi.py            | 210 -------
 doc/guides/contributing/abi_policy.rst        |  21 +-
 doc/guides/contributing/abi_versioning.rst    | 385 ++----------
 doc/guides/contributing/coding_style.rst      |   7 -
 .../contributing/img/patch_cheatsheet.svg     | 303 +++++----
 doc/guides/contributing/patches.rst           |   6 +-
 drivers/baseband/acc/rte_acc100_pmd.c         |   1 +
 drivers/baseband/acc/version.map              |  10 -
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c         |   1 +
 drivers/baseband/fpga_5gnr_fec/version.map    |  11 -
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |   1 +
 drivers/baseband/fpga_lte_fec/version.map     |  10 -
 drivers/bus/auxiliary/auxiliary_common.c      |   2 +
 drivers/bus/auxiliary/version.map             |   8 -
 drivers/bus/cdx/cdx.c                         |   4 +
 drivers/bus/cdx/cdx_vfio.c                    |   4 +
 drivers/bus/cdx/version.map                   |  14 -
 drivers/bus/dpaa/dpaa_bus.c                   | 104 ++++
 drivers/bus/dpaa/version.map                  | 109 ----
 drivers/bus/fslmc/fslmc_bus.c                 |   4 +
 drivers/bus/fslmc/fslmc_vfio.c                |  12 +
 drivers/bus/fslmc/mc/dpbp.c                   |   6 +
 drivers/bus/fslmc/mc/dpci.c                   |   3 +
 drivers/bus/fslmc/mc/dpcon.c                  |   6 +
 drivers/bus/fslmc/mc/dpdmai.c                 |   8 +
 drivers/bus/fslmc/mc/dpio.c                   |  13 +
 drivers/bus/fslmc/mc/dpmng.c                  |   2 +
 drivers/bus/fslmc/mc/mc_sys.c                 |   1 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c      |   3 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpci.c      |   2 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  11 +
 drivers/bus/fslmc/qbman/qbman_debug.c         |   2 +
 drivers/bus/fslmc/qbman/qbman_portal.c        |  41 ++
 drivers/bus/fslmc/version.map                 | 129 ----
 drivers/bus/ifpga/ifpga_bus.c                 |   3 +
 drivers/bus/ifpga/version.map                 |   9 -
 drivers/bus/pci/bsd/pci.c                     |  10 +
 drivers/bus/pci/linux/pci.c                   |  10 +
 drivers/bus/pci/pci_common.c                  |  10 +
 drivers/bus/pci/version.map                   |  43 --
 drivers/bus/pci/windows/pci.c                 |  10 +
 drivers/bus/platform/platform.c               |   2 +
 drivers/bus/platform/version.map              |  10 -
 drivers/bus/uacce/uacce.c                     |   9 +
 drivers/bus/uacce/version.map                 |  15 -
 drivers/bus/vdev/vdev.c                       |   6 +
 drivers/bus/vdev/version.map                  |  17 -
 drivers/bus/vmbus/linux/vmbus_bus.c           |   6 +
 drivers/bus/vmbus/version.map                 |  33 -
 drivers/bus/vmbus/vmbus_channel.c             |  13 +
 drivers/bus/vmbus/vmbus_common.c              |   3 +
 drivers/common/cnxk/cnxk_security.c           |  12 +
 drivers/common/cnxk/cnxk_utils.c              |   1 +
 drivers/common/cnxk/roc_platform.c            | 559 +++++++++++++++++
 drivers/common/cnxk/roc_se.h                  |   1 -
 drivers/common/cnxk/version.map               | 578 ------------------
 drivers/common/cpt/cpt_fpm_tables.c           |   2 +
 drivers/common/cpt/cpt_pmd_ops_helper.c       |   3 +
 drivers/common/cpt/version.map                |  11 -
 drivers/common/dpaax/caamflib.c               |   1 +
 drivers/common/dpaax/dpaa_of.c                |  12 +
 drivers/common/dpaax/dpaax_iova_table.c       |   6 +
 drivers/common/dpaax/version.map              |  25 -
 drivers/common/ionic/ionic_common_uio.c       |   4 +
 drivers/common/ionic/version.map              |  10 -
 .../common/mlx5/linux/mlx5_common_auxiliary.c |   1 +
 drivers/common/mlx5/linux/mlx5_common_os.c    |   9 +
 drivers/common/mlx5/linux/mlx5_common_verbs.c |   3 +
 drivers/common/mlx5/linux/mlx5_glue.c         |   1 +
 drivers/common/mlx5/linux/mlx5_nl.c           |  21 +
 drivers/common/mlx5/mlx5_common.c             |   9 +
 drivers/common/mlx5/mlx5_common_devx.c        |   9 +
 drivers/common/mlx5/mlx5_common_mp.c          |   8 +
 drivers/common/mlx5/mlx5_common_mr.c          |  11 +
 drivers/common/mlx5/mlx5_common_pci.c         |   2 +
 drivers/common/mlx5/mlx5_common_utils.c       |  11 +
 drivers/common/mlx5/mlx5_devx_cmds.c          |  51 ++
 drivers/common/mlx5/mlx5_malloc.c             |   4 +
 drivers/common/mlx5/version.map               | 174 ------
 drivers/common/mlx5/windows/mlx5_common_os.c  |   5 +
 drivers/common/mlx5/windows/mlx5_glue.c       |   3 +-
 drivers/common/mvep/mvep_common.c             |   2 +
 drivers/common/mvep/version.map               |   8 -
 drivers/common/nfp/nfp_common.c               |   7 +
 drivers/common/nfp/nfp_common_pci.c           |   1 +
 drivers/common/nfp/nfp_dev.c                  |   1 +
 drivers/common/nfp/version.map                |  16 -
 drivers/common/nitrox/nitrox_device.c         |   1 +
 drivers/common/nitrox/nitrox_logs.c           |   1 +
 drivers/common/nitrox/nitrox_qp.c             |   2 +
 drivers/common/nitrox/version.map             |  10 -
 drivers/common/octeontx/octeontx_mbox.c       |   6 +
 drivers/common/octeontx/version.map           |  12 -
 drivers/common/sfc_efx/sfc_efx.c              | 273 +++++++++
 drivers/common/sfc_efx/sfc_efx_mcdi.c         |   2 +
 drivers/common/sfc_efx/version.map            | 302 ---------
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c     |   7 +
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c      |   2 +
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c      |   7 +
 drivers/crypto/cnxk/version.map               |  30 -
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |   2 +
 drivers/crypto/dpaa2_sec/version.map          |   8 -
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   2 +
 drivers/crypto/dpaa_sec/version.map           |   8 -
 drivers/crypto/octeontx/otx_cryptodev_ops.c   |   2 +
 drivers/crypto/octeontx/version.map           |  12 -
 .../scheduler/rte_cryptodev_scheduler.c       |  10 +
 drivers/crypto/scheduler/version.map          |  16 -
 drivers/dma/cnxk/cnxk_dmadev_fp.c             |   4 +
 drivers/dma/cnxk/version.map                  |  10 -
 drivers/event/cnxk/cnxk_worker.c              |   2 +
 drivers/event/cnxk/version.map                |  11 -
 drivers/event/dlb2/rte_pmd_dlb2.c             |   1 +
 drivers/event/dlb2/version.map                |  10 -
 drivers/mempool/cnxk/cn10k_hwpool_ops.c       |   3 +
 drivers/mempool/cnxk/version.map              |  12 -
 drivers/mempool/dpaa/dpaa_mempool.c           |   2 +
 drivers/mempool/dpaa/version.map              |   8 -
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |   5 +
 drivers/mempool/dpaa2/version.map             |  16 -
 drivers/meson.build                           |  72 +--
 drivers/net/atlantic/rte_pmd_atlantic.c       |   6 +
 drivers/net/atlantic/version.map              |  15 -
 drivers/net/bnxt/rte_pmd_bnxt.c               |  16 +
 drivers/net/bnxt/version.map                  |  22 -
 drivers/net/bonding/rte_eth_bond_8023ad.c     |  12 +
 drivers/net/bonding/rte_eth_bond_api.c        |  15 +
 drivers/net/bonding/version.map               |  33 -
 drivers/net/cnxk/cnxk_ethdev.c                |   3 +
 drivers/net/cnxk/cnxk_ethdev_sec.c            |   9 +
 drivers/net/cnxk/version.map                  |  27 -
 drivers/net/dpaa/dpaa_ethdev.c                |   3 +
 drivers/net/dpaa/version.map                  |  14 -
 drivers/net/dpaa2/dpaa2_ethdev.c              |  11 +
 drivers/net/dpaa2/dpaa2_mux.c                 |   3 +
 drivers/net/dpaa2/dpaa2_rxtx.c                |   1 +
 drivers/net/dpaa2/version.map                 |  35 --
 drivers/net/intel/i40e/rte_pmd_i40e.c         |  39 ++
 drivers/net/intel/i40e/version.map            |  55 --
 drivers/net/intel/iavf/iavf_ethdev.c          |   9 +
 drivers/net/intel/iavf/iavf_rxtx.c            |   8 +
 drivers/net/intel/iavf/version.map            |  33 -
 drivers/net/intel/ice/ice_diagnose.c          |   3 +
 drivers/net/intel/ice/version.map             |  16 -
 drivers/net/intel/idpf/idpf_common_device.c   |  10 +
 drivers/net/intel/idpf/idpf_common_rxtx.c     |  33 +
 drivers/net/intel/idpf/idpf_common_virtchnl.c |  29 +
 drivers/net/intel/idpf/version.map            |  80 ---
 drivers/net/intel/ipn3ke/ipn3ke_ethdev.c      |   1 +
 drivers/net/intel/ipn3ke/version.map          |   9 -
 drivers/net/intel/ixgbe/rte_pmd_ixgbe.c       |  37 ++
 drivers/net/intel/ixgbe/version.map           |  49 --
 drivers/net/mlx5/mlx5.c                       |   1 +
 drivers/net/mlx5/mlx5_flow.c                  |   4 +
 drivers/net/mlx5/mlx5_rx.c                    |   2 +
 drivers/net/mlx5/mlx5_rxq.c                   |   2 +
 drivers/net/mlx5/mlx5_tx.c                    |   1 +
 drivers/net/mlx5/mlx5_txq.c                   |   3 +
 drivers/net/mlx5/version.map                  |  28 -
 drivers/net/octeontx/octeontx_ethdev.c        |   1 +
 drivers/net/octeontx/version.map              |   7 -
 drivers/net/ring/rte_eth_ring.c               |   2 +
 drivers/net/ring/version.map                  |   8 -
 drivers/net/softnic/rte_eth_softnic.c         |   1 +
 drivers/net/softnic/rte_eth_softnic_thread.c  |   1 +
 drivers/net/softnic/version.map               |   8 -
 drivers/net/vhost/rte_eth_vhost.c             |   2 +
 drivers/net/vhost/version.map                 |   8 -
 drivers/power/kvm_vm/guest_channel.c          |   2 +
 drivers/power/kvm_vm/version.map              |   8 -
 drivers/raw/cnxk_rvu_lf/cnxk_rvu_lf.c         |  10 +
 drivers/raw/cnxk_rvu_lf/version.map           |  16 -
 drivers/raw/ifpga/rte_pmd_ifpga.c             |  11 +
 drivers/raw/ifpga/version.map                 |  17 -
 drivers/version.map                           |   3 -
 lib/acl/acl_bld.c                             |   1 +
 lib/acl/acl_run_scalar.c                      |   1 +
 lib/acl/rte_acl.c                             |  11 +
 lib/acl/version.map                           |  19 -
 lib/argparse/rte_argparse.c                   |   2 +
 lib/argparse/version.map                      |   9 -
 lib/bbdev/bbdev_trace_points.c                |   2 +
 lib/bbdev/rte_bbdev.c                         |  31 +
 lib/bbdev/version.map                         |  47 --
 lib/bitratestats/rte_bitrate.c                |   4 +
 lib/bitratestats/version.map                  |  10 -
 lib/bpf/bpf.c                                 |   2 +
 lib/bpf/bpf_convert.c                         |   1 +
 lib/bpf/bpf_dump.c                            |   1 +
 lib/bpf/bpf_exec.c                            |   2 +
 lib/bpf/bpf_load.c                            |   1 +
 lib/bpf/bpf_load_elf.c                        |   1 +
 lib/bpf/bpf_pkt.c                             |   4 +
 lib/bpf/bpf_stub.c                            |   2 +
 lib/bpf/version.map                           |  18 -
 lib/cfgfile/rte_cfgfile.c                     |  17 +
 lib/cfgfile/version.map                       |  23 -
 lib/cmdline/cmdline.c                         |   9 +
 lib/cmdline/cmdline_cirbuf.c                  |  19 +
 lib/cmdline/cmdline_parse.c                   |   4 +
 lib/cmdline/cmdline_parse_bool.c              |   1 +
 lib/cmdline/cmdline_parse_etheraddr.c         |   3 +
 lib/cmdline/cmdline_parse_ipaddr.c            |   3 +
 lib/cmdline/cmdline_parse_num.c               |   3 +
 lib/cmdline/cmdline_parse_portlist.c          |   3 +
 lib/cmdline/cmdline_parse_string.c            |   5 +
 lib/cmdline/cmdline_rdline.c                  |  15 +
 lib/cmdline/cmdline_socket.c                  |   3 +
 lib/cmdline/cmdline_vt100.c                   |   2 +
 lib/cmdline/version.map                       |  82 ---
 lib/compressdev/rte_comp.c                    |   6 +
 lib/compressdev/rte_compressdev.c             |  25 +
 lib/compressdev/rte_compressdev_pmd.c         |   3 +
 lib/compressdev/version.map                   |  40 --
 lib/cryptodev/cryptodev_pmd.c                 |   7 +
 lib/cryptodev/cryptodev_trace_points.c        |   3 +
 lib/cryptodev/rte_cryptodev.c                 |  83 +++
 lib/cryptodev/version.map                     | 114 ----
 lib/dispatcher/rte_dispatcher.c               |  13 +
 lib/dispatcher/version.map                    |  20 -
 lib/distributor/rte_distributor.c             |   9 +
 lib/distributor/version.map                   |  15 -
 lib/dmadev/rte_dmadev.c                       |  19 +
 lib/dmadev/rte_dmadev_trace_points.c          |   7 +
 lib/dmadev/version.map                        |  47 --
 lib/eal/arm/rte_cpuflags.c                    |   3 +
 lib/eal/arm/rte_hypervisor.c                  |   1 +
 lib/eal/arm/rte_power_intrinsics.c            |   4 +
 lib/eal/common/eal_common_bus.c               |  10 +
 lib/eal/common/eal_common_class.c             |   4 +
 lib/eal/common/eal_common_config.c            |   7 +
 lib/eal/common/eal_common_cpuflags.c          |   1 +
 lib/eal/common/eal_common_debug.c             |   2 +
 lib/eal/common/eal_common_dev.c               |  19 +
 lib/eal/common/eal_common_devargs.c           |   9 +
 lib/eal/common/eal_common_errno.c             |   2 +
 lib/eal/common/eal_common_fbarray.c           |  26 +
 lib/eal/common/eal_common_hexdump.c           |   2 +
 lib/eal/common/eal_common_hypervisor.c        |   1 +
 lib/eal/common/eal_common_interrupts.c        |  27 +
 lib/eal/common/eal_common_launch.c            |   5 +
 lib/eal/common/eal_common_lcore.c             |  17 +
 lib/eal/common/eal_common_lcore_var.c         |   1 +
 lib/eal/common/eal_common_mcfg.c              |  20 +
 lib/eal/common/eal_common_memory.c            |  29 +
 lib/eal/common/eal_common_memzone.c           |   9 +
 lib/eal/common/eal_common_options.c           |   4 +
 lib/eal/common/eal_common_proc.c              |   8 +
 lib/eal/common/eal_common_string_fns.c        |   3 +
 lib/eal/common/eal_common_tailqs.c            |   3 +
 lib/eal/common/eal_common_thread.c            |  14 +
 lib/eal/common/eal_common_timer.c             |   4 +
 lib/eal/common/eal_common_trace.c             |  15 +
 lib/eal/common/eal_common_trace_ctf.c         |   1 +
 lib/eal/common/eal_common_trace_points.c      |  18 +
 lib/eal/common/eal_common_trace_utils.c       |   1 +
 lib/eal/common/eal_common_uuid.c              |   4 +
 lib/eal/common/rte_bitset.c                   |   1 +
 lib/eal/common/rte_keepalive.c                |   6 +
 lib/eal/common/rte_malloc.c                   |  22 +
 lib/eal/common/rte_random.c                   |   4 +
 lib/eal/common/rte_reciprocal.c               |   2 +
 lib/eal/common/rte_service.c                  |  31 +
 lib/eal/common/rte_version.c                  |   7 +
 lib/eal/freebsd/eal.c                         |  22 +
 lib/eal/freebsd/eal_alarm.c                   |   2 +
 lib/eal/freebsd/eal_dev.c                     |   4 +
 lib/eal/freebsd/eal_interrupts.c              |  19 +
 lib/eal/freebsd/eal_memory.c                  |   3 +
 lib/eal/freebsd/eal_thread.c                  |   2 +
 lib/eal/freebsd/eal_timer.c                   |   1 +
 lib/eal/include/rte_function_versioning.h     |  96 ++-
 lib/eal/linux/eal.c                           |   7 +
 lib/eal/linux/eal_alarm.c                     |   2 +
 lib/eal/linux/eal_dev.c                       |   4 +
 lib/eal/linux/eal_interrupts.c                |  19 +
 lib/eal/linux/eal_memory.c                    |   3 +
 lib/eal/linux/eal_thread.c                    |   2 +
 lib/eal/linux/eal_timer.c                     |   4 +
 lib/eal/linux/eal_vfio.c                      |  16 +
 lib/eal/loongarch/rte_cpuflags.c              |   3 +
 lib/eal/loongarch/rte_hypervisor.c            |   1 +
 lib/eal/loongarch/rte_power_intrinsics.c      |   4 +
 lib/eal/ppc/rte_cpuflags.c                    |   3 +
 lib/eal/ppc/rte_hypervisor.c                  |   1 +
 lib/eal/ppc/rte_power_intrinsics.c            |   4 +
 lib/eal/riscv/rte_cpuflags.c                  |   3 +
 lib/eal/riscv/rte_hypervisor.c                |   1 +
 lib/eal/riscv/rte_power_intrinsics.c          |   4 +
 lib/eal/unix/eal_debug.c                      |   2 +
 lib/eal/unix/eal_filesystem.c                 |   1 +
 lib/eal/unix/eal_firmware.c                   |   1 +
 lib/eal/unix/eal_unix_memory.c                |   4 +
 lib/eal/unix/eal_unix_timer.c                 |   1 +
 lib/eal/unix/rte_thread.c                     |  13 +
 lib/eal/version.map                           | 451 --------------
 lib/eal/windows/eal.c                         |  11 +
 lib/eal/windows/eal_alarm.c                   |   2 +
 lib/eal/windows/eal_debug.c                   |   1 +
 lib/eal/windows/eal_dev.c                     |   4 +
 lib/eal/windows/eal_interrupts.c              |  19 +
 lib/eal/windows/eal_memory.c                  |   7 +
 lib/eal/windows/eal_mp.c                      |   6 +
 lib/eal/windows/eal_thread.c                  |   1 +
 lib/eal/windows/eal_timer.c                   |   1 +
 lib/eal/windows/rte_thread.c                  |  14 +
 lib/eal/x86/rte_cpuflags.c                    |   3 +
 lib/eal/x86/rte_hypervisor.c                  |   1 +
 lib/eal/x86/rte_power_intrinsics.c            |   4 +
 lib/eal/x86/rte_spinlock.c                    |   1 +
 lib/efd/rte_efd.c                             |   7 +
 lib/efd/version.map                           |  13 -
 lib/ethdev/ethdev_driver.c                    |  24 +
 lib/ethdev/ethdev_linux_ethtool.c             |   3 +
 lib/ethdev/ethdev_private.c                   |   2 +
 lib/ethdev/ethdev_trace_points.c              |   6 +
 lib/ethdev/rte_ethdev.c                       | 168 +++++
 lib/ethdev/rte_ethdev_cman.c                  |   4 +
 lib/ethdev/rte_flow.c                         |  64 ++
 lib/ethdev/rte_mtr.c                          |  21 +
 lib/ethdev/rte_tm.c                           |  31 +
 lib/ethdev/version.map                        | 378 ------------
 lib/eventdev/eventdev_private.c               |   2 +
 lib/eventdev/eventdev_trace_points.c          |  11 +
 lib/eventdev/rte_event_crypto_adapter.c       |  15 +
 lib/eventdev/rte_event_dma_adapter.c          |  15 +
 lib/eventdev/rte_event_eth_rx_adapter.c       |  23 +
 lib/eventdev/rte_event_eth_tx_adapter.c       |  17 +
 lib/eventdev/rte_event_ring.c                 |   4 +
 lib/eventdev/rte_event_timer_adapter.c        |  11 +
 lib/eventdev/rte_eventdev.c                   |  46 ++
 lib/eventdev/version.map                      | 179 ------
 lib/fib/rte_fib.c                             |  10 +
 lib/fib/rte_fib6.c                            |   9 +
 lib/fib/version.map                           |  31 -
 lib/gpudev/gpudev.c                           |  32 +
 lib/gpudev/version.map                        |  44 --
 lib/graph/graph.c                             |  16 +
 lib/graph/graph_debug.c                       |   1 +
 lib/graph/graph_stats.c                       |   4 +
 lib/graph/node.c                              |  11 +
 lib/graph/rte_graph_model_mcore_dispatch.c    |   3 +
 lib/graph/rte_graph_worker.c                  |   3 +
 lib/graph/version.map                         |  61 --
 lib/gro/rte_gro.c                             |   6 +
 lib/gro/version.map                           |  12 -
 lib/gso/rte_gso.c                             |   1 +
 lib/gso/version.map                           |   7 -
 lib/hash/rte_cuckoo_hash.c                    |  27 +
 lib/hash/rte_fbk_hash.c                       |   3 +
 lib/hash/rte_hash_crc.c                       |   2 +
 lib/hash/rte_thash.c                          |  12 +
 lib/hash/rte_thash_gf2_poly_math.c            |   1 +
 lib/hash/rte_thash_gfni.c                     |   2 +
 lib/hash/version.map                          |  66 --
 lib/ip_frag/rte_ip_frag_common.c              |   5 +
 lib/ip_frag/rte_ipv4_fragmentation.c          |   2 +
 lib/ip_frag/rte_ipv4_reassembly.c             |   1 +
 lib/ip_frag/rte_ipv6_fragmentation.c          |   1 +
 lib/ip_frag/rte_ipv6_reassembly.c             |   1 +
 lib/ip_frag/version.map                       |  16 -
 lib/ipsec/ipsec_sad.c                         |   6 +
 lib/ipsec/ipsec_telemetry.c                   |   2 +
 lib/ipsec/sa.c                                |   4 +
 lib/ipsec/ses.c                               |   1 +
 lib/ipsec/version.map                         |  23 -
 lib/jobstats/rte_jobstats.c                   |  14 +
 lib/jobstats/version.map                      |  20 -
 lib/kvargs/rte_kvargs.c                       |   8 +
 lib/kvargs/version.map                        |  14 -
 lib/latencystats/rte_latencystats.c           |   5 +
 lib/latencystats/version.map                  |  11 -
 lib/log/log.c                                 |  22 +
 lib/log/log_color.c                           |   1 +
 lib/log/log_internal.h                        |   3 -
 lib/log/log_syslog.c                          |   1 +
 lib/log/log_timestamp.c                       |   1 +
 lib/log/version.map                           |  37 --
 lib/lpm/rte_lpm.c                             |   8 +
 lib/lpm/rte_lpm6.c                            |  10 +
 lib/lpm/version.map                           |  24 -
 lib/mbuf/rte_mbuf.c                           |  17 +
 lib/mbuf/rte_mbuf_dyn.c                       |   9 +
 lib/mbuf/rte_mbuf_pool_ops.c                  |   5 +
 lib/mbuf/rte_mbuf_ptype.c                     |   8 +
 lib/mbuf/version.map                          |  45 --
 lib/member/rte_member.c                       |  13 +
 lib/member/version.map                        |  19 -
 lib/mempool/mempool_trace_points.c            |  10 +
 lib/mempool/rte_mempool.c                     |  27 +
 lib/mempool/rte_mempool_ops.c                 |   4 +
 lib/mempool/rte_mempool_ops_default.c         |   4 +
 lib/mempool/version.map                       |  65 --
 lib/meson.build                               |  56 +-
 lib/meter/rte_meter.c                         |   6 +
 lib/meter/version.map                         |  12 -
 lib/metrics/rte_metrics.c                     |   8 +
 lib/metrics/rte_metrics_telemetry.c           |  11 +
 lib/metrics/version.map                       |  26 -
 lib/mldev/mldev_utils.c                       |   2 +
 lib/mldev/mldev_utils_neon.c                  |  18 +
 lib/mldev/mldev_utils_neon_bfloat16.c         |   2 +
 lib/mldev/mldev_utils_scalar.c                |  18 +
 lib/mldev/mldev_utils_scalar_bfloat16.c       |   2 +
 lib/mldev/rte_mldev.c                         |  37 ++
 lib/mldev/rte_mldev_pmd.c                     |   2 +
 lib/mldev/version.map                         |  74 ---
 lib/net/net_crc.h                             |  15 -
 lib/net/rte_arp.c                             |   1 +
 lib/net/rte_ether.c                           |   3 +
 lib/net/rte_net.c                             |   2 +
 lib/net/rte_net_crc.c                         |  29 +-
 lib/net/version.map                           |  23 -
 lib/node/ethdev_ctrl.c                        |   2 +
 lib/node/ip4_lookup.c                         |   1 +
 lib/node/ip4_reassembly.c                     |   1 +
 lib/node/ip4_rewrite.c                        |   1 +
 lib/node/ip6_lookup.c                         |   1 +
 lib/node/ip6_rewrite.c                        |   1 +
 lib/node/udp4_input.c                         |   2 +
 lib/node/version.map                          |  25 -
 lib/pcapng/rte_pcapng.c                       |   7 +
 lib/pcapng/version.map                        |  13 -
 lib/pci/rte_pci.c                             |   3 +
 lib/pci/version.map                           |   9 -
 lib/pdcp/rte_pdcp.c                           |   5 +
 lib/pdcp/version.map                          |  16 -
 lib/pdump/rte_pdump.c                         |   9 +
 lib/pdump/version.map                         |  15 -
 lib/pipeline/rte_pipeline.c                   |  23 +
 lib/pipeline/rte_port_in_action.c             |   8 +
 lib/pipeline/rte_swx_ctl.c                    |  17 +
 lib/pipeline/rte_swx_ipsec.c                  |   7 +
 lib/pipeline/rte_swx_pipeline.c               |  73 +++
 lib/pipeline/rte_table_action.c               |  16 +
 lib/pipeline/version.map                      | 172 ------
 lib/port/rte_port_ethdev.c                    |   3 +
 lib/port/rte_port_eventdev.c                  |   3 +
 lib/port/rte_port_fd.c                        |   3 +
 lib/port/rte_port_frag.c                      |   2 +
 lib/port/rte_port_ras.c                       |   2 +
 lib/port/rte_port_ring.c                      |   6 +
 lib/port/rte_port_sched.c                     |   2 +
 lib/port/rte_port_source_sink.c               |   2 +
 lib/port/rte_port_sym_crypto.c                |   3 +
 lib/port/rte_swx_port_ethdev.c                |   2 +
 lib/port/rte_swx_port_fd.c                    |   2 +
 lib/port/rte_swx_port_ring.c                  |   2 +
 lib/port/rte_swx_port_source_sink.c           |   3 +
 lib/port/version.map                          |  50 --
 lib/power/power_common.c                      |   8 +
 lib/power/rte_power_cpufreq.c                 |  18 +
 lib/power/rte_power_pmd_mgmt.c                |  10 +
 lib/power/rte_power_qos.c                     |   2 +
 lib/power/rte_power_uncore.c                  |  14 +
 lib/power/version.map                         |  71 ---
 lib/rawdev/rte_rawdev.c                       |  30 +
 lib/rawdev/version.map                        |  36 --
 lib/rcu/rte_rcu_qsbr.c                        |  11 +
 lib/rcu/version.map                           |  17 -
 lib/regexdev/rte_regexdev.c                   |  26 +
 lib/regexdev/version.map                      |  40 --
 lib/reorder/rte_reorder.c                     |  11 +
 lib/reorder/version.map                       |  27 -
 lib/rib/rte_rib.c                             |  14 +
 lib/rib/rte_rib6.c                            |  14 +
 lib/rib/version.map                           |  34 --
 lib/ring/rte_ring.c                           |  11 +
 lib/ring/rte_soring.c                         |   3 +
 lib/ring/soring.c                             |  16 +
 lib/ring/version.map                          |  42 --
 lib/sched/rte_approx.c                        |   1 +
 lib/sched/rte_pie.c                           |   2 +
 lib/sched/rte_red.c                           |   6 +
 lib/sched/rte_sched.c                         |  15 +
 lib/sched/version.map                         |  30 -
 lib/security/rte_security.c                   |  20 +
 lib/security/version.map                      |  37 --
 lib/stack/rte_stack.c                         |   3 +
 lib/stack/version.map                         |   9 -
 lib/table/rte_swx_table_em.c                  |   2 +
 lib/table/rte_swx_table_learner.c             |  10 +
 lib/table/rte_swx_table_selector.c            |   6 +
 lib/table/rte_swx_table_wm.c                  |   1 +
 lib/table/rte_table_acl.c                     |   1 +
 lib/table/rte_table_array.c                   |   1 +
 lib/table/rte_table_hash_cuckoo.c             |   1 +
 lib/table/rte_table_hash_ext.c                |   1 +
 lib/table/rte_table_hash_key16.c              |   2 +
 lib/table/rte_table_hash_key32.c              |   2 +
 lib/table/rte_table_hash_key8.c               |   2 +
 lib/table/rte_table_hash_lru.c                |   1 +
 lib/table/rte_table_lpm.c                     |   1 +
 lib/table/rte_table_lpm_ipv6.c                |   1 +
 lib/table/rte_table_stub.c                    |   1 +
 lib/table/version.map                         |  53 --
 lib/telemetry/telemetry.c                     |   3 +
 lib/telemetry/telemetry_data.c                |  17 +
 lib/telemetry/telemetry_legacy.c              |   1 +
 lib/telemetry/version.map                     |  40 --
 lib/timer/rte_timer.c                         |  18 +
 lib/timer/version.map                         |  24 -
 lib/vhost/socket.c                            |  16 +
 lib/vhost/vdpa.c                              |  11 +
 lib/vhost/version.map                         | 111 ----
 lib/vhost/vhost.c                             |  41 ++
 lib/vhost/vhost_crypto.c                      |   6 +
 lib/vhost/vhost_user.c                        |   2 +
 lib/vhost/virtio_net.c                        |   7 +
 523 files changed, 4654 insertions(+), 6507 deletions(-)
 create mode 100755 buildtools/gen-version-map.py
 delete mode 100644 buildtools/map_to_win.py
 create mode 100644 config/rte_export.h
 create mode 100755 devtools/check-symbol-change.py
 delete mode 100755 devtools/check-symbol-change.sh
 delete mode 100755 devtools/check-symbol-maps.sh
 delete mode 100755 devtools/update-abi.sh
 delete mode 100755 devtools/update_version_map_abi.py
 delete mode 100644 drivers/baseband/acc/version.map
 delete mode 100644 drivers/baseband/fpga_5gnr_fec/version.map
 delete mode 100644 drivers/baseband/fpga_lte_fec/version.map
 delete mode 100644 drivers/bus/auxiliary/version.map
 delete mode 100644 drivers/bus/cdx/version.map
 delete mode 100644 drivers/bus/dpaa/version.map
 delete mode 100644 drivers/bus/fslmc/version.map
 delete mode 100644 drivers/bus/ifpga/version.map
 delete mode 100644 drivers/bus/pci/version.map
 delete mode 100644 drivers/bus/platform/version.map
 delete mode 100644 drivers/bus/uacce/version.map
 delete mode 100644 drivers/bus/vdev/version.map
 delete mode 100644 drivers/bus/vmbus/version.map
 delete mode 100644 drivers/common/cnxk/version.map
 delete mode 100644 drivers/common/cpt/version.map
 delete mode 100644 drivers/common/dpaax/version.map
 delete mode 100644 drivers/common/ionic/version.map
 delete mode 100644 drivers/common/mlx5/version.map
 delete mode 100644 drivers/common/mvep/version.map
 delete mode 100644 drivers/common/nfp/version.map
 delete mode 100644 drivers/common/nitrox/version.map
 delete mode 100644 drivers/common/octeontx/version.map
 delete mode 100644 drivers/common/sfc_efx/version.map
 delete mode 100644 drivers/crypto/cnxk/version.map
 delete mode 100644 drivers/crypto/dpaa2_sec/version.map
 delete mode 100644 drivers/crypto/dpaa_sec/version.map
 delete mode 100644 drivers/crypto/octeontx/version.map
 delete mode 100644 drivers/crypto/scheduler/version.map
 delete mode 100644 drivers/dma/cnxk/version.map
 delete mode 100644 drivers/event/cnxk/version.map
 delete mode 100644 drivers/event/dlb2/version.map
 delete mode 100644 drivers/mempool/cnxk/version.map
 delete mode 100644 drivers/mempool/dpaa/version.map
 delete mode 100644 drivers/mempool/dpaa2/version.map
 delete mode 100644 drivers/net/atlantic/version.map
 delete mode 100644 drivers/net/bnxt/version.map
 delete mode 100644 drivers/net/bonding/version.map
 delete mode 100644 drivers/net/cnxk/version.map
 delete mode 100644 drivers/net/dpaa/version.map
 delete mode 100644 drivers/net/dpaa2/version.map
 delete mode 100644 drivers/net/intel/i40e/version.map
 delete mode 100644 drivers/net/intel/iavf/version.map
 delete mode 100644 drivers/net/intel/ice/version.map
 delete mode 100644 drivers/net/intel/idpf/version.map
 delete mode 100644 drivers/net/intel/ipn3ke/version.map
 delete mode 100644 drivers/net/intel/ixgbe/version.map
 delete mode 100644 drivers/net/mlx5/version.map
 delete mode 100644 drivers/net/octeontx/version.map
 delete mode 100644 drivers/net/ring/version.map
 delete mode 100644 drivers/net/softnic/version.map
 delete mode 100644 drivers/net/vhost/version.map
 delete mode 100644 drivers/power/kvm_vm/version.map
 delete mode 100644 drivers/raw/cnxk_rvu_lf/version.map
 delete mode 100644 drivers/raw/ifpga/version.map
 delete mode 100644 drivers/version.map
 delete mode 100644 lib/acl/version.map
 delete mode 100644 lib/argparse/version.map
 delete mode 100644 lib/bbdev/version.map
 delete mode 100644 lib/bitratestats/version.map
 delete mode 100644 lib/bpf/version.map
 delete mode 100644 lib/cfgfile/version.map
 delete mode 100644 lib/cmdline/version.map
 delete mode 100644 lib/compressdev/version.map
 delete mode 100644 lib/cryptodev/version.map
 delete mode 100644 lib/dispatcher/version.map
 delete mode 100644 lib/distributor/version.map
 delete mode 100644 lib/dmadev/version.map
 delete mode 100644 lib/eal/version.map
 delete mode 100644 lib/efd/version.map
 delete mode 100644 lib/ethdev/version.map
 delete mode 100644 lib/eventdev/version.map
 delete mode 100644 lib/fib/version.map
 delete mode 100644 lib/gpudev/version.map
 delete mode 100644 lib/graph/version.map
 delete mode 100644 lib/gro/version.map
 delete mode 100644 lib/gso/version.map
 delete mode 100644 lib/hash/version.map
 delete mode 100644 lib/ip_frag/version.map
 delete mode 100644 lib/ipsec/version.map
 delete mode 100644 lib/jobstats/version.map
 delete mode 100644 lib/kvargs/version.map
 delete mode 100644 lib/latencystats/version.map
 delete mode 100644 lib/log/version.map
 delete mode 100644 lib/lpm/version.map
 delete mode 100644 lib/mbuf/version.map
 delete mode 100644 lib/member/version.map
 delete mode 100644 lib/mempool/version.map
 delete mode 100644 lib/meter/version.map
 delete mode 100644 lib/metrics/version.map
 delete mode 100644 lib/mldev/version.map
 delete mode 100644 lib/net/version.map
 delete mode 100644 lib/node/version.map
 delete mode 100644 lib/pcapng/version.map
 delete mode 100644 lib/pci/version.map
 delete mode 100644 lib/pdcp/version.map
 delete mode 100644 lib/pdump/version.map
 delete mode 100644 lib/pipeline/version.map
 delete mode 100644 lib/port/version.map
 delete mode 100644 lib/power/version.map
 delete mode 100644 lib/rawdev/version.map
 delete mode 100644 lib/rcu/version.map
 delete mode 100644 lib/regexdev/version.map
 delete mode 100644 lib/reorder/version.map
 delete mode 100644 lib/rib/version.map
 delete mode 100644 lib/ring/version.map
 delete mode 100644 lib/sched/version.map
 delete mode 100644 lib/security/version.map
 delete mode 100644 lib/stack/version.map
 delete mode 100644 lib/table/version.map
 delete mode 100644 lib/telemetry/version.map
 delete mode 100644 lib/timer/version.map
 delete mode 100644 lib/vhost/version.map

-- 
2.48.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] ethdev: fix get_reg_info
  @ 2025-03-07  9:33  3%   ` fengchengwen
  0 siblings, 0 replies; 200+ results
From: fengchengwen @ 2025-03-07  9:33 UTC (permalink / raw)
  To: Stephen Hemminger, Thierry Herbelot; +Cc: dev, Thomas Monjalon, stable

On 2025/2/20 2:45, Stephen Hemminger wrote:
> On Tue, 18 Feb 2025 12:58:28 +0100
> Thierry Herbelot <thierry.herbelot@6wind.com> wrote:
> 
>> 'width' and 'offset' are input parameters when dumping the register
>> info of an Ethernet device. They should be copied in the new request
>> before calling the device callback function.
>>
>> Fixes: 083db2ed9e9 ('ethdev: add report of register names and filter')
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
> 
> Why does the ethdev code create an on stack temporary variable.
> Looks like it only wants to make sure that names element is NULL.

It is mainly for ABI compatibility.

The original solution was to add an extended API (rte_eth_dev_get_reg_info_ext) and deprecate the original API (rte_eth_dev_get_reg_info).

> 
> Really should be one function and when extended fields were added
> should have used API versioning.
> Probably too late now, although rte_eth_dev_get_reg_info_ext()
> is an experimental API.




^ permalink raw reply	[relevance 3%]

* [RFC v2 2/2] build: generate symbol maps
  2025-03-06 12:50  6% ` [RFC v2 1/2] " David Marchand
@ 2025-03-06 12:50  7%   ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-06 12:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue, Jasvinder Singh

Rather than maintaining a file in parallel with the code, symbols to be
exported can be marked with a token.
The build framework then generates map files in a format that satisfies
the GNU linker.

Apply those macros to lib/net as an example.

Documentation is missing.
Converting from .map to a Windows export file is not done.
Checks on map files are left in place, though they could be removed once
the whole tree is converted.
Experimental and internal symbol types are not handled.
Probably something else is missing, but this patch is still at RFC level.

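The generation step described above can be sketched as follows. This is a simplified model of what buildtools/gen-version-map.py does for plain exports: scan sources for RTE_EXPORT_SYMBOL() marks and emit a GNU linker version script. The real script also handles RTE_VERSION_SYMBOL/RTE_DEFAULT_SYMBOL nodes and reads the ABI version from the ABI_VERSION file; here the version is passed in directly.

```python
import re

# Collect symbols marked with RTE_EXPORT_SYMBOL() and emit a GNU linker
# version script. Simplified: versioned-symbol nodes are not handled.
EXPORT_RE = re.compile(r"^RTE_EXPORT_SYMBOL\(([^,)]+)\)")

def gen_version_map(abi, sources):
    symbols = []
    for text in sources:
        for line in text.splitlines():
            m = EXPORT_RE.match(line)
            if m:
                symbols.append(m.group(1))
    lines = [f"DPDK_{abi} {{", "\tglobal:", ""]
    lines += [f"\t{sym};" for sym in sorted(symbols)]
    lines += ["", "\tlocal: *;", "};"]
    return "\n".join(lines) + "\n"

source = ("RTE_EXPORT_SYMBOL(rte_net_get_ptype)\n"
          "RTE_EXPORT_SYMBOL(rte_eth_random_addr)\n")
print(gen_version_map("25", [source]))
```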
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 buildtools/gen-version-map.py | 65 +++++++++++++++++++++++++++++++++++
 buildtools/meson.build        |  1 +
 drivers/meson.build           |  2 --
 lib/meson.build               | 19 ++++++++--
 lib/net/rte_arp.c             |  1 +
 lib/net/rte_ether.c           |  3 ++
 lib/net/rte_net.c             |  2 ++
 lib/net/rte_net_crc.c         |  1 +
 lib/net/version.map           | 23 -------------
 meson.build                   |  3 +-
 10 files changed, 91 insertions(+), 29 deletions(-)
 create mode 100755 buildtools/gen-version-map.py
 delete mode 100644 lib/net/version.map

diff --git a/buildtools/gen-version-map.py b/buildtools/gen-version-map.py
new file mode 100755
index 0000000000..2b03f328ea
--- /dev/null
+++ b/buildtools/gen-version-map.py
@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2024 Red Hat, Inc.
+
+"""Generate a version map file used by GNU linker."""
+
+import re
+import sys
+
+# From meson.build
+sym_export_regexp = re.compile(r"^RTE_EXPORT_SYMBOL\(([^,]+)\)$")
+# From rte_function_versioning.h
+sym_ver_regexp = re.compile(r"^RTE_VERSION_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+sym_default_regexp = re.compile(r"^RTE_DEFAULT_SYMBOL\(([^,]+), [^,]+, ([^,]+),")
+
+with open("../ABI_VERSION") as f:
+    abi = re.match("([0-9]+).[0-9]", f.readline()).group(1)
+
+symbols = {}
+
+for file in sys.argv[2:]:
+    with open(file, encoding="utf-8") as f:
+        for ln in f.readlines():
+            node = None
+            symbol = None
+            if sym_export_regexp.match(ln):
+                symbol = sym_export_regexp.match(ln).group(1)
+            elif sym_ver_regexp.match(ln):
+                node = sym_ver_regexp.match(ln).group(1)
+                symbol = sym_ver_regexp.match(ln).group(2)
+            elif sym_default_regexp.match(ln):
+                node = sym_default_regexp.match(ln).group(1)
+                symbol = sym_default_regexp.match(ln).group(2)
+
+            if not symbol:
+                continue
+
+            if not node:
+                node = abi
+            if node not in symbols:
+                symbols[node] = []
+            symbols[node].append(symbol)
+
+with open(sys.argv[1], "w") as outfile:
+    local_token = False
+    if abi in symbols:
+        outfile.writelines(f"DPDK_{abi} {{\n\tglobal:\n\n")
+        for symbol in sorted(symbols[abi]):
+            outfile.writelines(f"\t{symbol};\n")
+        outfile.writelines("\n")
+        if not local_token:
+            outfile.writelines("\tlocal: *;\n")
+            local_token = True
+        outfile.writelines("};\n")
+        del symbols[abi]
+    for key in sorted(symbols.keys()):
+        outfile.writelines(f"DPDK_{key} {{\n\tglobal:\n\n")
+        for symbol in sorted(symbols[key]):
+            outfile.writelines(f"\t{symbol};\n")
+        outfile.writelines("\n")
+        if not local_token:
+            outfile.writelines("\tlocal: *;\n")
+            local_token = True
+        outfile.writelines(f"}} DPDK_{abi};\n")
+        del symbols[key]
diff --git a/buildtools/meson.build b/buildtools/meson.build
index 4e2c1217a2..b745e9afa4 100644
--- a/buildtools/meson.build
+++ b/buildtools/meson.build
@@ -16,6 +16,7 @@ else
     py3 = ['meson', 'runpython']
 endif
 echo = py3 + ['-c', 'import sys; print(*sys.argv[1:])']
+gen_version_map = py3 + files('gen-version-map.py')
 list_dir_globs = py3 + files('list-dir-globs.py')
 map_to_win_cmd = py3 + files('map_to_win.py')
 sphinx_wrapper = py3 + files('call-sphinx-build.py')
diff --git a/drivers/meson.build b/drivers/meson.build
index 05391a575d..d5fe3749c4 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -5,8 +5,6 @@ if is_ms_compiler
     subdir_done()
 endif
 
-fs = import('fs')
-
 # Defines the order of dependencies evaluation
 subdirs = [
         'common',
diff --git a/lib/meson.build b/lib/meson.build
index ce92cb5537..4db1864241 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -110,6 +110,7 @@ endif
 default_cflags = machine_args
 default_cflags += ['-DALLOW_EXPERIMENTAL_API']
 default_cflags += ['-DALLOW_INTERNAL_API']
+default_cflags += ['-DRTE_EXPORT_SYMBOL(a)=']
 
 if cc.has_argument('-Wno-format-truncation')
     default_cflags += '-Wno-format-truncation'
@@ -254,6 +255,9 @@ foreach l:libraries
             include_directories: includes,
             dependencies: static_deps)
 
+    version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
+    lk_deps = []
+
     if not use_function_versioning or is_windows
         # use pre-build objects to build shared lib
         sources = []
@@ -262,10 +266,19 @@ foreach l:libraries
         # for compat we need to rebuild with
         # RTE_BUILD_SHARED_LIB defined
         cflags += '-DRTE_BUILD_SHARED_LIB'
-    endif
 
-    version_map = '@0@/@1@/version.map'.format(meson.current_source_dir(), l)
-    lk_deps = [version_map]
+        # POC: generate version.map if absent.
+        if not fs.is_file(version_map)
+            map_file = custom_target(libname + '_map',
+                    command: [gen_version_map, '@OUTPUT@', '@INPUT@'],
+                    input: sources,
+                    output: '@0@_version.map'.format(libname))
+            version_map = map_file.full_path()
+            lk_deps += [map_file]
+        else
+            lk_deps += [version_map]
+        endif
+    endif
 
     if is_ms_linker
         def_file = custom_target(libname + '_def',
diff --git a/lib/net/rte_arp.c b/lib/net/rte_arp.c
index 22af519586..cd0f49a7a9 100644
--- a/lib/net/rte_arp.c
+++ b/lib/net/rte_arp.c
@@ -47,3 +47,4 @@ rte_net_make_rarp_packet(struct rte_mempool *mpool,
 
 	return mbuf;
 }
+RTE_EXPORT_SYMBOL(rte_net_make_rarp_packet)
diff --git a/lib/net/rte_ether.c b/lib/net/rte_ether.c
index f59c20289d..9d02db1676 100644
--- a/lib/net/rte_ether.c
+++ b/lib/net/rte_ether.c
@@ -17,6 +17,7 @@ rte_eth_random_addr(uint8_t *addr)
 	addr[0] &= (uint8_t)~RTE_ETHER_GROUP_ADDR;	/* clear multicast bit */
 	addr[0] |= RTE_ETHER_LOCAL_ADMIN_ADDR;	/* set local assignment bit */
 }
+RTE_EXPORT_SYMBOL(rte_eth_random_addr)
 
 void
 rte_ether_format_addr(char *buf, uint16_t size,
@@ -25,6 +26,7 @@ rte_ether_format_addr(char *buf, uint16_t size,
 	snprintf(buf, size, RTE_ETHER_ADDR_PRT_FMT,
 		RTE_ETHER_ADDR_BYTES(eth_addr));
 }
+RTE_EXPORT_SYMBOL(rte_ether_format_addr)
 
 static int8_t get_xdigit(char ch)
 {
@@ -153,3 +155,4 @@ rte_ether_unformat_addr(const char *s, struct rte_ether_addr *ea)
 	rte_errno = EINVAL;
 	return -1;
 }
+RTE_EXPORT_SYMBOL(rte_ether_unformat_addr)
diff --git a/lib/net/rte_net.c b/lib/net/rte_net.c
index 0c32e78a13..9a1bc3fb7d 100644
--- a/lib/net/rte_net.c
+++ b/lib/net/rte_net.c
@@ -306,6 +306,7 @@ rte_net_skip_ip6_ext(uint16_t proto, const struct rte_mbuf *m, uint32_t *off,
 	}
 	return -1;
 }
+RTE_EXPORT_SYMBOL(rte_net_skip_ip6_ext)
 
 /* parse mbuf data to get packet type */
 uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
@@ -601,3 +602,4 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
 
 	return pkt_type;
 }
+RTE_EXPORT_SYMBOL(rte_net_get_ptype)
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index dd93d43c2e..03be816509 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -417,6 +417,7 @@ void rte_net_crc_free(struct rte_net_crc *crc)
 {
 	rte_free(crc);
 }
+RTE_EXPORT_SYMBOL(rte_net_crc_free)
 
 RTE_VERSION_SYMBOL(25, uint32_t, rte_net_crc_calc, (const void *data, uint32_t data_len,
 	enum rte_net_crc_type type)
diff --git a/lib/net/version.map b/lib/net/version.map
deleted file mode 100644
index f4dd673fa3..0000000000
--- a/lib/net/version.map
+++ /dev/null
@@ -1,23 +0,0 @@
-DPDK_25 {
-	global:
-
-	rte_eth_random_addr;
-	rte_ether_format_addr;
-	rte_ether_unformat_addr;
-	rte_net_crc_calc;
-	rte_net_crc_free;
-	rte_net_crc_set_alg;
-	rte_net_get_ptype;
-	rte_net_make_rarp_packet;
-	rte_net_skip_ip6_ext;
-
-	local: *;
-};
-
-DPDK_26 {
-	global:
-
-	rte_net_crc_calc;
-	rte_net_crc_set_alg;
-
-} DPDK_25;
diff --git a/meson.build b/meson.build
index 8436d1dff8..dfb4cb3aee 100644
--- a/meson.build
+++ b/meson.build
@@ -13,10 +13,11 @@ project('DPDK', 'c',
         meson_version: '>= 0.57'
 )
 
+fs = import('fs')
+
 # check for developer mode
 developer_mode = false
 if get_option('developer_mode').auto()
-    fs = import('fs')
     developer_mode = fs.exists('.git')
 else
     developer_mode = get_option('developer_mode').enabled()
-- 
2.48.1


^ permalink raw reply	[relevance 7%]

* [RFC v2 1/2] eal: add new function versioning macros
  2025-03-05 21:23  6% [RFC] eal: add new function versioning macros David Marchand
@ 2025-03-06 12:50  6% ` David Marchand
  2025-03-06 12:50  7%   ` [RFC v2 2/2] build: generate symbol maps David Marchand
  2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
  2025-03-17 15:42  3% ` [RFC v4 " David Marchand
  2 siblings, 1 reply; 200+ results
From: David Marchand @ 2025-03-06 12:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, andremue, Tyler Retzlaff, Jasvinder Singh

For versioning symbols:
- MSVC uses pragmas on the symbol,
- GNU linker uses special asm directives,

To accommodate both the GNU and MSVC linkers, introduce new macros for
exporting and versioning symbols that surround the whole function.

This has the advantage of hiding all the ugly details in the macros.
Now versioning a symbol is just a call to a single macro:
- RTE_VERSION_SYMBOL (resp. RTE_VERSION_EXPERIMENTAL_SYMBOL), for
  keeping an old implementation under a versioned function (resp.
  experimental function),
- RTE_DEFAULT_SYMBOL, for declaring the new default versioned function,
  and handling the static link special case, instead of
  BIND_DEFAULT_SYMBOL + MAP_STATIC_SYMBOL,

Documentation has been updated, though it needs some polishing.
The experimental macro has not been tested.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since RFC v1:
- renamed and prefixed macros,
- reindented in prevision of second patch,

---
 doc/guides/contributing/abi_versioning.rst | 130 ++++-----------------
 lib/eal/include/rte_function_versioning.h  |  27 +++++
 lib/net/rte_net_crc.c                      |  30 ++---
 3 files changed, 57 insertions(+), 130 deletions(-)

diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 7afd1c1886..c4baf6433a 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -277,86 +277,49 @@ list of exported symbols when DPDK is compiled as a shared library.
 
 Next, we need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
-we need to update the function so that it is uniquely named, and not in conflict
-with the public symbol name
+we wrap the function with ``RTE_VERSION_SYMBOL``, passing the current ABI version,
+the function return type, and the function name, then the full implementation of the
+function.
 
 .. code-block:: c
 
  -struct rte_acl_ctx *
  -rte_acl_create(const struct rte_acl_param *param)
- +struct rte_acl_ctx * __vsym
- +rte_acl_create_v21(const struct rte_acl_param *param)
+ +RTE_VERSION_SYMBOL(21, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param)
  {
         size_t sz;
         struct rte_acl_ctx *ctx;
         ...
-
-Note that the base name of the symbol was kept intact, as this is conducive to
-the macros used for versioning symbols and we have annotated the function as
-``__vsym``, an implementation of a versioned symbol . That is our next step,
-mapping this new symbol name to the initial symbol name at version node 21.
-Immediately after the function, we add the VERSION_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ -}
+ +})
 
 Remembering to also add the rte_function_versioning.h header to the requisite c
 file where these changes are being made. The macro instructs the linker to
 create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
-in older builds, but now points to the above newly named function. We have now
-mapped the original rte_acl_create symbol to the original function (but with a
-new name).
+in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
+We have now mapped the original rte_acl_create symbol to the original function
+(but with a new name).
 
 Please see the section :ref:`Enabling versioning macros
 <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
-Next, we need to create the new ``v22`` version of the symbol. We create a new
-function name, with the ``v22`` suffix, and implement it appropriately.
+Next, we need to create the new version of the symbol. We create a new
+function name and implement it appropriately, then wrap it in a call to ``RTE_DEFAULT_SYMBOL``.
 
 .. code-block:: c
 
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
+   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create, (const struct rte_acl_param *param,
+        int debug);
    {
         struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
 
         ctx->debug = debug;
 
         return ctx;
-   }
-
-This code serves as our new API call. Its the same as our old call, but adds the
-new parameter in place. Next we need to map this function to the new default
-symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
-we add the BIND_DEFAULT_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+   })
 
 The macro instructs the linker to create the new default symbol
-``rte_acl_create@DPDK_22``, which points to the above newly named function.
-
-We finally modify the prototype of the call in the public header file,
-such that it contains both versions of the symbol and the public API.
-
-.. code-block:: c
-
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v21(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
-
+``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
+(declared by the macro).
 
 And that's it, on the next shared library rebuild, there will be two versions of
 rte_acl_create, an old DPDK_21 version, used by previously built applications,
@@ -365,43 +328,10 @@ and a new DPDK_22 version, used by future built applications.
 .. note::
 
    **Before you leave**, please take care reviewing the sections on
-   :ref:`mapping static symbols <mapping_static_symbols>`,
    :ref:`enabling versioning macros <enabling_versioning_macros>`,
    and :ref:`ABI deprecation <abi_deprecation>`.
 
 
-.. _mapping_static_symbols:
-
-Mapping static symbols
-______________________
-
-Now we've taken what was a public symbol, and duplicated it into two uniquely
-and differently named symbols. We've then mapped each of those back to the
-public symbol ``rte_acl_create`` with different version tags. This only applies
-to dynamic linking, as static linking has no notion of versioning. That leaves
-this code in a position of no longer having a symbol simply named
-``rte_acl_create`` and a static build will fail on that missing symbol.
-
-To correct this, we can simply map a function of our choosing back to the public
-symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
-assumption is that the most recent version of the symbol is the one you want to
-map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
-defined, we add this
-
-
-.. code-block:: c
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
-   {
-        ...
-   }
-   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
-
-That tells the compiler that, when building a static library, any calls to the
-symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
-
-
 .. _enabling_versioning_macros:
 
 Enabling versioning macros
@@ -519,26 +449,17 @@ and ``DPDK_22`` version nodes.
     * Create an acl context object for apps to
     * manipulate
     */
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   RTE_DEFAULT_SYMBOL(22, struct rte_acl_ctx *, rte_acl_create,
+        (const struct rte_acl_param *param)
    {
    ...
-   }
+   })
 
-   __rte_experimental
-   struct rte_acl_ctx *
-   rte_acl_create_e(const struct rte_acl_param *param)
-   {
-      return rte_acl_create(param);
-   }
-   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
-
-   struct rte_acl_ctx *
-   rte_acl_create_v22(const struct rte_acl_param *param)
+   RTE_VERSION_EXPERIMENTAL_SYMBOL(struct rte_acl_ctx *, rte_acl_create,
+        (const struct rte_acl_param *param)
    {
       return rte_acl_create(param);
-   }
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+   })
 
 In the map file, we map the symbol to both the ``EXPERIMENTAL``
 and ``DPDK_22`` version nodes.
@@ -564,13 +485,6 @@ and ``DPDK_22`` version nodes.
         rte_acl_create;
    };
 
-.. note::
-
-   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
-   when aliasing to experimental you will also need to take care of
-   :ref:`mapping static symbols <mapping_static_symbols>`.
-
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
index eb6dd2bc17..259b960ef5 100644
--- a/lib/eal/include/rte_function_versioning.h
+++ b/lib/eal/include/rte_function_versioning.h
@@ -96,4 +96,31 @@
  */
 #endif
 
+#ifdef RTE_BUILD_SHARED_LIB
+
+#define RTE_VERSION_SYMBOL(ver, type, name, ...) \
+__rte_used type name ## _v ## ver __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver));
+
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, ...) \
+__rte_used type name ## _exp __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL")
+
+#define RTE_DEFAULT_SYMBOL(ver, type, name, ...) \
+__rte_used type name ## _v ## ver __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver));
+
+#else /* !RTE_BUILD_SHARED_LIB */
+
+#define RTE_VERSION_SYMBOL(ver, type, name, ...) \
+type name ## _v ## ver __VA_ARGS__
+
+#define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, ...) \
+type name ## _exp __VA_ARGS__
+
+#define RTE_DEFAULT_SYMBOL(ver, type, name, ...) \
+type name __VA_ARGS__
+
+#endif /* RTE_BUILD_SHARED_LIB */
+
 #endif /* _RTE_FUNCTION_VERSIONING_H_ */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 2fb3eec231..dd93d43c2e 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -345,8 +345,7 @@ handlers_init(enum rte_net_crc_alg alg)
 
 /* Public API */
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
+RTE_VERSION_SYMBOL(25, void, rte_net_crc_set_alg, (enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -372,10 +371,9 @@ rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 
 	if (handlers == NULL)
 		handlers = handlers_scalar;
-}
-VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+})
 
-struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+RTE_DEFAULT_SYMBOL(26, struct rte_net_crc *, rte_net_crc_set_alg, (enum rte_net_crc_alg alg,
 	enum rte_net_crc_type type)
 {
 	uint16_t max_simd_bitwidth;
@@ -413,20 +411,14 @@ struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
 		break;
 	}
 	return crc;
-}
-BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
-MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
-	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
-	rte_net_crc_set_alg_v26);
+})
 
 void rte_net_crc_free(struct rte_net_crc *crc)
 {
 	rte_free(crc);
 }
 
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len,
+RTE_VERSION_SYMBOL(25, uint32_t, rte_net_crc_calc, (const void *data, uint32_t data_len,
 	enum rte_net_crc_type type)
 {
 	uint32_t ret;
@@ -436,19 +428,13 @@ rte_net_crc_calc_v25(const void *data,
 	ret = f_handle(data, data_len);
 
 	return ret;
-}
-VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+})
 
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+RTE_DEFAULT_SYMBOL(26, uint32_t, rte_net_crc_calc, (const struct rte_net_crc *ctx,
 	const void *data, const uint32_t data_len)
 {
 	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
-}
-BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
-MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len),
-	rte_net_crc_calc_v26);
+})
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
-- 
2.48.1


^ permalink raw reply	[relevance 6%]

* [RFC] eal: add new function versioning macros
@ 2025-03-05 21:23  6% David Marchand
  2025-03-06 12:50  6% ` [RFC v2 1/2] " David Marchand
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: David Marchand @ 2025-03-05 21:23 UTC (permalink / raw)
  To: dev; +Cc: thomas, andremue, Tyler Retzlaff, Jasvinder Singh

For versioning symbols:
- MSVC uses pragmas on the function symbol,
- the GNU linker uses special asm directives.

To accommodate both GNU linker and MSVC linker, introduce new macros for
versioning functions that will surround the whole function.

This has the advantage of hiding all the ugly details in the macros.
Now versioning a function is just a call to a single macro:
- VERSION_FUNCTION (resp. VERSION_FUNCTION_EXPERIMENTAL), for keeping an
  old implementation under a versioned function,
- DEFAULT_FUNCTION, for declaring the new default versioned function,
  and handling the static link special case, instead of
  BIND_DEFAULT_SYMBOL + MAP_STATIC_SYMBOL,

Documentation has been updated though it needs some polishing.
The experimental macro has not been tested.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 doc/guides/contributing/abi_versioning.rst | 133 ++++-----------------
 lib/eal/include/rte_function_versioning.h  |  27 +++++
 lib/net/rte_net_crc.c                      |  41 +++----
 3 files changed, 69 insertions(+), 132 deletions(-)

diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 7afd1c1886..b83383fd0b 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -277,86 +277,52 @@ list of exported symbols when DPDK is compiled as a shared library.
 
 Next, we need to specify in the code which function maps to the rte_acl_create
 symbol at which versions.  First, at the site of the initial symbol definition,
-we need to update the function so that it is uniquely named, and not in conflict
-with the public symbol name
+we wrap the function with ``VERSION_FUNCTION``, passing the current ABI version,
+the function return type, and the function name, then the full implementation of the
+function.
 
 .. code-block:: c
 
  -struct rte_acl_ctx *
  -rte_acl_create(const struct rte_acl_param *param)
- +struct rte_acl_ctx * __vsym
- +rte_acl_create_v21(const struct rte_acl_param *param)
+ +VERSION_FUNCTION(21,
+ +struct rte_acl_ctx *,
+ +rte_acl_create, (const struct rte_acl_param *param)
  {
         size_t sz;
         struct rte_acl_ctx *ctx;
         ...
-
-Note that the base name of the symbol was kept intact, as this is conducive to
-the macros used for versioning symbols and we have annotated the function as
-``__vsym``, an implementation of a versioned symbol . That is our next step,
-mapping this new symbol name to the initial symbol name at version node 21.
-Immediately after the function, we add the VERSION_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   VERSION_SYMBOL(rte_acl_create, _v21, 21);
+ -}
+ +})
 
 Remembering to also add the rte_function_versioning.h header to the requisite c
 file where these changes are being made. The macro instructs the linker to
 create a new symbol ``rte_acl_create@DPDK_21``, which matches the symbol created
-in older builds, but now points to the above newly named function. We have now
-mapped the original rte_acl_create symbol to the original function (but with a
-new name).
+in older builds, but now points to the above newly named function ``rte_acl_create_v21``.
+We have now mapped the original rte_acl_create symbol to the original function
+(but with a new name).
 
 Please see the section :ref:`Enabling versioning macros
 <enabling_versioning_macros>` to enable this macro in the meson/ninja build.
-Next, we need to create the new ``v22`` version of the symbol. We create a new
-function name, with the ``v22`` suffix, and implement it appropriately.
+Next, we need to create the new version of the symbol. We create a new
+function name and implement it appropriately, then wrap it in a call to ``DEFAULT_FUNCTION``.
 
 .. code-block:: c
 
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
+   DEFAULT_FUNCTION(22,
+   struct rte_acl_ctx *,
+   rte_acl_create, (const struct rte_acl_param *param, int debug)
    {
         struct rte_acl_ctx *ctx = rte_acl_create_v21(param);
 
         ctx->debug = debug;
 
         return ctx;
-   }
-
-This code serves as our new API call. Its the same as our old call, but adds the
-new parameter in place. Next we need to map this function to the new default
-symbol ``rte_acl_create@DPDK_22``. To do this, immediately after the function,
-we add the BIND_DEFAULT_SYMBOL macro.
-
-.. code-block:: c
-
-   #include <rte_function_versioning.h>
-
-   ...
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+   })
 
 The macro instructs the linker to create the new default symbol
-``rte_acl_create@DPDK_22``, which points to the above newly named function.
-
-We finally modify the prototype of the call in the public header file,
-such that it contains both versions of the symbol and the public API.
-
-.. code-block:: c
-
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v21(const struct rte_acl_param *param);
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug);
-
+``rte_acl_create@DPDK_22``, which points to the function named ``rte_acl_create_v22``
+(declared by the macro).
 
 And that's it, on the next shared library rebuild, there will be two versions of
 rte_acl_create, an old DPDK_21 version, used by previously built applications,
@@ -365,43 +331,10 @@ and a new DPDK_22 version, used by future built applications.
 .. note::
 
    **Before you leave**, please take care reviewing the sections on
-   :ref:`mapping static symbols <mapping_static_symbols>`,
    :ref:`enabling versioning macros <enabling_versioning_macros>`,
    and :ref:`ABI deprecation <abi_deprecation>`.
 
 
-.. _mapping_static_symbols:
-
-Mapping static symbols
-______________________
-
-Now we've taken what was a public symbol, and duplicated it into two uniquely
-and differently named symbols. We've then mapped each of those back to the
-public symbol ``rte_acl_create`` with different version tags. This only applies
-to dynamic linking, as static linking has no notion of versioning. That leaves
-this code in a position of no longer having a symbol simply named
-``rte_acl_create`` and a static build will fail on that missing symbol.
-
-To correct this, we can simply map a function of our choosing back to the public
-symbol in the static build with the ``MAP_STATIC_SYMBOL`` macro.  Generally the
-assumption is that the most recent version of the symbol is the one you want to
-map.  So, back in the C file where, immediately after ``rte_acl_create_v22`` is
-defined, we add this
-
-
-.. code-block:: c
-
-   struct rte_acl_ctx * __vsym
-   rte_acl_create_v22(const struct rte_acl_param *param, int debug)
-   {
-        ...
-   }
-   MAP_STATIC_SYMBOL(struct rte_acl_ctx *rte_acl_create(const struct rte_acl_param *param, int debug), rte_acl_create_v22);
-
-That tells the compiler that, when building a static library, any calls to the
-symbol ``rte_acl_create`` should be linked to ``rte_acl_create_v22``
-
-
 .. _enabling_versioning_macros:
 
 Enabling versioning macros
@@ -519,26 +452,19 @@ and ``DPDK_22`` version nodes.
     * Create an acl context object for apps to
     * manipulate
     */
-   struct rte_acl_ctx *
-   rte_acl_create(const struct rte_acl_param *param)
+   DEFAULT_FUNCTION(22,
+   struct rte_acl_ctx *,
+   rte_acl_create, (const struct rte_acl_param *param)
    {
    ...
-   }
-
-   __rte_experimental
-   struct rte_acl_ctx *
-   rte_acl_create_e(const struct rte_acl_param *param)
-   {
-      return rte_acl_create(param);
-   }
-   VERSION_SYMBOL_EXPERIMENTAL(rte_acl_create, _e);
+   })
 
+   VERSION_FUNCTION_EXPERIMENTAL(
    struct rte_acl_ctx *
-   rte_acl_create_v22(const struct rte_acl_param *param)
+   rte_acl_create, (const struct rte_acl_param *param)
    {
       return rte_acl_create(param);
-   }
-   BIND_DEFAULT_SYMBOL(rte_acl_create, _v22, 22);
+   })
 
 In the map file, we map the symbol to both the ``EXPERIMENTAL``
 and ``DPDK_22`` version nodes.
@@ -564,13 +490,6 @@ and ``DPDK_22`` version nodes.
         rte_acl_create;
    };
 
-.. note::
-
-   Please note, similar to :ref:`symbol versioning <example_abi_macro_usage>`,
-   when aliasing to experimental you will also need to take care of
-   :ref:`mapping static symbols <mapping_static_symbols>`.
-
-
 .. _abi_deprecation:
 
 Deprecating part of a public API
diff --git a/lib/eal/include/rte_function_versioning.h b/lib/eal/include/rte_function_versioning.h
index eb6dd2bc17..7a33a45928 100644
--- a/lib/eal/include/rte_function_versioning.h
+++ b/lib/eal/include/rte_function_versioning.h
@@ -96,4 +96,31 @@
  */
 #endif
 
+#ifdef RTE_BUILD_SHARED_LIB
+
+#define VERSION_FUNCTION(ver, type, name, ...) \
+__rte_used type name ## _v ## ver __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@DPDK_" RTE_STR(ver));
+
+#define VERSION_FUNCTION_EXPERIMENTAL(type, name, ...) \
+__rte_used type name ## _exp __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL")
+
+#define DEFAULT_FUNCTION(ver, type, name, ...) \
+__rte_used type name ## _v ## ver __VA_ARGS__ \
+__asm__(".symver " RTE_STR(name) "_v" RTE_STR(ver) ", " RTE_STR(name) "@@DPDK_" RTE_STR(ver));
+
+#else /* !RTE_BUILD_SHARED_LIB */
+
+#define VERSION_FUNCTION(ver, type, name, ...) \
+type name ## _v ## ver __VA_ARGS__
+
+#define VERSION_FUNCTION_EXPERIMENTAL(type, name, ...) \
+type name ## _exp __VA_ARGS__
+
+#define DEFAULT_FUNCTION(ver, type, name, ...) \
+type name __VA_ARGS__
+
+#endif /* RTE_BUILD_SHARED_LIB */
+
 #endif /* _RTE_FUNCTION_VERSIONING_H_ */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 2fb3eec231..a1c17e0735 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -345,8 +345,9 @@ handlers_init(enum rte_net_crc_alg alg)
 
 /* Public API */
 
-void
-rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
+VERSION_FUNCTION(25,
+void,
+rte_net_crc_set_alg, (enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -372,11 +373,11 @@ rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 
 	if (handlers == NULL)
 		handlers = handlers_scalar;
-}
-VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+})
 
-struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
-	enum rte_net_crc_type type)
+DEFAULT_FUNCTION(26,
+struct rte_net_crc *,
+rte_net_crc_set_alg, (enum rte_net_crc_alg alg, enum rte_net_crc_type type)
 {
 	uint16_t max_simd_bitwidth;
 	struct rte_net_crc *crc;
@@ -413,21 +414,16 @@ struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
 		break;
 	}
 	return crc;
-}
-BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
-MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
-	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
-	rte_net_crc_set_alg_v26);
+})
 
 void rte_net_crc_free(struct rte_net_crc *crc)
 {
 	rte_free(crc);
 }
 
-uint32_t
-rte_net_crc_calc_v25(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type)
+VERSION_FUNCTION(25,
+uint32_t,
+rte_net_crc_calc, (const void *data, uint32_t data_len, enum rte_net_crc_type type)
 {
 	uint32_t ret;
 	rte_net_crc_handler f_handle;
@@ -436,19 +432,14 @@ rte_net_crc_calc_v25(const void *data,
 	ret = f_handle(data, data_len);
 
 	return ret;
-}
-VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+})
 
-uint32_t
-rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len)
+DEFAULT_FUNCTION(26,
+uint32_t,
+rte_net_crc_calc, (const struct rte_net_crc *ctx, const void *data, const uint32_t data_len)
 {
 	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
-}
-BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
-MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
-	const void *data, const uint32_t data_len),
-	rte_net_crc_calc_v26);
+})
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
-- 
2.48.1


^ permalink raw reply	[relevance 6%]

* [v6 4/6] crypto/virtio: add vDPA backend
  @ 2025-03-05  6:16  1% ` Gowrishankar Muthukrishnan
  0 siblings, 0 replies; 200+ results
From: Gowrishankar Muthukrishnan @ 2025-03-05  6:16 UTC (permalink / raw)
  To: dev, Jay Zhou; +Cc: anoobj, Akhil Goyal, Gowrishankar Muthukrishnan

Add vDPA backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_logs.h           |   6 +-
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 drivers/crypto/virtio/virtio_user/vhost.h     |  90 +++
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 710 +++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 749 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  85 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 575 ++++++++++++++
 11 files changed, 2265 insertions(+), 30 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost.h
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index d2c3b3ad07..3763e86746 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -16,3 +16,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index 92fea557ab..bc737f1e68 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on probe() function
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index f8498246e2..fad73d54a8 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -76,4 +76,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_logs.h b/drivers/crypto/virtio/virtio_logs.h
index 988514919f..1cc51f7990 100644
--- a/drivers/crypto/virtio/virtio_logs.h
+++ b/drivers/crypto/virtio/virtio_logs.h
@@ -15,8 +15,10 @@ extern int virtio_crypto_logtype_init;
 
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
-extern int virtio_crypto_logtype_init;
-#define RTE_LOGTYPE_VIRTIO_CRYPTO_INIT virtio_crypto_logtype_init
+extern int virtio_crypto_logtype_driver;
+#define RTE_LOGTYPE_VIRTIO_CRYPTO_DRIVER virtio_crypto_logtype_driver
+#define PMD_DRV_LOG(level, ...) \
+	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_DRIVER, "%s(): ", __func__, __VA_ARGS__)
 
 #define VIRTIO_CRYPTO_INIT_LOG_IMPL(level, ...) \
 	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_INIT, "%s(): ", __func__, __VA_ARGS__)
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost.h b/drivers/crypto/virtio/virtio_user/vhost.h
new file mode 100644
index 0000000000..29cc1a14d4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#ifndef _VIRTIO_USER_VHOST_H
+#define _VIRTIO_USER_VHOST_H
+
+#include <stdint.h>
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <rte_errno.h>
+
+#include "../virtio_logs.h"
+
+struct vhost_vring_state {
+	unsigned int index;
+	unsigned int num;
+};
+
+struct vhost_vring_file {
+	unsigned int index;
+	int fd;
+};
+
+struct vhost_vring_addr {
+	unsigned int index;
+	/* Option flags. */
+	unsigned int flags;
+	/* Flag values: */
+	/* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+	/* Start of array of descriptors (virtually contiguous) */
+	uint64_t desc_user_addr;
+	/* Used structure address. Must be 32 bit aligned */
+	uint64_t used_user_addr;
+	/* Available structure address. Must be 16 bit aligned */
+	uint64_t avail_user_addr;
+	/* Logging support. */
+	/* Log writes to used structure, at offset calculated from specified
+	 * address. Address must be 32 bit aligned.
+	 */
+	uint64_t log_guest_addr;
+};
+
+#ifndef VHOST_BACKEND_F_IOTLB_MSG_V2
+#define VHOST_BACKEND_F_IOTLB_MSG_V2 1
+#endif
+
+#ifndef VHOST_BACKEND_F_IOTLB_BATCH
+#define VHOST_BACKEND_F_IOTLB_BATCH 2
+#endif
+
+struct virtio_user_dev;
+
+struct virtio_user_backend_ops {
+	int (*setup)(struct virtio_user_dev *dev);
+	int (*destroy)(struct virtio_user_dev *dev);
+	int (*get_backend_features)(uint64_t *features);
+	int (*set_owner)(struct virtio_user_dev *dev);
+	int (*get_features)(struct virtio_user_dev *dev, uint64_t *features);
+	int (*set_features)(struct virtio_user_dev *dev, uint64_t features);
+	int (*set_memory_table)(struct virtio_user_dev *dev);
+	int (*set_vring_num)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*get_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_call)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_kick)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_addr)(struct virtio_user_dev *dev, struct vhost_vring_addr *addr);
+	int (*get_status)(struct virtio_user_dev *dev, uint8_t *status);
+	int (*set_status)(struct virtio_user_dev *dev, uint8_t status);
+	int (*get_config)(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len);
+	int (*set_config)(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off,
+			uint32_t len);
+	int (*cvq_enable)(struct virtio_user_dev *dev, int enable);
+	int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int enable);
+	int (*dma_map)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*dma_unmap)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*update_link_state)(struct virtio_user_dev *dev);
+	int (*server_disconnect)(struct virtio_user_dev *dev);
+	int (*server_reconnect)(struct virtio_user_dev *dev);
+	int (*get_intr_fd)(struct virtio_user_dev *dev);
+	int (*map_notification_area)(struct virtio_user_dev *dev);
+	int (*unmap_notification_area)(struct virtio_user_dev *dev);
+};
+
+extern struct virtio_user_backend_ops virtio_ops_vdpa;
+
+#endif
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..b5839875e6
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,710 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "vhost.h"
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_owner(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_OWNER, NULL);
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	/* WORKAROUND */
+	features |= 1ULL << VIRTIO_F_IOMMU_PLATFORM;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_iotlb_batch_begin(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch begin (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_iotlb_batch_end(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_END;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch end (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_UPDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.uaddr = (uint64_t)(uintptr_t)addr;
+	msg.iotlb.size = len;
+	msg.iotlb.perm = VHOST_ACCESS_RW;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", addr: %p, len: 0x%zx",
+			__func__, iova, addr, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB update (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.size = len;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", len: 0x%zx",
+			__func__, iova, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB invalidate (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_map(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_dma_unmap_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_unmap(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	if (msl->external)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, len);
+}
+
+static int
+vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	/* skip external memory that isn't a heap */
+	if (msl->external && !msl->heap)
+		return 0;
+
+	/* skip any segments with invalid IOVA addresses */
+	if (ms->iova == RTE_BAD_IOVA)
+		return 0;
+
+	/* if IOVA mode is VA, we've already mapped the internal segments */
+	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, ms->len);
+}
+
+static int
+vhost_vdpa_set_memory_table(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
+
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		/* with IOVA as VA mode, we can get away with mapping contiguous
+		 * chunks rather than going page-by-page.
+		 */
+		ret = rte_memseg_contig_walk_thread_unsafe(
+				vhost_vdpa_map_contig, dev);
+		if (ret)
+			goto batch_end;
+		/* we have to continue the walk because we've skipped the
+		 * external segments during the contig walk.
+		 */
+	}
+	ret = rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
+
+batch_end:
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+static int
+vhost_vdpa_set_vring_num(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_NUM, state);
+}
+
+static int
+vhost_vdpa_set_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_get_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_set_vring_call(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_CALL, file);
+}
+
+static int
+vhost_vdpa_set_vring_kick(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_KICK, file);
+}
+
+static int
+vhost_vdpa_set_vring_addr(struct virtio_user_dev *dev, struct vhost_vring_addr *addr)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_ADDR, addr);
+}
+
+static int
+vhost_vdpa_get_status(struct virtio_user_dev *dev, uint8_t *status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_GET_STATUS, status);
+}
+
+static int
+vhost_vdpa_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_STATUS, &status);
+}
+
+static int
+vhost_vdpa_get_config(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_GET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+		goto out;
+	}
+
+	memcpy(data, config->buf, len);
+out:
+	free(config);
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_config(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	memcpy(config->buf, data, len);
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_SET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to set vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+	}
+
+	free(config);
+
+	return ret;
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) on failure;
+ *   - (>=0) on success.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_destroy(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	if (!data)
+		return 0;
+
+	close(data->vhostfd);
+
+	free(data);
+	dev->backend_data = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_backend_features(uint64_t *features)
+{
+	*features = 0;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	/* TODO: workaround until a cleaner way to query crypto device status is found */
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_intr_fd(struct virtio_user_dev *dev __rte_unused)
+{
+	/* No link state interrupt with Vhost-vDPA */
+	return -1;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings; i++) {
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.destroy = vhost_vdpa_destroy,
+	.get_backend_features = vhost_vdpa_get_backend_features,
+	.set_owner = vhost_vdpa_set_owner,
+	.get_features = vhost_vdpa_get_features,
+	.set_features = vhost_vdpa_set_features,
+	.set_memory_table = vhost_vdpa_set_memory_table,
+	.set_vring_num = vhost_vdpa_set_vring_num,
+	.set_vring_base = vhost_vdpa_set_vring_base,
+	.get_vring_base = vhost_vdpa_get_vring_base,
+	.set_vring_call = vhost_vdpa_set_vring_call,
+	.set_vring_kick = vhost_vdpa_set_vring_kick,
+	.set_vring_addr = vhost_vdpa_set_vring_addr,
+	.get_status = vhost_vdpa_get_status,
+	.set_status = vhost_vdpa_set_status,
+	.get_config = vhost_vdpa_get_config,
+	.set_config = vhost_vdpa_set_config,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.dma_map = vhost_vdpa_dma_map_batch,
+	.dma_unmap = vhost_vdpa_dma_unmap_batch,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.get_intr_fd = vhost_vdpa_get_intr_fd,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..c8478d72ce
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,749 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "vhost.h"
+#include "virtio_logs.h"
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* May use invalid flag, but some backends use the kickfd and
+	 * callfd as criteria to judge if the device is alive, so we
+	 * use real eventfds in the end.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per-virtqueue messages, make sure VHOST_SET_VRING_CALL
+	 * comes first, because vhost depends on this message to allocate
+	 * the virtqueue pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr = RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per-virtqueue messages, make sure VHOST_USER_SET_VRING_KICK
+	 * comes last, because vhost depends on this message to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper support from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	/* TODO: free resource here or caller to check */
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max queue pairs from the device so that
+		 * the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (!dev->started)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_crypto_ops_vdpa;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = 0;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set up backend", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated Device Status (0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..9cd9856e5d
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	struct virtio_crypto_hw hw;
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	max_queue_pairs;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* the features negotiated with the driver,
+				   * to be synced with the device
+				   */
+	uint64_t	device_features; /* supported features by device */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+	char		path[PATH_MAX];
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+	bool		*qp_enabled;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	bool			hw_cvq;
+	struct virtqueue	*scvq;
+
+	void *backend_data;
+
+	uint16_t **notify_area;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..992e8fb43b
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,575 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should only be done at probe time.
+		 * So we skip any further requests here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* unmask feature bits defined in vhost user protocol */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* rxq interrupts and the config interrupt are separated in virtio-user;
+	 * here we only report config changes.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function returns the queue size, i.e. the number of descriptors, of a
+ * specified queue. It differs from VHOST_USER_GET_QUEUE_NUM, which returns
+ * the max number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, each queue has the same queue size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+	if (!value || !extra_args)
+		return -EINVAL;
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps its default value; it is replaced only
+	 * when the 'value' arg is parsed successfully.
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	init_params.private_data_size = sizeof(struct virtio_user_dev);
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -1;
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "error when parsing param");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "failed to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "failed to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "failed to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+uint8_t cryptodev_virtio_user_driver_id;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_user_driver_id);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
-- 
2.25.1


^ permalink raw reply	[relevance 1%]

* [PATCH v4 2/5] dmadev: avoid copies in tracepoints
  @ 2025-03-04 16:06  4%   ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-03-04 16:06 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Chengwen Feng, Kevin Laatz, Bruce Richardson

No need to copy values into intermediate variables.
Use the right trace point emitters.
Update the pcie struct to avoid an aliasing warning.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v3:
- added anonymous union around pcie struct (which triggered an abidiff
  warning that needs waiving) and kept original call to
  rte_trace_point_emit_u64,

Changes since v2:
- split this change into multiple changes,
  only kept trivial parts in this patch,

---
 devtools/libabigail.abignore  |  5 +++++
 lib/dmadev/rte_dmadev.h       | 29 ++++++++++++++++-------------
 lib/dmadev/rte_dmadev_trace.h | 20 ++++++--------------
 3 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index ce501632b3..88aa1ec981 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -36,3 +36,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+        name = rte_dma_port_param
+        type_kind = struct
+        has_size_change = no
+        has_data_member = {pcie}
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
index 2f9304a9db..26f9d4b095 100644
--- a/lib/dmadev/rte_dmadev.h
+++ b/lib/dmadev/rte_dmadev.h
@@ -523,19 +523,22 @@ struct rte_dma_port_param {
 		 * and capabilities.
 		 */
 		__extension__
-		struct {
-			uint64_t coreid : 4; /**< PCIe core id used. */
-			uint64_t pfid : 8; /**< PF id used. */
-			uint64_t vfen : 1; /**< VF enable bit. */
-			uint64_t vfid : 16; /**< VF id used. */
-			/** The pasid filed in TLP packet. */
-			uint64_t pasid : 20;
-			/** The attributes filed in TLP packet. */
-			uint64_t attr : 3;
-			/** The processing hint filed in TLP packet. */
-			uint64_t ph : 2;
-			/** The steering tag filed in TLP packet. */
-			uint64_t st : 16;
+		union {
+			struct {
+				uint64_t coreid : 4; /**< PCIe core id used. */
+				uint64_t pfid : 8; /**< PF id used. */
+				uint64_t vfen : 1; /**< VF enable bit. */
+				uint64_t vfid : 16; /**< VF id used. */
+				/** The pasid field in TLP packet. */
+				uint64_t pasid : 20;
+				/** The attributes field in TLP packet. */
+				uint64_t attr : 3;
+				/** The processing hint field in TLP packet. */
+				uint64_t ph : 2;
+				/** The steering tag field in TLP packet. */
+				uint64_t st : 16;
+			};
+			uint64_t val;
 		} pcie;
 	};
 	uint64_t reserved[2]; /**< Reserved for future fields. */
diff --git a/lib/dmadev/rte_dmadev_trace.h b/lib/dmadev/rte_dmadev_trace.h
index be089c065c..1beb938168 100644
--- a/lib/dmadev/rte_dmadev_trace.h
+++ b/lib/dmadev/rte_dmadev_trace.h
@@ -46,11 +46,10 @@ RTE_TRACE_POINT(
 	const struct rte_dma_conf __dev_conf = {0};
 	dev_conf = &__dev_conf;
 #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
-	int enable_silent = (int)dev_conf->enable_silent;
 	rte_trace_point_emit_i16(dev_id);
 	rte_trace_point_emit_u16(dev_conf->nb_vchans);
 	rte_trace_point_emit_u16(dev_conf->priority);
-	rte_trace_point_emit_int(enable_silent);
+	rte_trace_point_emit_u8(dev_conf->enable_silent);
 	rte_trace_point_emit_int(ret);
 )
 
@@ -83,21 +82,14 @@ RTE_TRACE_POINT(
 	const struct rte_dma_vchan_conf __conf = {0};
 	conf = &__conf;
 #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
-	int src_port_type = conf->src_port.port_type;
-	int dst_port_type = conf->dst_port.port_type;
-	int direction = conf->direction;
-	uint64_t src_pcie_cfg;
-	uint64_t dst_pcie_cfg;
 	rte_trace_point_emit_i16(dev_id);
 	rte_trace_point_emit_u16(vchan);
-	rte_trace_point_emit_int(direction);
+	rte_trace_point_emit_int(conf->direction);
 	rte_trace_point_emit_u16(conf->nb_desc);
-	rte_trace_point_emit_int(src_port_type);
-	memcpy(&src_pcie_cfg, &conf->src_port.pcie, sizeof(uint64_t));
-	rte_trace_point_emit_u64(src_pcie_cfg);
-	memcpy(&dst_pcie_cfg, &conf->dst_port.pcie, sizeof(uint64_t));
-	rte_trace_point_emit_int(dst_port_type);
-	rte_trace_point_emit_u64(dst_pcie_cfg);
+	rte_trace_point_emit_int(conf->src_port.port_type);
+	rte_trace_point_emit_u64(conf->src_port.pcie.val);
+	rte_trace_point_emit_int(conf->dst_port.port_type);
+	rte_trace_point_emit_u64(conf->dst_port.pcie.val);
 	rte_trace_point_emit_ptr(conf->auto_free.m2d.pool);
 	rte_trace_point_emit_int(ret);
 )
-- 
2.48.1


^ permalink raw reply	[relevance 4%]

* [v5 4/6] crypto/virtio: add vDPA backend
  @ 2025-02-26 18:58  1% ` Gowrishankar Muthukrishnan
  0 siblings, 0 replies; 200+ results
From: Gowrishankar Muthukrishnan @ 2025-02-26 18:58 UTC (permalink / raw)
  To: dev, Jay Zhou; +Cc: anoobj, Akhil Goyal, Gowrishankar Muthukrishnan

Add vDPA backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_logs.h           |   6 +-
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 drivers/crypto/virtio/virtio_user/vhost.h     |  90 +++
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 710 +++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 749 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  85 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 575 ++++++++++++++
 11 files changed, 2265 insertions(+), 30 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost.h
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index d2c3b3ad07..3763e86746 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -16,3 +16,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index 92fea557ab..bc737f1e68 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on the probe() function.
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index f8498246e2..fad73d54a8 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -76,4 +76,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_logs.h b/drivers/crypto/virtio/virtio_logs.h
index 988514919f..1cc51f7990 100644
--- a/drivers/crypto/virtio/virtio_logs.h
+++ b/drivers/crypto/virtio/virtio_logs.h
@@ -15,8 +15,10 @@ extern int virtio_crypto_logtype_init;
 
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
-extern int virtio_crypto_logtype_init;
-#define RTE_LOGTYPE_VIRTIO_CRYPTO_INIT virtio_crypto_logtype_init
+extern int virtio_crypto_logtype_driver;
+#define RTE_LOGTYPE_VIRTIO_CRYPTO_DRIVER virtio_crypto_logtype_driver
+#define PMD_DRV_LOG(level, ...) \
+	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_DRIVER, "%s(): ", __func__, __VA_ARGS__)
 
 #define VIRTIO_CRYPTO_INIT_LOG_IMPL(level, ...) \
 	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_INIT, "%s(): ", __func__, __VA_ARGS__)
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost.h b/drivers/crypto/virtio/virtio_user/vhost.h
new file mode 100644
index 0000000000..29cc1a14d4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#ifndef _VIRTIO_USER_VHOST_H
+#define _VIRTIO_USER_VHOST_H
+
+#include <stdint.h>
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <rte_errno.h>
+
+#include "../virtio_logs.h"
+
+struct vhost_vring_state {
+	unsigned int index;
+	unsigned int num;
+};
+
+struct vhost_vring_file {
+	unsigned int index;
+	int fd;
+};
+
+struct vhost_vring_addr {
+	unsigned int index;
+	/* Option flags. */
+	unsigned int flags;
+	/* Flag values: */
+	/* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+	/* Start of array of descriptors (virtually contiguous) */
+	uint64_t desc_user_addr;
+	/* Used structure address. Must be 32 bit aligned */
+	uint64_t used_user_addr;
+	/* Available structure address. Must be 16 bit aligned */
+	uint64_t avail_user_addr;
+	/* Logging support. */
+	/* Log writes to used structure, at offset calculated from specified
+	 * address. Address must be 32 bit aligned.
+	 */
+	uint64_t log_guest_addr;
+};
+
+#ifndef VHOST_BACKEND_F_IOTLB_MSG_V2
+#define VHOST_BACKEND_F_IOTLB_MSG_V2 1
+#endif
+
+#ifndef VHOST_BACKEND_F_IOTLB_BATCH
+#define VHOST_BACKEND_F_IOTLB_BATCH 2
+#endif
+
+struct virtio_user_dev;
+
+struct virtio_user_backend_ops {
+	int (*setup)(struct virtio_user_dev *dev);
+	int (*destroy)(struct virtio_user_dev *dev);
+	int (*get_backend_features)(uint64_t *features);
+	int (*set_owner)(struct virtio_user_dev *dev);
+	int (*get_features)(struct virtio_user_dev *dev, uint64_t *features);
+	int (*set_features)(struct virtio_user_dev *dev, uint64_t features);
+	int (*set_memory_table)(struct virtio_user_dev *dev);
+	int (*set_vring_num)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*get_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_call)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_kick)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_addr)(struct virtio_user_dev *dev, struct vhost_vring_addr *addr);
+	int (*get_status)(struct virtio_user_dev *dev, uint8_t *status);
+	int (*set_status)(struct virtio_user_dev *dev, uint8_t status);
+	int (*get_config)(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len);
+	int (*set_config)(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off,
+			uint32_t len);
+	int (*cvq_enable)(struct virtio_user_dev *dev, int enable);
+	int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int enable);
+	int (*dma_map)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*dma_unmap)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*update_link_state)(struct virtio_user_dev *dev);
+	int (*server_disconnect)(struct virtio_user_dev *dev);
+	int (*server_reconnect)(struct virtio_user_dev *dev);
+	int (*get_intr_fd)(struct virtio_user_dev *dev);
+	int (*map_notification_area)(struct virtio_user_dev *dev);
+	int (*unmap_notification_area)(struct virtio_user_dev *dev);
+};
+
+extern struct virtio_user_backend_ops virtio_ops_vdpa;
+
+#endif
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..b5839875e6
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,710 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "vhost.h"
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_owner(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_OWNER, NULL);
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	/* WORKAROUND */
+	features |= 1ULL << VIRTIO_F_IOMMU_PLATFORM;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_iotlb_batch_begin(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch begin (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_iotlb_batch_end(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_END;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch end (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_UPDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.uaddr = (uint64_t)(uintptr_t)addr;
+	msg.iotlb.size = len;
+	msg.iotlb.perm = VHOST_ACCESS_RW;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", addr: %p, len: 0x%zx",
+			__func__, iova, addr, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB update (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.size = len;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", len: 0x%zx",
+			__func__, iova, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB invalidate (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_map(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_dma_unmap_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_unmap(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	if (msl->external)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, len);
+}
+
+static int
+vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	/* skip external memory that isn't a heap */
+	if (msl->external && !msl->heap)
+		return 0;
+
+	/* skip any segments with invalid IOVA addresses */
+	if (ms->iova == RTE_BAD_IOVA)
+		return 0;
+
+	/* if IOVA mode is VA, we've already mapped the internal segments */
+	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, ms->len);
+}
+
+static int
+vhost_vdpa_set_memory_table(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
+
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		/* with IOVA as VA mode, we can get away with mapping contiguous
+		 * chunks rather than going page-by-page.
+		 */
+		ret = rte_memseg_contig_walk_thread_unsafe(
+				vhost_vdpa_map_contig, dev);
+		if (ret)
+			goto batch_end;
+		/* we have to continue the walk because we've skipped the
+		 * external segments during the contig walk.
+		 */
+	}
+	ret = rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
+
+batch_end:
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+static int
+vhost_vdpa_set_vring_num(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_NUM, state);
+}
+
+static int
+vhost_vdpa_set_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_get_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_set_vring_call(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_CALL, file);
+}
+
+static int
+vhost_vdpa_set_vring_kick(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_KICK, file);
+}
+
+static int
+vhost_vdpa_set_vring_addr(struct virtio_user_dev *dev, struct vhost_vring_addr *addr)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_ADDR, addr);
+}
+
+static int
+vhost_vdpa_get_status(struct virtio_user_dev *dev, uint8_t *status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_GET_STATUS, status);
+}
+
+static int
+vhost_vdpa_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_STATUS, &status);
+}
+
+static int
+vhost_vdpa_get_config(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_GET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+		goto out;
+	}
+
+	memcpy(data, config->buf, len);
+out:
+	free(config);
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_config(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	memcpy(config->buf, data, len);
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_SET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to set vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+	}
+
+	free(config);
+
+	return ret;
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) if fail to set up;
+ *   - (>=0) if successful.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_destroy(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	if (!data)
+		return 0;
+
+	close(data->vhostfd);
+
+	free(data);
+	dev->backend_data = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_backend_features(uint64_t *features)
+{
+	*features = 0;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	/* TODO: workaround until there is a cleaner approach to find cpt status */
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_intr_fd(struct virtio_user_dev *dev __rte_unused)
+{
+	/* No link state interrupt with Vhost-vDPA */
+	return -1;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings; i++) {
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.destroy = vhost_vdpa_destroy,
+	.get_backend_features = vhost_vdpa_get_backend_features,
+	.set_owner = vhost_vdpa_set_owner,
+	.get_features = vhost_vdpa_get_features,
+	.set_features = vhost_vdpa_set_features,
+	.set_memory_table = vhost_vdpa_set_memory_table,
+	.set_vring_num = vhost_vdpa_set_vring_num,
+	.set_vring_base = vhost_vdpa_set_vring_base,
+	.get_vring_base = vhost_vdpa_get_vring_base,
+	.set_vring_call = vhost_vdpa_set_vring_call,
+	.set_vring_kick = vhost_vdpa_set_vring_kick,
+	.set_vring_addr = vhost_vdpa_set_vring_addr,
+	.get_status = vhost_vdpa_get_status,
+	.set_status = vhost_vdpa_set_status,
+	.get_config = vhost_vdpa_get_config,
+	.set_config = vhost_vdpa_set_config,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.dma_map = vhost_vdpa_dma_map_batch,
+	.dma_unmap = vhost_vdpa_dma_unmap_batch,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.get_intr_fd = vhost_vdpa_get_intr_fd,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..c8478d72ce
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,749 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "vhost.h"
+#include "virtio_logs.h"
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* May use an invalid flag, but some backends use kickfd and
+	 * callfd as criteria to judge if the device is alive, so we
+	 * finally use real eventfds.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per-virtqueue messages, make sure VHOST_SET_VRING_CALL comes
+	 * first, because vhost depends on this message to allocate the
+	 * virtqueue pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr =  RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per-virtqueue messages, make sure VHOST_USER_SET_VRING_KICK
+	 * comes last, because vhost depends on this message to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper support from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	/* TODO: free resource here or caller to check */
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max queue pairs from the device so that
+		 * the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (dev->started == false)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_crypto_ops_vdpa;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = 0;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set up backend", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated device status (0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
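crypto_virtio_user_dev_update_status() above logs each virtio device-status bit individually. The decoding can be sketched independently; the bit values below mirror the `VIRTIO_CONFIG_STATUS_*` constants (per the virtio spec) as plain defines:

```c
#include <assert.h>
#include <stdint.h>

/* Virtio device status bits, as defined by the virtio spec. */
#define STATUS_RESET        0x00
#define STATUS_ACK          0x01
#define STATUS_DRIVER       0x02
#define STATUS_DRIVER_OK    0x04
#define STATUS_FEATURES_OK  0x08
#define STATUS_NEED_RESET   0x40
#define STATUS_FAILED       0x80

/* Return nonzero if the device reached the fully-operational state:
 * DRIVER_OK set and neither error bit raised. */
static int status_is_running(uint8_t status)
{
	return (status & STATUS_DRIVER_OK) &&
	       !(status & STATUS_FAILED) &&
	       !(status & STATUS_NEED_RESET);
}
```

Note that RESET is the all-zero state, which is why the driver's log compares for equality (`status == RESET`) rather than testing a bit.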
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..9cd9856e5d
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	struct virtio_crypto_hw hw;
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	max_queue_pairs;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* the negotiated features with driver,
+				   * and will be sync with device
+				   */
+	uint64_t	device_features; /* supported features by device */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+	char		path[PATH_MAX];
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+	bool		*qp_enabled;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	bool			hw_cvq;
+	struct virtqueue	*scvq;
+
+	void *backend_data;
+
+	uint16_t **notify_area;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..992e8fb43b
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,575 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
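virtio_user_read_dev_config() dispatches on offsetof() into the device config space, splitting 64-bit state across paired 32-bit config words. The idiom can be sketched with a cut-down config struct (hypothetical field set, not the full virtio_crypto_config):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct mini_cfg {
	uint32_t status;
	uint32_t max_dataqueues;
	uint32_t cipher_algo_l;
	uint32_t cipher_algo_h;
};

/* Serve a 32-bit config read at the given offset, emulating a
 * 64-bit algorithm bitmap split across the _l/_h config words. */
static void read_cfg(uint64_t cipher_algo, size_t offset, void *dst)
{
	uint32_t v = 0;

	if (offset == offsetof(struct mini_cfg, cipher_algo_l))
		v = (uint32_t)(cipher_algo & 0xFFFFFFFF);	/* low 32 bits */
	else if (offset == offsetof(struct mini_cfg, cipher_algo_h))
		v = (uint32_t)(cipher_algo >> 32);		/* high 32 bits */
	memcpy(dst, &v, sizeof(v));
}
```

The low half must be masked with a full 32-bit mask so it pairs correctly with the `>> 32` high half.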
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should only be done at probe time,
+		 * so skip any further requests here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Only expose feature bits supported by this PMD */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
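get_features/set_features above implement the usual virtio negotiation: the driver may only keep bits that the device (or the frontend emulation layer) actually offers. A standalone sketch of that masking, with illustrative bit values:

```c
#include <assert.h>
#include <stdint.h>

/* Clip the driver's requested feature set to what the device,
 * plus any frontend-emulated features, can provide. */
static uint64_t negotiate_features(uint64_t device_features,
		uint64_t frontend_features, uint64_t driver_request)
{
	return driver_request & (device_features | frontend_features);
}
```

A bit the driver requests but neither side offers is silently dropped; a bit offered but not requested is simply not enabled.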
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* Queue interrupts and the config interrupt are separate in
+	 * virtio-user; only config changes are reported here.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function returns the queue size, i.e. the number of descriptors, of a
+ * specified queue. It differs from VHOST_USER_GET_QUEUE_NUM, which returns the
+ * maximum number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, all queues have the same size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
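virtio_user_setup_queue_split() lays out the descriptor table, avail ring, and used ring contiguously, rounding the used ring up to the 4 KB vring alignment. The offset arithmetic can be checked with plain integers (16-byte descriptors and a 4-byte avail header plus 2 bytes per entry, matching the virtio split-ring layout; `align_ceil` stands in for RTE_ALIGN_CEIL):

```c
#include <assert.h>
#include <stdint.h>

#define VRING_ALIGN 4096u

/* Round v up to a multiple of a (a must be a power of two). */
static uint64_t align_ceil(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Offset of the used ring for a split vring with num entries:
 * descriptor table, then the avail ring (flags + idx + ring[num],
 * as in offsetof(struct vring_avail, ring[num])), rounded up. */
static uint64_t split_used_offset(uint16_t num)
{
	uint64_t desc_sz = (uint64_t)num * 16;		/* struct vring_desc */
	uint64_t avail_sz = 4 + (uint64_t)num * 2;	/* flags, idx, ring[] */

	return align_ceil(desc_sz + avail_sz, VRING_ALIGN);
}
```

For a 256-entry ring this gives 4096 + 516 rounded up to 8192, i.e. the used ring starts on the third page of the ring allocation.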
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
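The packed-ring branch of virtio_user_notify_queue() encodes the VIRTIO_F_NOTIFICATION_DATA payload exactly as the comment describes: queue index in bits 0..15, avail index in bits 16..30, avail wrap counter in bit 31. A sketch of that packing (the explicit 15-bit mask on the avail index is an assumption here; the driver relies on the index never reaching bit 15):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VIRTIO_F_NOTIFICATION_DATA layout for a packed ring:
 * bits 0..15  queue index
 * bits 16..30 avail index
 * bit 31      avail wrap counter */
static uint32_t packed_notify_data(uint16_t queue_idx, uint16_t avail_idx,
		bool wrap)
{
	return ((uint32_t)wrap << 31) |
	       (((uint32_t)avail_idx & 0x7fff) << 16) |
	       queue_idx;
}
```

The split-ring variant is the same without the wrap bit, leaving a full 16 bits for the avail index.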
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps its default value; replace it only
+	 * when the 'value' arg is parsed successfully.
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
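get_integer_arg() above keeps the caller's default on failure by checking errno after strtoull and returning `-errno`. A standalone sketch of that contract (note: as in the code above, only range errors such as ERANGE are detected; a non-numeric string parses as 0 without error):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an unsigned integer kvarg with strtoull (base 0, so 0x..
 * hex is accepted). *out is updated only when errno stays clear,
 * so the caller's default survives a range error. */
static int parse_uint_arg(const char *value, uint64_t *out)
{
	uint64_t v;

	if (value == NULL || out == NULL)
		return -EINVAL;
	errno = 0;
	v = strtoull(value, NULL, 0);
	if (errno == 0)
		*out = v;
	return -errno;
}
```

Base 0 is what lets `queue_size=0x100` work the same as `queue_size=256` on the vdev command line.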
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -1;
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "Failed to parse device arguments");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "Failed to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "Failed to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "Failed to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+uint8_t cryptodev_virtio_user_driver_id;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_user_driver_id);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
-- 
2.25.1


^ permalink raw reply	[relevance 1%]

* [v4 4/6] crypto/virtio: add vDPA backend
  @ 2025-02-22  9:16  1% ` Gowrishankar Muthukrishnan
  0 siblings, 0 replies; 200+ results
From: Gowrishankar Muthukrishnan @ 2025-02-22  9:16 UTC (permalink / raw)
  To: dev, Jay Zhou; +Cc: anoobj, Akhil Goyal, Gowrishankar Muthukrishnan

Add vDPA backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
Depends-on: series-34682 ("vhost: add RSA support")
v4:
 - fixed CI issue.
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_logs.h           |   6 +-
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 drivers/crypto/virtio/virtio_user/vhost.h     |  90 +++
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 710 +++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 749 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  85 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 575 ++++++++++++++
 11 files changed, 2265 insertions(+), 30 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost.h
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index d2c3b3ad07..3763e86746 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -16,3 +16,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index 92fea557ab..bc737f1e68 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on the probe() function.
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index f8498246e2..fad73d54a8 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -76,4 +76,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_logs.h b/drivers/crypto/virtio/virtio_logs.h
index 988514919f..1cc51f7990 100644
--- a/drivers/crypto/virtio/virtio_logs.h
+++ b/drivers/crypto/virtio/virtio_logs.h
@@ -15,8 +15,10 @@ extern int virtio_crypto_logtype_init;
 
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
-extern int virtio_crypto_logtype_init;
-#define RTE_LOGTYPE_VIRTIO_CRYPTO_INIT virtio_crypto_logtype_init
+extern int virtio_crypto_logtype_driver;
+#define RTE_LOGTYPE_VIRTIO_CRYPTO_DRIVER virtio_crypto_logtype_driver
+#define PMD_DRV_LOG(level, ...) \
+	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_DRIVER, "%s(): ", __func__, __VA_ARGS__)
 
 #define VIRTIO_CRYPTO_INIT_LOG_IMPL(level, ...) \
 	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_INIT, "%s(): ", __func__, __VA_ARGS__)
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost.h b/drivers/crypto/virtio/virtio_user/vhost.h
new file mode 100644
index 0000000000..29cc1a14d4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#ifndef _VIRTIO_USER_VHOST_H
+#define _VIRTIO_USER_VHOST_H
+
+#include <stdint.h>
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <rte_errno.h>
+
+#include "../virtio_logs.h"
+
+struct vhost_vring_state {
+	unsigned int index;
+	unsigned int num;
+};
+
+struct vhost_vring_file {
+	unsigned int index;
+	int fd;
+};
+
+struct vhost_vring_addr {
+	unsigned int index;
+	/* Option flags. */
+	unsigned int flags;
+	/* Flag values: */
+	/* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+	/* Start of array of descriptors (virtually contiguous) */
+	uint64_t desc_user_addr;
+	/* Used structure address. Must be 32 bit aligned */
+	uint64_t used_user_addr;
+	/* Available structure address. Must be 16 bit aligned */
+	uint64_t avail_user_addr;
+	/* Logging support. */
+	/* Log writes to used structure, at offset calculated from specified
+	 * address. Address must be 32 bit aligned.
+	 */
+	uint64_t log_guest_addr;
+};
+
+#ifndef VHOST_BACKEND_F_IOTLB_MSG_V2
+#define VHOST_BACKEND_F_IOTLB_MSG_V2 1
+#endif
+
+#ifndef VHOST_BACKEND_F_IOTLB_BATCH
+#define VHOST_BACKEND_F_IOTLB_BATCH 2
+#endif
+
+struct virtio_user_dev;
+
+struct virtio_user_backend_ops {
+	int (*setup)(struct virtio_user_dev *dev);
+	int (*destroy)(struct virtio_user_dev *dev);
+	int (*get_backend_features)(uint64_t *features);
+	int (*set_owner)(struct virtio_user_dev *dev);
+	int (*get_features)(struct virtio_user_dev *dev, uint64_t *features);
+	int (*set_features)(struct virtio_user_dev *dev, uint64_t features);
+	int (*set_memory_table)(struct virtio_user_dev *dev);
+	int (*set_vring_num)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*get_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_call)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_kick)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_addr)(struct virtio_user_dev *dev, struct vhost_vring_addr *addr);
+	int (*get_status)(struct virtio_user_dev *dev, uint8_t *status);
+	int (*set_status)(struct virtio_user_dev *dev, uint8_t status);
+	int (*get_config)(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len);
+	int (*set_config)(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off,
+			uint32_t len);
+	int (*cvq_enable)(struct virtio_user_dev *dev, int enable);
+	int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int enable);
+	int (*dma_map)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*dma_unmap)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*update_link_state)(struct virtio_user_dev *dev);
+	int (*server_disconnect)(struct virtio_user_dev *dev);
+	int (*server_reconnect)(struct virtio_user_dev *dev);
+	int (*get_intr_fd)(struct virtio_user_dev *dev);
+	int (*map_notification_area)(struct virtio_user_dev *dev);
+	int (*unmap_notification_area)(struct virtio_user_dev *dev);
+};
+
+extern struct virtio_user_backend_ops virtio_ops_vdpa;
+
+#endif
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..b5839875e6
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,710 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "vhost.h"
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_owner(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_OWNER, NULL);
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	/* WORKAROUND: vhost-vDPA backends require VIRTIO_F_IOMMU_PLATFORM */
+	features |= 1ULL << VIRTIO_F_IOMMU_PLATFORM;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_iotlb_batch_begin(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch begin (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_iotlb_batch_end(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_END;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch end (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_UPDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.uaddr = (uint64_t)(uintptr_t)addr;
+	msg.iotlb.size = len;
+	msg.iotlb.perm = VHOST_ACCESS_RW;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", addr: %p, len: 0x%zx",
+			__func__, iova, addr, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB update (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.size = len;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", len: 0x%zx",
+			__func__, iova, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB invalidate (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_map(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_dma_unmap_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_unmap(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	if (msl->external)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, len);
+}
+
+static int
+vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	/* skip external memory that isn't a heap */
+	if (msl->external && !msl->heap)
+		return 0;
+
+	/* skip any segments with invalid IOVA addresses */
+	if (ms->iova == RTE_BAD_IOVA)
+		return 0;
+
+	/* if IOVA mode is VA, we've already mapped the internal segments */
+	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, ms->len);
+}
+
+static int
+vhost_vdpa_set_memory_table(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
+
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		/* with IOVA as VA mode, we can get away with mapping contiguous
+		 * chunks rather than going page-by-page.
+		 */
+		ret = rte_memseg_contig_walk_thread_unsafe(
+				vhost_vdpa_map_contig, dev);
+		if (ret)
+			goto batch_end;
+		/* we have to continue the walk because we've skipped the
+		 * external segments during the config walk.
+		 */
+	}
+	ret = rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
+
+batch_end:
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+static int
+vhost_vdpa_set_vring_num(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_NUM, state);
+}
+
+static int
+vhost_vdpa_set_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_get_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_set_vring_call(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_CALL, file);
+}
+
+static int
+vhost_vdpa_set_vring_kick(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_KICK, file);
+}
+
+static int
+vhost_vdpa_set_vring_addr(struct virtio_user_dev *dev, struct vhost_vring_addr *addr)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_ADDR, addr);
+}
+
+static int
+vhost_vdpa_get_status(struct virtio_user_dev *dev, uint8_t *status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_GET_STATUS, status);
+}
+
+static int
+vhost_vdpa_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_STATUS, &status);
+}
+
+static int
+vhost_vdpa_get_config(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_GET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+		goto out;
+	}
+
+	memcpy(data, config->buf, len);
+out:
+	free(config);
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_config(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	memcpy(config->buf, data, len);
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_SET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to set vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+	}
+
+	free(config);
+
+	return ret;
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) if setup fails;
+ *   - (>=0) if successful.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_destroy(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	if (!data)
+		return 0;
+
+	close(data->vhostfd);
+
+	free(data);
+	dev->backend_data = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_backend_features(uint64_t *features)
+{
+	*features = 0;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	/* TODO: workaround until there is a cleaner way to query the crypto device status */
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_intr_fd(struct virtio_user_dev *dev __rte_unused)
+{
+	/* No link state interrupt with Vhost-vDPA */
+	return -1;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings; i++) {
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.destroy = vhost_vdpa_destroy,
+	.get_backend_features = vhost_vdpa_get_backend_features,
+	.set_owner = vhost_vdpa_set_owner,
+	.get_features = vhost_vdpa_get_features,
+	.set_features = vhost_vdpa_set_features,
+	.set_memory_table = vhost_vdpa_set_memory_table,
+	.set_vring_num = vhost_vdpa_set_vring_num,
+	.set_vring_base = vhost_vdpa_set_vring_base,
+	.get_vring_base = vhost_vdpa_get_vring_base,
+	.set_vring_call = vhost_vdpa_set_vring_call,
+	.set_vring_kick = vhost_vdpa_set_vring_kick,
+	.set_vring_addr = vhost_vdpa_set_vring_addr,
+	.get_status = vhost_vdpa_get_status,
+	.set_status = vhost_vdpa_set_status,
+	.get_config = vhost_vdpa_get_config,
+	.set_config = vhost_vdpa_set_config,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.dma_map = vhost_vdpa_dma_map_batch,
+	.dma_unmap = vhost_vdpa_dma_unmap_batch,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.get_intr_fd = vhost_vdpa_get_intr_fd,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..c8478d72ce
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,749 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "vhost.h"
+#include "virtio_logs.h"
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* The flags may be unnecessary, but some backends use the
+	 * kickfd and callfd as criteria to judge if the device is
+	 * alive, so use real eventfds here.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per-virtqueue messages, make sure VHOST_SET_VRING_CALL comes
+	 * first, because vhost depends on this message to allocate the
+	 * virtqueue pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr = RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per-virtqueue messages, make sure VHOST_SET_VRING_KICK comes
+	 * last, because vhost depends on this message to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper support from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	/* TODO: free resource here or caller to check */
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max number of queue pairs from the
+		 * device so that the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (dev->started == false)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_crypto_ops_vdpa;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = 0;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) backend set up fails", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated Device Status(0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..9cd9856e5d
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	struct virtio_crypto_hw hw;
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	max_queue_pairs;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* the negotiated features with driver,
+				   * and will be sync with device
+				   */
+	uint64_t	device_features; /* supported features by device */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+	char		path[PATH_MAX];
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+	bool		*qp_enabled;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	bool			hw_cvq;
+	struct virtqueue	*scvq;
+
+	void *backend_data;
+
+	uint16_t **notify_area;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif /* _VIRTIO_USER_DEV_H */
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..992e8fb43b
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,575 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should be only done in probe time.
+		 * So we skip any more request here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* unmask feature bits defined in vhost user protocol */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* Queue interrupts and the config interrupt are separated in
+	 * virtio-user; here we only report config change.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function gets the queue size, i.e. the number of descriptors, of a
+ * specified queue, unlike VHOST_USER_GET_QUEUE_NUM, which is used to get the
+ * maximum number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, each queue has same queue size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+	if (!value || !extra_args)
+		return -EINVAL;
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps default value, it should be replaced
+	 * only in case of successful parsing of the 'value' arg
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -1;
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "error when parsing param");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+uint8_t cryptodev_virtio_user_driver_id;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_user_driver_id);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
-- 
2.25.1


^ permalink raw reply	[relevance 1%]

* RE: [EXTERNAL] [PATCH v8] graph: mcore: optimize graph search
  2025-02-07  1:39 11%             ` [PATCH v8] " Huichao Cai
@ 2025-02-22  6:59  0%               ` Kiran Kumar Kokkilagadda
  0 siblings, 0 replies; 200+ results
From: Kiran Kumar Kokkilagadda @ 2025-02-22  6:59 UTC (permalink / raw)
  To: Huichao Cai, Jerin Jacob, Nithin Kumar Dabilpuram, yanzhirun_163; +Cc: dev



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Friday, February 7, 2025 7:10 AM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH v8] graph: mcore: optimize graph search
> 
> In the function __rte_graph_mcore_dispatch_sched_node_enqueue, use a
> slower loop to search for the graph, modify the search logic to record the
> result of the first search, and use this record for subsequent searches to
> improve search speed. 
> 
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---

Acked-by: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>



>  devtools/libabigail.abignore               |  5 +++++
>  doc/guides/rel_notes/release_25_03.rst     |  1 +
>  lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
>  lib/graph/rte_graph_worker_common.h        |  1 +
>  4 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore index
> 21b8cd6113..8876aaee2e 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -33,3 +33,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +[suppress_type]
> +        name = rte_node
> +        has_size_change = no
> +        has_data_member_inserted_between =
> +{offset_after(original_process), offset_of(xstat_off)}
> \ No newline at end of file
> diff --git a/doc/guides/rel_notes/release_25_03.rst
> b/doc/guides/rel_notes/release_25_03.rst
> index 269ab6f68a..16a888fd19 100644
> --- a/doc/guides/rel_notes/release_25_03.rst
> +++ b/doc/guides/rel_notes/release_25_03.rst
> @@ -150,6 +150,7 @@ ABI Changes
> 
>  * No ABI change that would break compatibility with 24.11.
> 
> +* graph: Added ``graph`` field to the ``dispatch`` structure in the ``rte_node``
> structure.
> 
>  Known Issues
>  ------------
> diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c
> b/lib/graph/rte_graph_model_mcore_dispatch.c
> index a590fc9497..a81d338227 100644
> --- a/lib/graph/rte_graph_model_mcore_dispatch.c
> +++ b/lib/graph/rte_graph_model_mcore_dispatch.c
> @@ -118,11 +118,14 @@
> __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
>  					      struct rte_graph_rq_head *rq)  {
>  	const unsigned int lcore_id = node->dispatch.lcore_id;
> -	struct rte_graph *graph;
> +	struct rte_graph *graph = node->dispatch.graph;
> 
> -	SLIST_FOREACH(graph, rq, next)
> -		if (graph->dispatch.lcore_id == lcore_id)
> -			break;
> +	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
> +		SLIST_FOREACH(graph, rq, next)
> +			if (graph->dispatch.lcore_id == lcore_id)
> +				break;
> +		node->dispatch.graph = graph;
> +	}
> 
>  	return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false;  } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index d3ec88519d..aef0f65673 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> +			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */
>  		} dispatch;
>  	};
> 
> --
> 2.33.0


^ permalink raw reply	[relevance 0%]

* [RFC PATCH v20] mempool: fix mempool cache size
    2025-02-21 15:13  4% ` [RFC PATCH v18] mempool: fix mempool " Morten Brørup
  2025-02-21 19:05  3% ` [RFC PATCH v19] " Morten Brørup
@ 2025-02-21 20:27  3% ` Morten Brørup
  2 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2025-02-21 20:27 UTC (permalink / raw)
  To: dev; +Cc: Morten Brørup

NOTE: THIS VERSION DOES NOT BREAK THE API/ABI.

First, a per-lcore mempool cache could hold 50 % more objects than the
cache's configured size.
Since application developers do not expect this behavior, it could lead to
application failure.
This patch fixes this bug without breaking the API/ABI, by using the
mempool cache's "size" instead of the "flushthresh" as the threshold for
how many objects can be held in a mempool cache.
Note: The "flushthresh" field can be removed from the cache structure in a
future API/ABI breaking release, which must be announced in advance.

Second, requests to fetch a number of objects from the backend driver
exceeding the cache's size (but less than RTE_MEMPOOL_CACHE_MAX_SIZE) were
copied twice; first to the cache, and from there to the destination.
Such superfluous copying through the mempool cache degrades the
performance in these cases.
This patch also fixes this misbehavior, so when fetching more objects from
the driver than the mempool cache's size, they are fetched directly to the
destination.

The internal macro to calculate the cache flush threshold was updated to
reflect the new flush threshold of 1 * size instead of 1.5 * size.

The function rte_mempool_do_generic_put() for adding objects to a mempool
was modified as follows:
- When determining if the cache has sufficient room for the request
  without flushing, compare to the cache's size (cache->size) instead of
  the obsolete flush threshold (cache->flushthresh).
- The comparison for the request being too big, which is considered
  unlikely, was moved down and out of the code path where the cache has
  sufficient room for the added objects, which is considered the most
  likely code path.
- Added __rte_assume() about the cache size, for compiler optimization
  when "n" is compile time constant.
- Added __rte_assume() about "ret", for compiler optimization of
  rte_mempool_generic_get() considering the return value of
  rte_mempool_do_generic_get().

The function rte_mempool_do_generic_get() for getting objects from a
mempool was refactored as follows:
- Handling a request for a constant number of objects was merged with
  handling a request for a nonconstant number of objects, and a note about
  compiler loop unrolling in the constant case was added.
- When determining if the remaining part of a request to be dequeued from
  the backend is too big to be copied via the cache, compare to the
  cache's size (cache->size) instead of the max possible cache size
  (RTE_MEMPOOL_CACHE_MAX_SIZE).
- When refilling the cache, the target fill level was reduced from the
  full cache size to half the cache size. This allows some room for a
  put() request following a get() request where the cache was refilled,
  without "flapping" between draining and refilling the entire cache.
  Note: Before this patch, the distance between the flush threshold and
  the refill level was also half a cache size.
- A copy of cache->len in the local variable "len" is no longer needed,
  so it was removed.
- Added a group of __rte_assume()'s, for compiler optimization when "n" is
  compile time constant.

Some comments were also updated.

Furthermore, some likely()/unlikely()'s were added to a few inline
functions; most prominently rte_mempool_default_cache(), which is used by
both rte_mempool_put_bulk() and rte_mempool_get_bulk().

And finally, RTE_ASSERT()'s were added to check the return values of the
mempool driver dequeue() and enqueue() operations.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v20:
* Added more __rte_assume()'s to fix build error with GCC 11.4.1 and
  GCC 11.5.0 in call to mempool_get_bulk() with compile time constant "n"
  larger than RTE_MEMPOOL_CACHE_MAX_SIZE.
v19:
* Added __rte_assume()'s and RTE_ASSERT()'s.
v18:
* Start over from scratch, to avoid API/ABI breakage.
v17:
* Update rearm in idpf driver.
v16:
* Fix bug in rte_mempool_do_generic_put() regarding criteria for flush.
v15:
* Changed back cache bypass limit from n >= RTE_MEMPOOL_CACHE_MAX_SIZE to
  n > RTE_MEMPOOL_CACHE_MAX_SIZE.
* Removed cache size limit from serving via cache.
v14:
* Change rte_mempool_do_generic_put() back from add-then-flush to
  flush-then-add.
  Keep the target cache fill level of ca. 1/2 size of the cache.
v13:
* Target a cache fill level of ca. 1/2 size of the cache when flushing and
  refilling; based on an assumption of equal probability of get and put,
  instead of assuming a higher probability of put being followed by
  another put, and get being followed by another get.
* Reduce the amount of changes to the drivers.
v12:
* Do not init mempool caches with size zero; they don't exist.
  Bug introduced in v10.
v11:
* Removed rte_mempool_do_generic_get_split().
v10:
* Initialize mempool caches, regardless of size zero.
  This to fix compiler warning about out of bounds access.
v9:
* Removed factor 1.5 from description of cache_size parameter to
  rte_mempool_create().
* Refactored rte_mempool_do_generic_put() to eliminate some gotos.
  No functional change.
* Removed check for n >= RTE_MEMPOOL_CACHE_MAX_SIZE in
  rte_mempool_do_generic_get(); it caused the function to fail when the
  request could not be served from the backend alone, but it could be
  served from the cache and the backend.
* Refactored rte_mempool_do_generic_get_split() to make it shorter.
* When getting objects directly from the backend, use burst size aligned
  with either CPU cache line size or mempool cache size.
v8:
* Rewrote rte_mempool_do_generic_put() to get rid of transaction
  splitting. Use a method similar to the existing put method with fill
  followed by flush if overfilled.
  This also made rte_mempool_do_generic_put_split() obsolete.
* When flushing the cache as much as we can, use burst size aligned with
  either CPU cache line size or mempool cache size.
v7:
* Increased max mempool cache size from 512 to 1024 objects.
  Mainly for CI performance test purposes.
  Originally, the max mempool cache size was 768 objects, and used a fixed
  size array of 1024 objects in the mempool cache structure.
v6:
* Fix v5 incomplete implementation of passing large requests directly to
  the backend.
* Use memcpy instead of rte_memcpy where compiler complains about it.
* Added const to some function parameters.
v5:
* Moved helper functions back into the header file, for improved
  performance.
* Pass large requests directly to the backend. This also simplifies the
  code.
v4:
* Updated subject to reflect that misleading names are considered bugs.
* Rewrote patch description to provide more details about the bugs fixed.
  (Mattias Rönnblom)
* Moved helper functions, not to be inlined, to mempool C file.
  (Mattias Rönnblom)
* Pass requests for n >= RTE_MEMPOOL_CACHE_MAX_SIZE objects known at build
  time directly to backend driver, to avoid calling the helper functions.
  This also fixes the compiler warnings about out of bounds array access.
v3:
* Removed __attribute__(assume).
v2:
* Removed mempool perf test; not part of patch set.
---
 lib/mempool/rte_mempool.c |   5 +-
 lib/mempool/rte_mempool.h | 108 +++++++++++++++++---------------------
 2 files changed, 50 insertions(+), 63 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 1e4f24783c..cddc896442 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -50,10 +50,9 @@ static void
 mempool_event_callback_invoke(enum rte_mempool_event event,
 			      struct rte_mempool *mp);
 
-/* Note: avoid using floating point since that compiler
- * may not think that is constant.
+/* Note: This is no longer 1.5 * size, but simply 1 * size.
  */
-#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2)
+#define CALC_CACHE_FLUSHTHRESH(c) (c)
 
 #if defined(RTE_ARCH_X86)
 /*
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index c495cc012f..de1b41d899 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -791,7 +791,8 @@ rte_mempool_ops_dequeue_bulk(struct rte_mempool *mp,
 	rte_mempool_trace_ops_dequeue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
 	ret = ops->dequeue(mp, obj_table, n);
-	if (ret == 0) {
+	RTE_ASSERT(ret <= 0);
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_objs, n);
 	}
@@ -848,6 +849,7 @@ rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const *obj_table,
 	rte_mempool_trace_ops_enqueue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
 	ret = ops->enqueue(mp, obj_table, n);
+	RTE_ASSERT(ret <= 0);
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	if (unlikely(ret < 0))
 		RTE_MEMPOOL_LOG(CRIT, "cannot enqueue %u objects to mempool %s",
@@ -1044,7 +1046,7 @@ rte_mempool_free(struct rte_mempool *mp);
  *   If cache_size is non-zero, the rte_mempool library will try to
  *   limit the accesses to the common lockless pool, by maintaining a
  *   per-lcore object cache. This argument must be lower or equal to
- *   RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
+ *   RTE_MEMPOOL_CACHE_MAX_SIZE and n. It is advised to choose
  *   cache_size to have "n modulo cache_size == 0": if this is
  *   not the case, some elements will always stay in the pool and will
  *   never be used. The access to the per-lcore table is of course
@@ -1333,10 +1335,10 @@ rte_mempool_cache_free(struct rte_mempool_cache *cache);
 static __rte_always_inline struct rte_mempool_cache *
 rte_mempool_default_cache(struct rte_mempool *mp, unsigned lcore_id)
 {
-	if (mp->cache_size == 0)
+	if (unlikely(mp->cache_size == 0))
 		return NULL;
 
-	if (lcore_id >= RTE_MAX_LCORE)
+	if (unlikely(lcore_id >= RTE_MAX_LCORE))
 		return NULL;
 
 	rte_mempool_trace_default_cache(mp, lcore_id,
@@ -1383,32 +1385,33 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 {
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL))
 		goto driver_enqueue;
 
-	/* increment stat now, adding in mempool always success */
+	/* Increment stats now, adding in mempool always succeeds. */
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-	/* The request itself is too big for the cache */
-	if (unlikely(n > cache->flushthresh))
-		goto driver_enqueue_stats_incremented;
-
-	/*
-	 * The cache follows the following algorithm:
-	 *   1. If the objects cannot be added to the cache without crossing
-	 *      the flush threshold, flush the cache to the backend.
-	 *   2. Add the objects to the cache.
-	 */
-
-	if (cache->len + n <= cache->flushthresh) {
+	__rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	__rte_assume(cache->len <= cache->size);
+	if (likely(cache->len + n <= cache->size)) {
+		/* Sufficient room in the cache for the objects. */
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
-	} else {
+	} else if (n <= cache->size) {
+		/*
+		 * The cache is big enough for the objects, but - as detected by
+		 * the comparison above - has insufficient room for them.
+		 * Flush the cache to make room for the objects.
+		 */
 		cache_objs = &cache->objs[0];
 		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
 		cache->len = n;
+	} else {
+		/* The request itself is too big for the cache. */
+		goto driver_enqueue_stats_incremented;
 	}
 
 	/* Add the objects to the cache. */
@@ -1512,10 +1515,10 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	unsigned int remaining;
-	uint32_t index, len;
+	uint32_t index;
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL)) {
 		remaining = n;
 		goto driver_dequeue;
@@ -1524,11 +1527,12 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* The cache is a stack, so copy will be in reverse order. */
 	cache_objs = &cache->objs[cache->len];
 
-	if (__rte_constant(n) && n <= cache->len) {
+	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	if (likely(n <= cache->len)) {
 		/*
-		 * The request size is known at build time, and
-		 * the entire request can be satisfied from the cache,
-		 * so let the compiler unroll the fixed length copy loop.
+		 * The entire request can be satisfied from the cache.
+		 * Note: If the request size is known at build time,
+		 * the compiler will unroll the fixed length copy loop.
 		 */
 		cache->len -= n;
 		for (index = 0; index < n; index++)
@@ -1540,55 +1544,38 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 		return 0;
 	}
 
-	/*
-	 * Use the cache as much as we have to return hot objects first.
-	 * If the request size 'n' is known at build time, the above comparison
-	 * ensures that n > cache->len here, so omit RTE_MIN().
-	 */
-	len = __rte_constant(n) ? cache->len : RTE_MIN(n, cache->len);
-	cache->len -= len;
-	remaining = n - len;
-	for (index = 0; index < len; index++)
+	/* Use the cache as much as we have to return hot objects first. */
+	for (index = 0; index < cache->len; index++)
 		*obj_table++ = *--cache_objs;
+	remaining = n - cache->len;
+	cache->len = 0;
 
-	/*
-	 * If the request size 'n' is known at build time, the case
-	 * where the entire request can be satisfied from the cache
-	 * has already been handled above, so omit handling it here.
-	 */
-	if (!__rte_constant(n) && remaining == 0) {
-		/* The entire request is satisfied from the cache. */
-
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
-
-		return 0;
-	}
-
-	/* if dequeue below would overflow mem allocated for cache */
-	if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* The remaining request is too big for the cache? */
+	__rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	if (unlikely(remaining > cache->size))
 		goto driver_dequeue;
 
-	/* Fill the cache from the backend; fetch size + remaining objects. */
+	/* Fill the cache from the backend; fetch size / 2 + remaining objects. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs,
-			cache->size + remaining);
+			cache->size / 2 + remaining);
 	if (unlikely(ret < 0)) {
 		/*
-		 * We are buffer constrained, and not able to allocate
-		 * cache + remaining.
+		 * We are buffer constrained, and not able to fetch all that.
 		 * Do not fill the cache, just satisfy the remaining part of
 		 * the request directly from the backend.
 		 */
 		goto driver_dequeue;
 	}
 
+	cache->len = cache->size / 2;
+
 	/* Satisfy the remaining part of the request from the filled cache. */
-	cache_objs = &cache->objs[cache->size + remaining];
+	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
+	__rte_assume(remaining <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	cache_objs = &cache->objs[cache->len + remaining];
 	for (index = 0; index < remaining; index++)
 		*obj_table++ = *--cache_objs;
 
-	cache->len = cache->size;
-
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
 
@@ -1599,7 +1586,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* Get remaining objects directly from the backend. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, obj_table, remaining);
 
-	if (ret < 0) {
+	if (unlikely(ret < 0)) {
 		if (likely(cache != NULL)) {
 			cache->len = n - remaining;
 			/*
@@ -1619,6 +1606,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 			RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
 			RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
 		}
+		__rte_assume(ret == 0);
 	}
 
 	return ret;
@@ -1650,7 +1638,7 @@ rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	ret = rte_mempool_do_generic_get(mp, obj_table, n, cache);
-	if (ret == 0)
+	if (likely(ret == 0))
 		RTE_MEMPOOL_CHECK_COOKIES(mp, obj_table, n, 1);
 	rte_mempool_trace_generic_get(mp, obj_table, n, cache);
 	return ret;
@@ -1741,7 +1729,7 @@ rte_mempool_get_contig_blocks(struct rte_mempool *mp,
 	int ret;
 
 	ret = rte_mempool_ops_dequeue_contig_blocks(mp, first_obj_table, n);
-	if (ret == 0) {
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_blks, n);
 		RTE_MEMPOOL_CONTIG_BLOCKS_CHECK_COOKIES(mp, first_obj_table, n,
-- 
2.43.0



* Re: [PATCH] sched: fix wrr parameter data type
  @ 2025-02-21 19:14  3% ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2025-02-21 19:14 UTC (permalink / raw)
  To: Megha Ajmera; +Cc: dev, jasvinder.singh, cristian.dumitrescu

On Fri, 21 Feb 2025 18:17:55 +0530
Megha Ajmera <megha.ajmera@intel.com> wrote:

> wrr tokens are getting truncated to uint8_t in the wrr_store() function
> due to a type mismatch. This patch changes the data type to uint16_t.
> 
> Fixes: e16b06da0908 ("sched: remove WRR from strict priority TC queues")
> 
> Signed-off-by: Megha Ajmera <megha.ajmera@intel.com>
> ---
>  lib/sched/rte_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/sched/rte_sched.c b/lib/sched/rte_sched.c
> index d8ee4e7e91..dcef44b91b 100644
> --- a/lib/sched/rte_sched.c
> +++ b/lib/sched/rte_sched.c
> @@ -66,7 +66,7 @@ struct __rte_cache_aligned rte_sched_pipe {
>  	uint64_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>  
>  	/* Weighted Round Robin (WRR) */
> -	uint8_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
> +	uint16_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
>  
>  	/* TC oversubscription */
>  	uint64_t tc_ov_credits;

This would be a change in ABI.


* [RFC PATCH v19] mempool: fix mempool cache size
    2025-02-21 15:13  4% ` [RFC PATCH v18] mempool: fix mempool " Morten Brørup
@ 2025-02-21 19:05  3% ` Morten Brørup
  2025-02-21 20:27  3% ` [RFC PATCH v20] " Morten Brørup
  2 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2025-02-21 19:05 UTC (permalink / raw)
  To: dev; +Cc: Morten Brørup

NOTE: THIS VERSION DOES NOT BREAK THE API/ABI.

First, a per-lcore mempool cache could hold 50 % more than the cache's
size.
Since application developers do not expect this behavior, it could lead to
application failure.
This patch fixes this bug without breaking the API/ABI, by using the
mempool cache's "size" instead of the "flushthresh" as the threshold for
how many objects can be held in a mempool cache.
Note: The "flushthresh" field can be removed from the cache structure in a
future API/ABI breaking release, which must be announced in advance.

Second, requests to fetch a number of objects from the backend driver
exceeding the cache's size (but less than RTE_MEMPOOL_CACHE_MAX_SIZE) were
copied twice; first to the cache, and from there to the destination.
Such superfluous copying through the mempool cache degrades the
performance in these cases.
This patch also fixes this misbehavior, so when fetching more objects from
the driver than the mempool cache's size, they are fetched directly to the
destination.

The internal macro to calculate the cache flush threshold was updated to
reflect the new flush threshold of 1 * size instead of 1.5 * size.

The function rte_mempool_do_generic_put() for adding objects to a mempool
was modified as follows:
- When determining if the cache has sufficient room for the request
  without flushing, compare to the cache's size (cache->size) instead of
  the obsolete flush threshold (cache->flushthresh).
- The comparison for the request being too big, which is considered
  unlikely, was moved down and out of the code path where the cache has
  sufficient room for the added objects, which is considered the most
  likely code path.
- Added __rte_assume() about the cache size, for compiler optimization
  when "n" is compile time constant.
- Added __rte_assume() about "ret", for compiler optimization of
  rte_mempool_generic_get() considering the return value of
  rte_mempool_do_generic_get().

The function rte_mempool_do_generic_get() for getting objects from a
mempool was refactored as follows:
- Handling a request for a constant number of objects was merged with
  handling a request for a nonconstant number of objects, and a note about
  compiler loop unrolling in the constant case was added.
- When determining if the remaining part of a request to be dequeued from
  the backend is too big to be copied via the cache, compare to the
  cache's size (cache->size) instead of the max possible cache size
  (RTE_MEMPOOL_CACHE_MAX_SIZE).
- When refilling the cache, the target fill level was reduced from the
  full cache size to half the cache size. This allows some room for a
  put() request following a get() request where the cache was refilled,
  without "flapping" between draining and refilling the entire cache.
  Note: Before this patch, the distance between the flush threshold and
  the refill level was also half a cache size.
- A copy of cache->len in the local variable "len" is no longer needed,
  so it was removed.
- Added a group of __rte_assume()'s, for compiler optimization when "n" is
  compile time constant.

Some comments were also updated.

Furthermore, some likely()/unlikely()'s were added to a few inline
functions; most prominently rte_mempool_default_cache(), which is used by
both rte_mempool_put_bulk() and rte_mempool_get_bulk().

And finally, RTE_ASSERT()'s were added to check the return values of the
mempool driver dequeue() and enqueue() operations.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v19:
* Added __rte_assume()'s and RTE_ASSERT()'s.
v18:
* Start over from scratch, to avoid API/ABI breakage.
v17:
* Update rearm in idpf driver.
v16:
* Fix bug in rte_mempool_do_generic_put() regarding criteria for flush.
v15:
* Changed back cache bypass limit from n >= RTE_MEMPOOL_CACHE_MAX_SIZE to
  n > RTE_MEMPOOL_CACHE_MAX_SIZE.
* Removed cache size limit from serving via cache.
v14:
* Change rte_mempool_do_generic_put() back from add-then-flush to
  flush-then-add.
  Keep the target cache fill level of ca. 1/2 size of the cache.
v13:
* Target a cache fill level of ca. 1/2 size of the cache when flushing and
  refilling; based on an assumption of equal probability of get and put,
  instead of assuming a higher probability of put being followed by
  another put, and get being followed by another get.
* Reduce the amount of changes to the drivers.
v12:
* Do not init mempool caches with size zero; they don't exist.
  Bug introduced in v10.
v11:
* Removed rte_mempool_do_generic_get_split().
v10:
* Initialize mempool caches, regardless of size zero.
  This to fix compiler warning about out of bounds access.
v9:
* Removed factor 1.5 from description of cache_size parameter to
  rte_mempool_create().
* Refactored rte_mempool_do_generic_put() to eliminate some gotos.
  No functional change.
* Removed check for n >= RTE_MEMPOOL_CACHE_MAX_SIZE in
  rte_mempool_do_generic_get(); it caused the function to fail when the
  request could not be served from the backend alone, but it could be
  served from the cache and the backend.
* Refactored rte_mempool_do_generic_get_split() to make it shorter.
* When getting objects directly from the backend, use burst size aligned
  with either CPU cache line size or mempool cache size.
v8:
* Rewrote rte_mempool_do_generic_put() to get rid of transaction
  splitting. Use a method similar to the existing put method with fill
  followed by flush if overfilled.
  This also made rte_mempool_do_generic_put_split() obsolete.
* When flushing the cache as much as we can, use burst size aligned with
  either CPU cache line size or mempool cache size.
v7:
* Increased max mempool cache size from 512 to 1024 objects.
  Mainly for CI performance test purposes.
  Originally, the max mempool cache size was 768 objects, and used a fixed
  size array of 1024 objects in the mempool cache structure.
v6:
* Fix v5 incomplete implementation of passing large requests directly to
  the backend.
* Use memcpy instead of rte_memcpy where compiler complains about it.
* Added const to some function parameters.
v5:
* Moved helper functions back into the header file, for improved
  performance.
* Pass large requests directly to the backend. This also simplifies the
  code.
v4:
* Updated subject to reflect that misleading names are considered bugs.
* Rewrote patch description to provide more details about the bugs fixed.
  (Mattias Rönnblom)
* Moved helper functions, not to be inlined, to mempool C file.
  (Mattias Rönnblom)
* Pass requests for n >= RTE_MEMPOOL_CACHE_MAX_SIZE objects known at build
  time directly to backend driver, to avoid calling the helper functions.
  This also fixes the compiler warnings about out of bounds array access.
v3:
* Removed __attribute__(assume).
v2:
* Removed mempool perf test; not part of patch set.
---
 lib/mempool/rte_mempool.c |   5 +-
 lib/mempool/rte_mempool.h | 105 ++++++++++++++++----------------------
 2 files changed, 47 insertions(+), 63 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 1e4f24783c..cddc896442 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -50,10 +50,9 @@ static void
 mempool_event_callback_invoke(enum rte_mempool_event event,
 			      struct rte_mempool *mp);
 
-/* Note: avoid using floating point since that compiler
- * may not think that is constant.
+/* Note: This is no longer 1.5 * size, but simply 1 * size.
  */
-#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2)
+#define CALC_CACHE_FLUSHTHRESH(c) (c)
 
 #if defined(RTE_ARCH_X86)
 /*
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index c495cc012f..7742677c01 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -791,7 +791,8 @@ rte_mempool_ops_dequeue_bulk(struct rte_mempool *mp,
 	rte_mempool_trace_ops_dequeue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
 	ret = ops->dequeue(mp, obj_table, n);
-	if (ret == 0) {
+	RTE_ASSERT(ret <= 0);
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_objs, n);
 	}
@@ -848,6 +849,7 @@ rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const *obj_table,
 	rte_mempool_trace_ops_enqueue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
 	ret = ops->enqueue(mp, obj_table, n);
+	RTE_ASSERT(ret <= 0);
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	if (unlikely(ret < 0))
 		RTE_MEMPOOL_LOG(CRIT, "cannot enqueue %u objects to mempool %s",
@@ -1044,7 +1046,7 @@ rte_mempool_free(struct rte_mempool *mp);
  *   If cache_size is non-zero, the rte_mempool library will try to
  *   limit the accesses to the common lockless pool, by maintaining a
  *   per-lcore object cache. This argument must be lower or equal to
- *   RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
+ *   RTE_MEMPOOL_CACHE_MAX_SIZE and n. It is advised to choose
  *   cache_size to have "n modulo cache_size == 0": if this is
  *   not the case, some elements will always stay in the pool and will
  *   never be used. The access to the per-lcore table is of course
@@ -1333,10 +1335,10 @@ rte_mempool_cache_free(struct rte_mempool_cache *cache);
 static __rte_always_inline struct rte_mempool_cache *
 rte_mempool_default_cache(struct rte_mempool *mp, unsigned lcore_id)
 {
-	if (mp->cache_size == 0)
+	if (unlikely(mp->cache_size == 0))
 		return NULL;
 
-	if (lcore_id >= RTE_MAX_LCORE)
+	if (unlikely(lcore_id >= RTE_MAX_LCORE))
 		return NULL;
 
 	rte_mempool_trace_default_cache(mp, lcore_id,
@@ -1383,32 +1385,33 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 {
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL))
 		goto driver_enqueue;
 
-	/* increment stat now, adding in mempool always success */
+	/* Increment stats now, adding in mempool always succeeds. */
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-	/* The request itself is too big for the cache */
-	if (unlikely(n > cache->flushthresh))
-		goto driver_enqueue_stats_incremented;
-
-	/*
-	 * The cache follows the following algorithm:
-	 *   1. If the objects cannot be added to the cache without crossing
-	 *      the flush threshold, flush the cache to the backend.
-	 *   2. Add the objects to the cache.
-	 */
-
-	if (cache->len + n <= cache->flushthresh) {
+	__rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	__rte_assume(cache->len <= cache->size);
+	if (likely(cache->len + n <= cache->size)) {
+		/* Sufficient room in the cache for the objects. */
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
-	} else {
+	} else if (n <= cache->size) {
+		/*
+		 * The cache is big enough for the objects, but - as detected by
+		 * the comparison above - has insufficient room for them.
+		 * Flush the cache to make room for the objects.
+		 */
 		cache_objs = &cache->objs[0];
 		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
 		cache->len = n;
+	} else {
+		/* The request itself is too big for the cache. */
+		goto driver_enqueue_stats_incremented;
 	}
 
 	/* Add the objects to the cache. */
@@ -1512,10 +1515,10 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	unsigned int remaining;
-	uint32_t index, len;
+	uint32_t index;
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL)) {
 		remaining = n;
 		goto driver_dequeue;
@@ -1524,11 +1527,12 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* The cache is a stack, so copy will be in reverse order. */
 	cache_objs = &cache->objs[cache->len];
 
-	if (__rte_constant(n) && n <= cache->len) {
+	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+	if (likely(n <= cache->len)) {
 		/*
-		 * The request size is known at build time, and
-		 * the entire request can be satisfied from the cache,
-		 * so let the compiler unroll the fixed length copy loop.
+		 * The entire request can be satisfied from the cache.
+		 * Note: If the request size is known at build time,
+		 * the compiler will unroll the fixed length copy loop.
 		 */
 		cache->len -= n;
 		for (index = 0; index < n; index++)
@@ -1540,55 +1544,35 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 		return 0;
 	}
 
-	/*
-	 * Use the cache as much as we have to return hot objects first.
-	 * If the request size 'n' is known at build time, the above comparison
-	 * ensures that n > cache->len here, so omit RTE_MIN().
-	 */
-	len = __rte_constant(n) ? cache->len : RTE_MIN(n, cache->len);
-	cache->len -= len;
-	remaining = n - len;
-	for (index = 0; index < len; index++)
+	/* Use the cache as much as we have to return hot objects first. */
+	for (index = 0; index < cache->len; index++)
 		*obj_table++ = *--cache_objs;
+	remaining = n - cache->len;
+	cache->len = 0;
 
-	/*
-	 * If the request size 'n' is known at build time, the case
-	 * where the entire request can be satisfied from the cache
-	 * has already been handled above, so omit handling it here.
-	 */
-	if (!__rte_constant(n) && remaining == 0) {
-		/* The entire request is satisfied from the cache. */
-
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
-
-		return 0;
-	}
-
-	/* if dequeue below would overflow mem allocated for cache */
-	if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* The remaining request is too big for the cache? */
+	if (unlikely(remaining > cache->size))
 		goto driver_dequeue;
 
-	/* Fill the cache from the backend; fetch size + remaining objects. */
+	/* Fill the cache from the backend; fetch size / 2 + remaining objects. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs,
-			cache->size + remaining);
+			cache->size / 2 + remaining);
 	if (unlikely(ret < 0)) {
 		/*
-		 * We are buffer constrained, and not able to allocate
-		 * cache + remaining.
+		 * We are buffer constrained, and not able to fetch all that.
 		 * Do not fill the cache, just satisfy the remaining part of
 		 * the request directly from the backend.
 		 */
 		goto driver_dequeue;
 	}
 
+	cache->len = cache->size / 2;
+
 	/* Satisfy the remaining part of the request from the filled cache. */
-	cache_objs = &cache->objs[cache->size + remaining];
+	cache_objs = &cache->objs[cache->len + remaining];
 	for (index = 0; index < remaining; index++)
 		*obj_table++ = *--cache_objs;
 
-	cache->len = cache->size;
-
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
 
@@ -1599,7 +1583,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* Get remaining objects directly from the backend. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, obj_table, remaining);
 
-	if (ret < 0) {
+	if (unlikely(ret < 0)) {
 		if (likely(cache != NULL)) {
 			cache->len = n - remaining;
 			/*
@@ -1619,6 +1603,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 			RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
 			RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
 		}
+		__rte_assume(ret == 0);
 	}
 
 	return ret;
@@ -1650,7 +1635,7 @@ rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	ret = rte_mempool_do_generic_get(mp, obj_table, n, cache);
-	if (ret == 0)
+	if (likely(ret == 0))
 		RTE_MEMPOOL_CHECK_COOKIES(mp, obj_table, n, 1);
 	rte_mempool_trace_generic_get(mp, obj_table, n, cache);
 	return ret;
@@ -1741,7 +1726,7 @@ rte_mempool_get_contig_blocks(struct rte_mempool *mp,
 	int ret;
 
 	ret = rte_mempool_ops_dequeue_contig_blocks(mp, first_obj_table, n);
-	if (ret == 0) {
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_blks, n);
 		RTE_MEMPOOL_CONTIG_BLOCKS_CHECK_COOKIES(mp, first_obj_table, n,
-- 
2.43.0


^ permalink raw reply	[relevance 3%]
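The refill policy in the mempool patch above (drain the cache, then refill it to half capacity plus the outstanding part, bypassing the cache for oversized requests) can be sketched as a minimal standalone model. This is an illustrative simplification, not the DPDK API: `mini_cache` and `mini_get` are invented names, and backend I/O is reduced to a returned object count.

```c
#include <assert.h>

/* Miniature model of the reworked dequeue path: after serving a get of
 * n objects, the cache is refilled to size/2 (not size), leaving
 * headroom for a subsequent put. */
struct mini_cache {
	unsigned int len;	/* objects currently cached */
	unsigned int size;	/* cache capacity */
};

/* Returns how many objects the backend must supply for a get of n,
 * updating the cache fill level, mirroring the patched
 * rte_mempool_do_generic_get() logic. */
static unsigned int
mini_get(struct mini_cache *c, unsigned int n)
{
	unsigned int remaining;

	/* Serve hot objects from the cache first. */
	remaining = n > c->len ? n - c->len : 0;
	if (remaining == 0) {
		c->len -= n;
		return 0;	/* fully satisfied from the cache */
	}
	c->len = 0;

	/* Remaining request too big for the cache: bypass it. */
	if (remaining > c->size)
		return remaining;

	/* Refill to half capacity plus the outstanding part. */
	c->len = c->size / 2;
	return c->size / 2 + remaining;
}
```

For a cache of size 32 holding 4 objects, a get of 10 serves 4 from the cache and fetches 16 + 6 = 22 from the backend, leaving the cache half full.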

* [v3 4/6] crypto/virtio: add vDPA backend
  @ 2025-02-21 17:41  1% ` Gowrishankar Muthukrishnan
  0 siblings, 0 replies; 200+ results
From: Gowrishankar Muthukrishnan @ 2025-02-21 17:41 UTC (permalink / raw)
  To: dev, Jay Zhou; +Cc: anoobj, Akhil Goyal, Gowrishankar Muthukrishnan

Add vDPA backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_logs.h           |   6 +-
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 drivers/crypto/virtio/virtio_user/vhost.h     |  90 ++
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 710 ++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 767 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  85 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 575 +++++++++++++
 11 files changed, 2283 insertions(+), 30 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost.h
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index d2c3b3ad07..3763e86746 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -16,3 +16,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index 92fea557ab..bc737f1e68 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on the probe() function.
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index f8498246e2..fad73d54a8 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -76,4 +76,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_logs.h b/drivers/crypto/virtio/virtio_logs.h
index 988514919f..1cc51f7990 100644
--- a/drivers/crypto/virtio/virtio_logs.h
+++ b/drivers/crypto/virtio/virtio_logs.h
@@ -15,8 +15,10 @@ extern int virtio_crypto_logtype_init;
 
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
-extern int virtio_crypto_logtype_init;
-#define RTE_LOGTYPE_VIRTIO_CRYPTO_INIT virtio_crypto_logtype_init
+extern int virtio_crypto_logtype_driver;
+#define RTE_LOGTYPE_VIRTIO_CRYPTO_DRIVER virtio_crypto_logtype_driver
+#define PMD_DRV_LOG(level, ...) \
+	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_DRIVER, "%s(): ", __func__, __VA_ARGS__)
 
 #define VIRTIO_CRYPTO_INIT_LOG_IMPL(level, ...) \
 	RTE_LOG_LINE_PREFIX(level, VIRTIO_CRYPTO_INIT, "%s(): ", __func__, __VA_ARGS__)
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost.h b/drivers/crypto/virtio/virtio_user/vhost.h
new file mode 100644
index 0000000000..29cc1a14d4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#ifndef _VIRTIO_USER_VHOST_H
+#define _VIRTIO_USER_VHOST_H
+
+#include <stdint.h>
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <rte_errno.h>
+
+#include "../virtio_logs.h"
+
+struct vhost_vring_state {
+	unsigned int index;
+	unsigned int num;
+};
+
+struct vhost_vring_file {
+	unsigned int index;
+	int fd;
+};
+
+struct vhost_vring_addr {
+	unsigned int index;
+	/* Option flags. */
+	unsigned int flags;
+	/* Flag values: */
+	/* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+	/* Start of array of descriptors (virtually contiguous) */
+	uint64_t desc_user_addr;
+	/* Used structure address. Must be 32 bit aligned */
+	uint64_t used_user_addr;
+	/* Available structure address. Must be 16 bit aligned */
+	uint64_t avail_user_addr;
+	/* Logging support. */
+	/* Log writes to used structure, at offset calculated from specified
+	 * address. Address must be 32 bit aligned.
+	 */
+	uint64_t log_guest_addr;
+};
+
+#ifndef VHOST_BACKEND_F_IOTLB_MSG_V2
+#define VHOST_BACKEND_F_IOTLB_MSG_V2 1
+#endif
+
+#ifndef VHOST_BACKEND_F_IOTLB_BATCH
+#define VHOST_BACKEND_F_IOTLB_BATCH 2
+#endif
+
+struct virtio_user_dev;
+
+struct virtio_user_backend_ops {
+	int (*setup)(struct virtio_user_dev *dev);
+	int (*destroy)(struct virtio_user_dev *dev);
+	int (*get_backend_features)(uint64_t *features);
+	int (*set_owner)(struct virtio_user_dev *dev);
+	int (*get_features)(struct virtio_user_dev *dev, uint64_t *features);
+	int (*set_features)(struct virtio_user_dev *dev, uint64_t features);
+	int (*set_memory_table)(struct virtio_user_dev *dev);
+	int (*set_vring_num)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*get_vring_base)(struct virtio_user_dev *dev, struct vhost_vring_state *state);
+	int (*set_vring_call)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_kick)(struct virtio_user_dev *dev, struct vhost_vring_file *file);
+	int (*set_vring_addr)(struct virtio_user_dev *dev, struct vhost_vring_addr *addr);
+	int (*get_status)(struct virtio_user_dev *dev, uint8_t *status);
+	int (*set_status)(struct virtio_user_dev *dev, uint8_t status);
+	int (*get_config)(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len);
+	int (*set_config)(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off,
+			uint32_t len);
+	int (*cvq_enable)(struct virtio_user_dev *dev, int enable);
+	int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int enable);
+	int (*dma_map)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*dma_unmap)(struct virtio_user_dev *dev, void *addr, uint64_t iova, size_t len);
+	int (*update_link_state)(struct virtio_user_dev *dev);
+	int (*server_disconnect)(struct virtio_user_dev *dev);
+	int (*server_reconnect)(struct virtio_user_dev *dev);
+	int (*get_intr_fd)(struct virtio_user_dev *dev);
+	int (*map_notification_area)(struct virtio_user_dev *dev);
+	int (*unmap_notification_area)(struct virtio_user_dev *dev);
+};
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+#endif
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..b5839875e6
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,710 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "vhost.h"
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_owner(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_OWNER, NULL);
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	/* WORKAROUND */
+	features |= 1ULL << VIRTIO_F_IOMMU_PLATFORM;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_iotlb_batch_begin(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch begin (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_iotlb_batch_end(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
+		return 0;
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_BATCH_END;
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch end (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_UPDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.uaddr = (uint64_t)(uintptr_t)addr;
+	msg.iotlb.size = len;
+	msg.iotlb.perm = VHOST_ACCESS_RW;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", addr: %p, len: 0x%zx",
+			__func__, iova, addr, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB update (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
+				  uint64_t iova, size_t len)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	struct vhost_msg msg = {};
+
+	if (!(data->protocol_features & (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2))) {
+		PMD_DRV_LOG(ERR, "IOTLB_MSG_V2 not supported by the backend.");
+		return -1;
+	}
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.size = len;
+
+	PMD_DRV_LOG(DEBUG, "%s: iova: 0x%" PRIx64 ", len: 0x%zx",
+			__func__, iova, len);
+
+	if (write(data->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB invalidate (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_map_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_map(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_dma_unmap_batch(struct virtio_user_dev *dev, void *addr,
+				  uint64_t iova, size_t len)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	ret = vhost_vdpa_dma_unmap(dev, addr, iova, len);
+
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	if (msl->external)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, len);
+}
+
+static int
+vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	/* skip external memory that isn't a heap */
+	if (msl->external && !msl->heap)
+		return 0;
+
+	/* skip any segments with invalid IOVA addresses */
+	if (ms->iova == RTE_BAD_IOVA)
+		return 0;
+
+	/* if IOVA mode is VA, we've already mapped the internal segments */
+	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, ms->len);
+}
+
+static int
+vhost_vdpa_set_memory_table(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
+		return -1;
+
+	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
+
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		/* with IOVA as VA mode, we can get away with mapping contiguous
+		 * chunks rather than going page-by-page.
+		 */
+		ret = rte_memseg_contig_walk_thread_unsafe(
+				vhost_vdpa_map_contig, dev);
+		if (ret)
+			goto batch_end;
+		/* we have to continue the walk because we've skipped the
+		 * external segments during the config walk.
+		 */
+	}
+	ret = rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
+
+batch_end:
+	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
+		return -1;
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+static int
+vhost_vdpa_set_vring_num(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_NUM, state);
+}
+
+static int
+vhost_vdpa_set_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_get_vring_base(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_VRING_BASE, state);
+}
+
+static int
+vhost_vdpa_set_vring_call(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_CALL, file);
+}
+
+static int
+vhost_vdpa_set_vring_kick(struct virtio_user_dev *dev, struct vhost_vring_file *file)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_KICK, file);
+}
+
+static int
+vhost_vdpa_set_vring_addr(struct virtio_user_dev *dev, struct vhost_vring_addr *addr)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_VRING_ADDR, addr);
+}
+
+static int
+vhost_vdpa_get_status(struct virtio_user_dev *dev, uint8_t *status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_GET_STATUS, status);
+}
+
+static int
+vhost_vdpa_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_STATUS, &status);
+}
+
+static int
+vhost_vdpa_get_config(struct virtio_user_dev *dev, uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_GET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+		goto out;
+	}
+
+	memcpy(data, config->buf, len);
+out:
+	free(config);
+
+	return ret;
+}
+
+static int
+vhost_vdpa_set_config(struct virtio_user_dev *dev, const uint8_t *data, uint32_t off, uint32_t len)
+{
+	struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+	struct vhost_vdpa_config *config;
+	int ret = 0;
+
+	config = malloc(sizeof(*config) + len);
+	if (!config) {
+		PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data");
+		return -1;
+	}
+
+	config->off = off;
+	config->len = len;
+
+	memcpy(config->buf, data, len);
+
+	ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_SET_CONFIG, config);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to set vDPA config (offset 0x%x, len 0x%x)", off, len);
+		ret = -1;
+	}
+
+	free(config);
+
+	return ret;
+}
+
+/**
+ * Set up the environment to talk to a vhost-vDPA backend.
+ *
+ * @return
+ *   - (-1) on failure;
+ *   - 0 on success.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_destroy(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	if (!data)
+		return 0;
+
+	close(data->vhostfd);
+
+	free(data);
+	dev->backend_data = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_backend_features(uint64_t *features)
+{
+	*features = 0;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	/* TODO: workaround until there is a cleaner way to query the crypto device status */
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_intr_fd(struct virtio_user_dev *dev __rte_unused)
+{
+	/* No link state interrupt with Vhost-vDPA */
+	return -1;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings + 1; i++) {	/* +1: the CQ vring is also mapped */
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.destroy = vhost_vdpa_destroy,
+	.get_backend_features = vhost_vdpa_get_backend_features,
+	.set_owner = vhost_vdpa_set_owner,
+	.get_features = vhost_vdpa_get_features,
+	.set_features = vhost_vdpa_set_features,
+	.set_memory_table = vhost_vdpa_set_memory_table,
+	.set_vring_num = vhost_vdpa_set_vring_num,
+	.set_vring_base = vhost_vdpa_set_vring_base,
+	.get_vring_base = vhost_vdpa_get_vring_base,
+	.set_vring_call = vhost_vdpa_set_vring_call,
+	.set_vring_kick = vhost_vdpa_set_vring_kick,
+	.set_vring_addr = vhost_vdpa_set_vring_addr,
+	.get_status = vhost_vdpa_get_status,
+	.set_status = vhost_vdpa_set_status,
+	.get_config = vhost_vdpa_get_config,
+	.set_config = vhost_vdpa_set_config,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.dma_map = vhost_vdpa_dma_map_batch,
+	.dma_unmap = vhost_vdpa_dma_unmap_batch,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.get_intr_fd = vhost_vdpa_get_intr_fd,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..248df11ccc
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,767 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "vhost.h"
+#include "virtio_logs.h"
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* May use an invalid flag, but some backends use kickfd and
+	 * callfd as criteria to judge if the device is alive, so we
+	 * use real eventfds in the end.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per-virtqueue messages, make sure VHOST_SET_VRING_CALL comes
+	 * first, because vhost depends on this message to allocate the
+	 * virtqueue pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr = RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per-virtqueue messages, make sure VHOST_USER_SET_VRING_KICK
+	 * comes last, because vhost depends on this message to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper support from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	/* TODO: free resource here or caller to check */
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max queue pair from the device so that
+		 * the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (dev->started == false)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_crypto_ops_vdpa;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = 0;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set up backend", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+static inline void *
+virtio_user_iova2virt(struct virtio_user_dev *dev __rte_unused, rte_iova_t iova)
+{
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return (void *)(uintptr_t)iova;
+	else
+		return rte_mem_iova2virt(iova);
+}
+
+static inline int
+desc_is_avail(struct vring_packed_desc *desc, bool wrap_counter)
+{
+	uint16_t flags = rte_atomic_load_explicit(&desc->flags, rte_memory_order_acquire);
+
+	return wrap_counter == !!(flags & VRING_PACKED_DESC_F_AVAIL) &&
+		wrap_counter != !!(flags & VRING_PACKED_DESC_F_USED);
+}
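The wrap-counter test in `desc_is_avail()` above can be exercised in isolation. A minimal standalone sketch, assuming the virtio 1.1 packed-ring flag bit positions (AVAIL is bit 7, USED is bit 15); the names here are illustrative, not part of the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* virtio 1.1 packed-ring descriptor flag bits (spec values) */
#define DESC_F_AVAIL (1u << 7)
#define DESC_F_USED  (1u << 15)

/* A descriptor is available to the device when its AVAIL bit matches
 * the driver's wrap counter and its USED bit does not. */
static bool
packed_desc_is_avail(uint16_t flags, bool wrap_counter)
{
	bool avail = !!(flags & DESC_F_AVAIL);
	bool used = !!(flags & DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}
```

Once the device consumes a descriptor it sets USED to match AVAIL, which makes the same test fail until the driver's wrap counter flips on the next ring pass.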
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated Device Status (0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..9cd9856e5d
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	struct virtio_crypto_hw hw;
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	max_queue_pairs;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* features negotiated with the driver,
+				   * to be synced with the device
+				   */
+	uint64_t	device_features; /* supported features by device */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+	char		path[PATH_MAX];
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+	bool		*qp_enabled;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	bool			hw_cvq;
+	struct virtqueue	*scvq;
+
+	void *backend_data;
+
+	uint16_t **notify_area;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..6dfdb76268
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,575 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "device config write not supported: offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should only be done at probe time.
+		 * So we skip any further requests here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* unmask feature bits defined in vhost user protocol */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* Queue interrupts and the config interrupt are separated in
+	 * virtio-user; here we only report config changes.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function returns the queue size, i.e. the number of descriptors, of a
+ * specified queue. It differs from VHOST_USER_GET_QUEUE_NUM, which is used to
+ * get the max number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, each queue has the same queue size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
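The split-ring address math above (avail ring follows the descriptor table, used ring starts at the next `VIRTIO_VRING_ALIGN` boundary past the avail ring) can be sketched on its own. This is a minimal illustration with hypothetical constant names, assuming 16-byte descriptors and the standard avail header (flags + idx + ring[num]), matching `sizeof(struct vring_desc)` and `offsetof(struct vring_avail, ring[num])`:

```c
#include <stdint.h>

#define DESC_SZ     16   /* sizeof(struct vring_desc) */
#define VRING_ALIGN 4096 /* VIRTIO_VRING_ALIGN */

#define ALIGN_CEIL(v, a) (((v) + (a) - 1) & ~((uint64_t)(a) - 1))

/* Offset of the used ring relative to the descriptor table for a
 * split ring with 'num' entries. */
static uint64_t
split_used_offset(unsigned int num)
{
	uint64_t avail = (uint64_t)num * DESC_SZ;          /* avail follows descs */
	uint64_t avail_end = avail + 4 + 2 * (uint64_t)num; /* flags+idx + ring[num] */

	return ALIGN_CEIL(avail_end, VRING_ALIGN);
}
```

For a 256-entry ring the avail ring ends at byte 4612, so the used ring lands on the next 4 KiB boundary at offset 8192.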
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
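The `VIRTIO_F_NOTIFICATION_DATA` bit layouts commented in `virtio_user_notify_queue()` can be mirrored by small standalone helpers. A sketch with illustrative names (not part of the driver):

```c
#include <stdbool.h>
#include <stdint.h>

/* Packed ring: Bit[0:15] vq queue index, Bit[16:30] avail index,
 * Bit[31] avail wrap counter. */
static uint32_t
notify_data_packed(uint16_t queue_idx, uint16_t avail_idx, bool wrap)
{
	return ((uint32_t)wrap << 31) |
		(((uint32_t)avail_idx & 0x7fff) << 16) |
		queue_idx;
}

/* Split ring: Bit[0:15] vq queue index, Bit[16:31] avail index. */
static uint32_t
notify_data_split(uint16_t queue_idx, uint16_t avail_idx)
{
	return ((uint32_t)avail_idx << 16) | queue_idx;
}
```

The driver writes the resulting 32-bit value to the queue's notification area with `rte_write32()`; without `VIRTIO_F_NOTIFICATION_DATA`, only the 16-bit queue index is written.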
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+
+	if (!value || !extra_args)
+		return -EINVAL;
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps its default value; it is replaced only if
+	 * the 'value' arg is parsed successfully.
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	init_params.private_data_size = sizeof(struct virtio_user_dev);
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -1;
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "error when parsing param");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+uint8_t cryptodev_virtio_user_driver_id;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_user_driver_id);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
-- 
2.25.1


^ permalink raw reply	[relevance 1%]

* [RFC PATCH v18] mempool: fix mempool cache size
  @ 2025-02-21 15:13  4% ` Morten Brørup
  2025-02-21 19:05  3% ` [RFC PATCH v19] " Morten Brørup
  2025-02-21 20:27  3% ` [RFC PATCH v20] " Morten Brørup
  2 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2025-02-21 15:13 UTC (permalink / raw)
  To: dev; +Cc: Morten Brørup

NOTE: THIS VERSION DOES NOT BREAK THE API/ABI.

First, a per-lcore mempool cache could hold 50 % more than the cache's
size.
Since application developers do not expect this behavior, it could lead to
application failure.
This patch fixes this bug without breaking the API/ABI, by using the
mempool cache's "size" instead of the "flushthresh" as the threshold for
how many objects can be held in a mempool cache.
Note: The "flushthresh" field can be removed from the cache structure in a
future API/ABI breaking release, which must be announced in advance.

Second, requests to fetch a number of objects from the backend driver
exceeding the cache's size (but less than RTE_MEMPOOL_CACHE_MAX_SIZE) were
copied twice; first to the cache, and from there to the destination.
Such superfluous copying through the mempool cache degrades the
performance in these cases.
This patch also fixes this misbehavior, so when fetching more objects from
the driver than the mempool cache's size, they are fetched directly to the
destination.

The internal macro to calculate the cache flush threshold was updated to
reflect the new flush threshold of 1 * size instead of 1.5 * size.

The function rte_mempool_do_generic_put() for adding objects to a mempool
was modified as follows:
- When determining if the cache has sufficient room for the request
  without flushing, compare to the cache's size (cache->size) instead of
  the obsolete flush threshold (cache->flushthresh).
- The comparison for the request being too big, which is considered
  unlikely, was moved down and out of the code path where the cache has
  sufficient room for the added objects, which is considered the most
  likely code path.

The function rte_mempool_do_generic_get() for getting objects from a
mempool was refactored as follows:
- Handling a request for a constant number of objects was merged with
  handling a request for a nonconstant number of objects, and a note about
  compiler loop unrolling in the constant case was added.
- When determining if the remaining part of a request to be dequeued from
  the backend is too big to be copied via the cache, compare to the
  cache's size (cache->size) instead of the max possible cache size
  (RTE_MEMPOOL_CACHE_MAX_SIZE).
- When refilling the cache, the target fill level was reduced from the
  full cache size to half the cache size. This allows some room for a
  put() request following a get() request where the cache was refilled,
  without "flapping" between draining and refilling the entire cache.
  Note: Before this patch, the distance between the flush threshold and
  the refill level was also half a cache size.
- A copy of cache->len in the local variable "len" is no longer needed,
  so it was removed.

Furthermore, some likely()/unlikely()'s were added to a few inline
functions; most prominently rte_mempool_default_cache(), which is used by
both rte_mempool_put_bulk() and rte_mempool_get_bulk().

And finally, some comments were updated.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
v18:
* Start over from scratch, to avoid API/ABI breakage.
v17:
* Update rearm in idpf driver.
v16:
* Fix bug in rte_mempool_do_generic_put() regarding criteria for flush.
v15:
* Changed back cache bypass limit from n >= RTE_MEMPOOL_CACHE_MAX_SIZE to
  n > RTE_MEMPOOL_CACHE_MAX_SIZE.
* Removed cache size limit from serving via cache.
v14:
* Change rte_mempool_do_generic_put() back from add-then-flush to
  flush-then-add.
  Keep the target cache fill level of ca. 1/2 size of the cache.
v13:
* Target a cache fill level of ca. 1/2 size of the cache when flushing and
  refilling; based on an assumption of equal probability of get and put,
  instead of assuming a higher probability of put being followed by
  another put, and get being followed by another get.
* Reduce the amount of changes to the drivers.
v12:
* Do not init mempool caches with size zero; they don't exist.
  Bug introduced in v10.
v11:
* Removed rte_mempool_do_generic_get_split().
v10:
* Initialize mempool caches, regardless of size zero.
  This to fix compiler warning about out of bounds access.
v9:
* Removed factor 1.5 from description of cache_size parameter to
  rte_mempool_create().
* Refactored rte_mempool_do_generic_put() to eliminate some gotos.
  No functional change.
* Removed check for n >= RTE_MEMPOOL_CACHE_MAX_SIZE in
  rte_mempool_do_generic_get(); it caused the function to fail when the
  request could not be served from the backend alone, but it could be
  served from the cache and the backend.
* Refactored rte_mempool_do_generic_get_split() to make it shorter.
* When getting objects directly from the backend, use burst size aligned
  with either CPU cache line size or mempool cache size.
v8:
* Rewrote rte_mempool_do_generic_put() to get rid of transaction
  splitting. Use a method similar to the existing put method with fill
  followed by flush if overfilled.
  This also made rte_mempool_do_generic_put_split() obsolete.
* When flushing the cache as much as we can, use burst size aligned with
  either CPU cache line size or mempool cache size.
v7:
* Increased max mempool cache size from 512 to 1024 objects.
  Mainly for CI performance test purposes.
  Originally, the max mempool cache size was 768 objects, and used a fixed
  size array of 1024 objects in the mempool cache structure.
v6:
* Fix v5 incomplete implementation of passing large requests directly to
  the backend.
* Use memcpy instead of rte_memcpy where compiler complains about it.
* Added const to some function parameters.
v5:
* Moved helper functions back into the header file, for improved
  performance.
* Pass large requests directly to the backend. This also simplifies the
  code.
v4:
* Updated subject to reflect that misleading names are considered bugs.
* Rewrote patch description to provide more details about the bugs fixed.
  (Mattias Rönnblom)
* Moved helper functions, not to be inlined, to mempool C file.
  (Mattias Rönnblom)
* Pass requests for n >= RTE_MEMPOOL_CACHE_MAX_SIZE objects known at build
  time directly to backend driver, to avoid calling the helper functions.
  This also fixes the compiler warnings about out of bounds array access.
v3:
* Removed __attribute__(assume).
v2:
* Removed mempool perf test; not part of patch set.
---
 lib/mempool/rte_mempool.c |  5 +-
 lib/mempool/rte_mempool.h | 98 +++++++++++++++------------------------
 2 files changed, 40 insertions(+), 63 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 1e4f24783c..cddc896442 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -50,10 +50,9 @@ static void
 mempool_event_callback_invoke(enum rte_mempool_event event,
 			      struct rte_mempool *mp);
 
-/* Note: avoid using floating point since that compiler
- * may not think that is constant.
+/* Note: This is no longer 1.5 * size, but simply 1 * size.
  */
-#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2)
+#define CALC_CACHE_FLUSHTHRESH(c) (c)
 
 #if defined(RTE_ARCH_X86)
 /*
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index c495cc012f..1200301ae9 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -791,7 +791,7 @@ rte_mempool_ops_dequeue_bulk(struct rte_mempool *mp,
 	rte_mempool_trace_ops_dequeue_bulk(mp, obj_table, n);
 	ops = rte_mempool_get_ops(mp->ops_index);
 	ret = ops->dequeue(mp, obj_table, n);
-	if (ret == 0) {
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_objs, n);
 	}
@@ -1044,7 +1044,7 @@ rte_mempool_free(struct rte_mempool *mp);
  *   If cache_size is non-zero, the rte_mempool library will try to
  *   limit the accesses to the common lockless pool, by maintaining a
  *   per-lcore object cache. This argument must be lower or equal to
- *   RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
+ *   RTE_MEMPOOL_CACHE_MAX_SIZE and n. It is advised to choose
  *   cache_size to have "n modulo cache_size == 0": if this is
  *   not the case, some elements will always stay in the pool and will
  *   never be used. The access to the per-lcore table is of course
@@ -1333,10 +1333,10 @@ rte_mempool_cache_free(struct rte_mempool_cache *cache);
 static __rte_always_inline struct rte_mempool_cache *
 rte_mempool_default_cache(struct rte_mempool *mp, unsigned lcore_id)
 {
-	if (mp->cache_size == 0)
+	if (unlikely(mp->cache_size == 0))
 		return NULL;
 
-	if (lcore_id >= RTE_MAX_LCORE)
+	if (unlikely(lcore_id >= RTE_MAX_LCORE))
 		return NULL;
 
 	rte_mempool_trace_default_cache(mp, lcore_id,
@@ -1383,32 +1383,30 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
 {
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL))
 		goto driver_enqueue;
 
-	/* increment stat now, adding in mempool always success */
+	/* Increment stats now, adding in mempool always succeeds. */
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-	/* The request itself is too big for the cache */
-	if (unlikely(n > cache->flushthresh))
-		goto driver_enqueue_stats_incremented;
-
-	/*
-	 * The cache follows the following algorithm:
-	 *   1. If the objects cannot be added to the cache without crossing
-	 *      the flush threshold, flush the cache to the backend.
-	 *   2. Add the objects to the cache.
-	 */
-
-	if (cache->len + n <= cache->flushthresh) {
+	if (likely(cache->len + n <= cache->size)) {
+		/* Sufficient room in the cache for the objects. */
 		cache_objs = &cache->objs[cache->len];
 		cache->len += n;
-	} else {
+	} else if (n <= cache->size) {
+		/*
+		 * The cache is big enough for the objects, but - as detected by
+		 * the comparison above - has insufficient room for them.
+		 * Flush the cache to make room for the objects.
+		 */
 		cache_objs = &cache->objs[0];
 		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
 		cache->len = n;
+	} else {
+		/* The request itself is too big for the cache. */
+		goto driver_enqueue_stats_incremented;
 	}
 
 	/* Add the objects to the cache. */
@@ -1512,10 +1510,10 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	unsigned int remaining;
-	uint32_t index, len;
+	uint32_t index;
 	void **cache_objs;
 
-	/* No cache provided */
+	/* No cache provided? */
 	if (unlikely(cache == NULL)) {
 		remaining = n;
 		goto driver_dequeue;
@@ -1524,11 +1522,11 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* The cache is a stack, so copy will be in reverse order. */
 	cache_objs = &cache->objs[cache->len];
 
-	if (__rte_constant(n) && n <= cache->len) {
+	if (likely(n <= cache->len)) {
 		/*
-		 * The request size is known at build time, and
-		 * the entire request can be satisfied from the cache,
-		 * so let the compiler unroll the fixed length copy loop.
+		 * The entire request can be satisfied from the cache.
+		 * Note: If the request size is known at build time,
+		 * the compiler will unroll the fixed length copy loop.
 		 */
 		cache->len -= n;
 		for (index = 0; index < n; index++)
@@ -1540,55 +1538,35 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 		return 0;
 	}
 
-	/*
-	 * Use the cache as much as we have to return hot objects first.
-	 * If the request size 'n' is known at build time, the above comparison
-	 * ensures that n > cache->len here, so omit RTE_MIN().
-	 */
-	len = __rte_constant(n) ? cache->len : RTE_MIN(n, cache->len);
-	cache->len -= len;
-	remaining = n - len;
-	for (index = 0; index < len; index++)
+	/* Use the cache as much as we have to return hot objects first. */
+	for (index = 0; index < cache->len; index++)
 		*obj_table++ = *--cache_objs;
+	remaining = n - cache->len;
+	cache->len = 0;
 
-	/*
-	 * If the request size 'n' is known at build time, the case
-	 * where the entire request can be satisfied from the cache
-	 * has already been handled above, so omit handling it here.
-	 */
-	if (!__rte_constant(n) && remaining == 0) {
-		/* The entire request is satisfied from the cache. */
-
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
-		RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
-
-		return 0;
-	}
-
-	/* if dequeue below would overflow mem allocated for cache */
-	if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* The remaining request is too big for the cache? */
+	if (unlikely(remaining > cache->size))
 		goto driver_dequeue;
 
-	/* Fill the cache from the backend; fetch size + remaining objects. */
+	/* Fill the cache from the backend; fetch remaining objects + size / 2. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs,
-			cache->size + remaining);
+			remaining + cache->size / 2);
 	if (unlikely(ret < 0)) {
 		/*
-		 * We are buffer constrained, and not able to allocate
-		 * cache + remaining.
+		 * We are buffer constrained, and not able to fetch all that.
 		 * Do not fill the cache, just satisfy the remaining part of
 		 * the request directly from the backend.
 		 */
 		goto driver_dequeue;
 	}
 
+	cache->len = cache->size / 2;
+
 	/* Satisfy the remaining part of the request from the filled cache. */
-	cache_objs = &cache->objs[cache->size + remaining];
+	cache_objs = &cache->objs[cache->len + remaining];
 	for (index = 0; index < remaining; index++)
 		*obj_table++ = *--cache_objs;
 
-	cache->len = cache->size;
-
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
 	RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
 
@@ -1599,7 +1577,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
 	/* Get remaining objects directly from the backend. */
 	ret = rte_mempool_ops_dequeue_bulk(mp, obj_table, remaining);
 
-	if (ret < 0) {
+	if (unlikely(ret < 0)) {
 		if (likely(cache != NULL)) {
 			cache->len = n - remaining;
 			/*
@@ -1650,7 +1628,7 @@ rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table,
 {
 	int ret;
 	ret = rte_mempool_do_generic_get(mp, obj_table, n, cache);
-	if (ret == 0)
+	if (likely(ret == 0))
 		RTE_MEMPOOL_CHECK_COOKIES(mp, obj_table, n, 1);
 	rte_mempool_trace_generic_get(mp, obj_table, n, cache);
 	return ret;
@@ -1741,7 +1719,7 @@ rte_mempool_get_contig_blocks(struct rte_mempool *mp,
 	int ret;
 
 	ret = rte_mempool_ops_dequeue_contig_blocks(mp, first_obj_table, n);
-	if (ret == 0) {
+	if (likely(ret == 0)) {
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
 		RTE_MEMPOOL_STAT_ADD(mp, get_success_blks, n);
 		RTE_MEMPOOL_CONTIG_BLOCKS_CHECK_COOKIES(mp, first_obj_table, n,
-- 
2.43.0


^ permalink raw reply	[relevance 4%]

* RE: [PATCH v22 00/27] remove use of VLAs for Windows
  2025-02-18 14:22  0%       ` David Marchand
@ 2025-02-19 14:28  0%         ` Konstantin Ananyev
  0 siblings, 0 replies; 200+ results
From: Konstantin Ananyev @ 2025-02-19 14:28 UTC (permalink / raw)
  To: David Marchand; +Cc: Andre Muezerie, dev, thomas, honnappa.nagarahalli


> > > > As per guidance technical board meeting 2024/04/17. This series
> > > > removes the use of VLAs from code built for Windows for all 3
> > > > toolchains. If there are additional opportunities to convert VLAs
> > > > to regular C arrays please provide the details for incorporation
> > > > into the series.
> > > >
> > > > MSVC does not support VLAs, replace VLAs with standard C arrays
> > > > or alloca(). alloca() is available for all toolchain/platform
> > > > combinations officially supported by DPDK.
> > >
> > > - I have one concern wrt patch 7.
> > > This changes the API/ABI of the RCU library.
> > > ABI can't be broken in the 25.03 release.
> > >
> > > Since MSVC builds do not include RCU yet, I skipped this change and
> > > adjusted this libray meson.build.
> > >
> > > Konstantin, do you think patch 7 could be rewritten to make use of
> > > alloca() and avoid an API change?
> > > https://patchwork.dpdk.org/project/dpdk/patch/1738805610-17507-8-git-send-email-andremue@linux.microsoft.com/
> >
> > I am not a big fan of the alloca() approach, but yes, it is surely possible.
> 
> Can you please explain your reluctance?

Mostly conceptual: using alloca() doesn't really differ from simply using a VLA;
in fact, it makes the code look uglier.
I understand that we do want MSVC enabled, and in many cases such a mechanical
replacement is OK, but it is probably better to avoid it whenever possible.

> 
> 
> > BTW, why is it considered an API/ABI change?
> > Because we introduce an extra limit on the max allowable size?
> 
> Yes, this is what was mentioned in the commit log.
> 
> 
> > If that would help somehow, we can make it even bigger: 1K or so.
> 
> Strictly speaking, it is still an API change.
 
OK, then I suppose we have 3 options:
1) wait for 25.11 to apply these changes
2) use alloca()
3) come up with some smarter approach.

For 3) - I don't have any good ideas right now.
One option would be to allow ring_enqueue/ring_dequeue to accept custom copy_elem() functions as a parameter.
That would solve the issue, as in that case we wouldn't need to make a temp copy of the data on the stack,
but that's probably too big a change for such a small thing.
So I am OK with both 1) and 2).
In fact, it is probably possible to go with 2) for now, and then switch to 1) or 3) in 25.11.


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v22 00/27] remove use of VLAs for Windows
  2025-02-07 14:23  3%     ` Konstantin Ananyev
@ 2025-02-18 14:22  0%       ` David Marchand
  2025-02-19 14:28  0%         ` Konstantin Ananyev
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2025-02-18 14:22 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: Andre Muezerie, dev, thomas, honnappa.nagarahalli

On Fri, Feb 7, 2025 at 3:23 PM Konstantin Ananyev
<konstantin.ananyev@huawei.com> wrote:
> > > As per guidance technical board meeting 2024/04/17. This series
> > > removes the use of VLAs from code built for Windows for all 3
> > > toolchains. If there are additional opportunities to convert VLAs
> > > to regular C arrays please provide the details for incorporation
> > > into the series.
> > >
> > > MSVC does not support VLAs, replace VLAs with standard C arrays
> > > or alloca(). alloca() is available for all toolchain/platform
> > > combinations officially supported by DPDK.
> >
> > - I have one concern wrt patch 7.
> > This changes the API/ABI of the RCU library.
> > ABI can't be broken in the 25.03 release.
> >
> > Since MSVC builds do not include RCU yet, I skipped this change and
> > adjusted this library's meson.build.
> >
> > Konstantin, do you think patch 7 could be rewritten to make use of
> > alloca() and avoid an API change?
> > https://patchwork.dpdk.org/project/dpdk/patch/1738805610-17507-8-git-send-email-andremue@linux.microsoft.com/
>
> I am not a big fan of the alloca() approach, but yes, it is surely possible.

Can you please explain your reluctance?


> BTW, why is it considered an API/ABI change?
> Because we introduce an extra limit on the max allowable size?

Yes, this is what was mentioned in the commit log.


> If that would help somehow, we can make it even bigger: 1K or so.

Strictly speaking, it is still an API change.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v5 02/11] eal: add new secure free function
  @ 2025-02-12  6:46  3%       ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2025-02-12  6:46 UTC (permalink / raw)
  To: fengchengwen; +Cc: dev, Anatoly Burakov, Tyler Retzlaff

On Wed, 12 Feb 2025 10:01:13 +0800
fengchengwen <fengchengwen@huawei.com> wrote:

> On 2025/2/12 1:35, Stephen Hemminger wrote:
> > Although internally rte_free does poison the buffer in most
> > cases, it is useful to have function that explicitly does
> > this to avoid any security issues.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> >  lib/eal/common/rte_malloc.c  | 30 ++++++++++++++++++++++++------
> >  lib/eal/include/rte_malloc.h | 18 ++++++++++++++++++
> >  lib/eal/version.map          |  3 +++
> >  3 files changed, 45 insertions(+), 6 deletions(-)
> > 
> > diff --git a/lib/eal/common/rte_malloc.c b/lib/eal/common/rte_malloc.c
> > index 3eed4d4be6..c9e0f4724f 100644
> > --- a/lib/eal/common/rte_malloc.c
> > +++ b/lib/eal/common/rte_malloc.c
> > @@ -15,6 +15,7 @@
> >  #include <rte_eal_memconfig.h>
> >  #include <rte_common.h>
> >  #include <rte_spinlock.h>
> > +#include <rte_string_fns.h>
> >  
> >  #include <eal_trace_internal.h>
> >  
> > @@ -27,27 +28,44 @@
> >  
> >  
> >  /* Free the memory space back to heap */
> > -static void
> > -mem_free(void *addr, const bool trace_ena)
> > +static inline void
> > +mem_free(void *addr, const bool trace_ena, bool zero)
> >  {
> > +	struct malloc_elem *elem;
> > +
> >  	if (trace_ena)
> >  		rte_eal_trace_mem_free(addr);
> >  
> > -	if (addr == NULL) return;
> > -	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
> > +	if (addr == NULL)
> > +		return;
> > +
> > +	elem = malloc_elem_from_data(addr);
> > +	if (zero) {
> > +		size_t data_len = elem->size - MALLOC_ELEM_OVERHEAD;  
> 
> this will make rte_malloc aware of the layout of malloc-elem.
> I would prefer to add an extra parameter, e.g. malloc_heap_free(elem, bool zero)

I don't understand; these are functions inside the rte_malloc implementation file.
The layout of malloc_elem is already known here, and nothing visible in the API or ABI is changing.
>   
> > +
> > +/**
> > + * Frees the memory space pointed to by the provided pointer
> > + * and guarantees it will be zero'd before reuse.
> > + *
> > + * This pointer must have been returned by a previous call to
> > + * rte_malloc(), rte_zmalloc(), rte_calloc() or rte_realloc(). The behaviour of
> > + * rte_free() is undefined if the pointer does not match this requirement.  
> 
> Suggest adding a notice: the value may be cleared twice, which affects performance.

That could easily change with a little work; this is only for crypto keys,
so performance doesn't matter.

> > + *
> > + * If the pointer is NULL, the function does nothing.
> > + *
> > + * @param ptr
> > + *   The pointer to memory to be freed.
> > + */
> > +__rte_experimental
> > +void
> > +rte_free_sensitive(void *ptr);  
> 
> one line is OK.
> void rte_free_sensitive(void *ptr);

Yes, it could be on one line, and the more compact form is my preferred style.
But other functions in this file and the DPDK style guide say it should be on its own line.





^ permalink raw reply	[relevance 3%]

* RE: [PATCH v5 3/4] drivers: move iavf common folder to iavf net
  2025-02-10 16:44  2%   ` [PATCH v5 3/4] drivers: move iavf common folder to iavf net Bruce Richardson
@ 2025-02-11 14:12  0%     ` Stokes, Ian
  0 siblings, 0 replies; 200+ results
From: Stokes, Ian @ 2025-02-11 14:12 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Richardson, Bruce

> The common/iavf driver folder contains the base code for the iavf
> driver, which is also linked against by the ice driver and others.
> However, there is no need for this to be in common, and we can
> move it to the net/intel/iavf as a base code driver. This involves
> updating dependencies that were on common/iavf to net/iavf
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  devtools/libabigail.abignore                       |  1 +
>  doc/guides/rel_notes/release_25_03.rst             |  5 ++++-
>  drivers/common/iavf/version.map                    | 13 -------------
>  drivers/common/meson.build                         |  1 -
>  .../{common/iavf => net/intel/iavf/base}/README    |  0
>  .../iavf => net/intel/iavf/base}/iavf_adminq.c     |  0
>  .../iavf => net/intel/iavf/base}/iavf_adminq.h     |  0
>  .../iavf => net/intel/iavf/base}/iavf_adminq_cmd.h |  0
>  .../iavf => net/intel/iavf/base}/iavf_alloc.h      |  0
>  .../iavf => net/intel/iavf/base}/iavf_common.c     |  0
>  .../iavf => net/intel/iavf/base}/iavf_devids.h     |  0
>  .../iavf => net/intel/iavf/base}/iavf_impl.c       |  0
>  .../iavf => net/intel/iavf/base}/iavf_osdep.h      |  0
>  .../iavf => net/intel/iavf/base}/iavf_prototype.h  |  0
>  .../iavf => net/intel/iavf/base}/iavf_register.h   |  0
>  .../iavf => net/intel/iavf/base}/iavf_status.h     |  0
>  .../iavf => net/intel/iavf/base}/iavf_type.h       |  0
>  .../iavf => net/intel/iavf/base}/meson.build       |  0
>  .../iavf => net/intel/iavf/base}/virtchnl.h        |  0
>  .../intel/iavf/base}/virtchnl_inline_ipsec.h       |  0
>  drivers/net/intel/iavf/meson.build                 | 13 +++++++++----
>  drivers/net/intel/iavf/version.map                 | 14 ++++++++++++++
>  drivers/net/intel/ice/meson.build                  |  7 +++----
>  drivers/net/intel/idpf/meson.build                 |  2 +-
>  24 files changed, 32 insertions(+), 24 deletions(-)
>  delete mode 100644 drivers/common/iavf/version.map
>  rename drivers/{common/iavf => net/intel/iavf/base}/README (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq.c (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq_cmd.h
> (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_alloc.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_common.c
> (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_devids.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_impl.c (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_osdep.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_prototype.h
> (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_register.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_status.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/iavf_type.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/meson.build (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/virtchnl.h (100%)
>  rename drivers/{common/iavf => net/intel/iavf/base}/virtchnl_inline_ipsec.h
> (100%)
> 
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index b7daca4841..ce501632b3 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -23,6 +23,7 @@
>  ; This is not a libabigail rule (see check-abi.sh).
>  ; This is used for driver removal and other special cases like mlx glue libs.
>  ;
> +; SKIP_LIBRARY=librte_common_iavf
>  ; SKIP_LIBRARY=librte_common_idpf
>  ; SKIP_LIBRARY=librte_common_mlx5_glue
>  ; SKIP_LIBRARY=librte_net_mlx4_glue
> diff --git a/doc/guides/rel_notes/release_25_03.rst
> b/doc/guides/rel_notes/release_25_03.rst
> index 2338a97e76..d2e8b03107 100644
> --- a/doc/guides/rel_notes/release_25_03.rst
> +++ b/doc/guides/rel_notes/release_25_03.rst
> @@ -182,9 +182,12 @@ API Changes
>    ``-Denable_drivers=net/intel/e1000``.
> 
>  * The driver ``common/idpf`` has been merged into the ``net/intel/idpf``
> driver.
> -  This change should have no impact to end applications, but,
> +  Similarly, the ``common/iavf`` driver has been merged into the
> ``net/intel/iavf`` driver.
> +  These changes should have no impact to end applications, but,
>    when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-
> Denable_drivers`` option,
>    there is no longer any need to also specify the ``common/idpf`` driver.
> +  In the same way, when specifying the ``iavf`` or ``ice`` net drivers,
> +  there is no need to also specify the ``common/iavf`` driver.
>    Note, however, ``net/intel/cpfl`` driver now depends upon the
> ``net/intel/idpf`` driver.
> 
> 
> diff --git a/drivers/common/iavf/version.map
> b/drivers/common/iavf/version.map
> deleted file mode 100644
> index 6c1427cca4..0000000000
> --- a/drivers/common/iavf/version.map
> +++ /dev/null
> @@ -1,13 +0,0 @@
> -INTERNAL {
> -	global:
> -
> -	iavf_aq_send_msg_to_pf;
> -	iavf_clean_arq_element;
> -	iavf_init_adminq;
> -	iavf_set_mac_type;
> -	iavf_shutdown_adminq;
> -	iavf_vf_parse_hw_config;
> -	iavf_vf_reset;
> -
> -	local: *;
> -};
> diff --git a/drivers/common/meson.build b/drivers/common/meson.build
> index e1e3149d8f..dc096aab0a 100644
> --- a/drivers/common/meson.build
> +++ b/drivers/common/meson.build
> @@ -5,7 +5,6 @@ std_deps = ['eal']
>  drivers = [
>          'cpt',
>          'dpaax',
> -        'iavf',
>          'ionic',
>          'mvep',
>          'octeontx',
> diff --git a/drivers/common/iavf/README
> b/drivers/net/intel/iavf/base/README
> similarity index 100%
> rename from drivers/common/iavf/README
> rename to drivers/net/intel/iavf/base/README
> diff --git a/drivers/common/iavf/iavf_adminq.c
> b/drivers/net/intel/iavf/base/iavf_adminq.c
> similarity index 100%
> rename from drivers/common/iavf/iavf_adminq.c
> rename to drivers/net/intel/iavf/base/iavf_adminq.c
> diff --git a/drivers/common/iavf/iavf_adminq.h
> b/drivers/net/intel/iavf/base/iavf_adminq.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_adminq.h
> rename to drivers/net/intel/iavf/base/iavf_adminq.h
> diff --git a/drivers/common/iavf/iavf_adminq_cmd.h
> b/drivers/net/intel/iavf/base/iavf_adminq_cmd.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_adminq_cmd.h
> rename to drivers/net/intel/iavf/base/iavf_adminq_cmd.h
> diff --git a/drivers/common/iavf/iavf_alloc.h
> b/drivers/net/intel/iavf/base/iavf_alloc.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_alloc.h
> rename to drivers/net/intel/iavf/base/iavf_alloc.h
> diff --git a/drivers/common/iavf/iavf_common.c
> b/drivers/net/intel/iavf/base/iavf_common.c
> similarity index 100%
> rename from drivers/common/iavf/iavf_common.c
> rename to drivers/net/intel/iavf/base/iavf_common.c
> diff --git a/drivers/common/iavf/iavf_devids.h
> b/drivers/net/intel/iavf/base/iavf_devids.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_devids.h
> rename to drivers/net/intel/iavf/base/iavf_devids.h
> diff --git a/drivers/common/iavf/iavf_impl.c
> b/drivers/net/intel/iavf/base/iavf_impl.c
> similarity index 100%
> rename from drivers/common/iavf/iavf_impl.c
> rename to drivers/net/intel/iavf/base/iavf_impl.c
> diff --git a/drivers/common/iavf/iavf_osdep.h
> b/drivers/net/intel/iavf/base/iavf_osdep.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_osdep.h
> rename to drivers/net/intel/iavf/base/iavf_osdep.h
> diff --git a/drivers/common/iavf/iavf_prototype.h
> b/drivers/net/intel/iavf/base/iavf_prototype.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_prototype.h
> rename to drivers/net/intel/iavf/base/iavf_prototype.h
> diff --git a/drivers/common/iavf/iavf_register.h
> b/drivers/net/intel/iavf/base/iavf_register.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_register.h
> rename to drivers/net/intel/iavf/base/iavf_register.h
> diff --git a/drivers/common/iavf/iavf_status.h
> b/drivers/net/intel/iavf/base/iavf_status.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_status.h
> rename to drivers/net/intel/iavf/base/iavf_status.h
> diff --git a/drivers/common/iavf/iavf_type.h
> b/drivers/net/intel/iavf/base/iavf_type.h
> similarity index 100%
> rename from drivers/common/iavf/iavf_type.h
> rename to drivers/net/intel/iavf/base/iavf_type.h
> diff --git a/drivers/common/iavf/meson.build
> b/drivers/net/intel/iavf/base/meson.build
> similarity index 100%
> rename from drivers/common/iavf/meson.build
> rename to drivers/net/intel/iavf/base/meson.build
> diff --git a/drivers/common/iavf/virtchnl.h
> b/drivers/net/intel/iavf/base/virtchnl.h
> similarity index 100%
> rename from drivers/common/iavf/virtchnl.h
> rename to drivers/net/intel/iavf/base/virtchnl.h
> diff --git a/drivers/common/iavf/virtchnl_inline_ipsec.h
> b/drivers/net/intel/iavf/base/virtchnl_inline_ipsec.h
> similarity index 100%
> rename from drivers/common/iavf/virtchnl_inline_ipsec.h
> rename to drivers/net/intel/iavf/base/virtchnl_inline_ipsec.h
> diff --git a/drivers/net/intel/iavf/meson.build
> b/drivers/net/intel/iavf/meson.build
> index d9b605f55a..c823d618e3 100644
> --- a/drivers/net/intel/iavf/meson.build
> +++ b/drivers/net/intel/iavf/meson.build
> @@ -7,9 +7,13 @@ endif
> 
>  testpmd_sources = files('iavf_testpmd.c')
> 
> -deps += ['common_iavf', 'security', 'cryptodev']
> +deps += ['security', 'cryptodev']
> 
>  sources = files(
> +        'base/iavf_adminq.c',
> +        'base/iavf_common.c',
> +        'base/iavf_impl.c',
> +
>          'iavf_ethdev.c',
>          'iavf_rxtx.c',
>          'iavf_vchnl.c',
> @@ -20,8 +24,9 @@ sources = files(
>          'iavf_ipsec_crypto.c',
>          'iavf_fsub.c',
>  )
> +includes += include_directories('base')
> 
> -if arch_subdir == 'x86' and is_variable('static_rte_common_iavf')
> +if arch_subdir == 'x86'
>      sources += files('iavf_rxtx_vec_sse.c')
> 
>      if is_windows and cc.get_id() != 'clang'
> @@ -30,7 +35,7 @@ if arch_subdir == 'x86' and
> is_variable('static_rte_common_iavf')
> 
>      iavf_avx2_lib = static_library('iavf_avx2_lib',
>              'iavf_rxtx_vec_avx2.c',
> -            dependencies: [static_rte_ethdev, static_rte_common_iavf],
> +            dependencies: [static_rte_ethdev],
>              include_directories: includes,
>              c_args: [cflags, '-mavx2'])
>      objs += iavf_avx2_lib.extract_objects('iavf_rxtx_vec_avx2.c')
> @@ -43,7 +48,7 @@ if arch_subdir == 'x86' and
> is_variable('static_rte_common_iavf')
>          endif
>          iavf_avx512_lib = static_library('iavf_avx512_lib',
>                  'iavf_rxtx_vec_avx512.c',
> -                dependencies: [static_rte_ethdev, static_rte_common_iavf],
> +                dependencies: [static_rte_ethdev],
>                  include_directories: includes,
>                  c_args: avx512_args)
>          objs += iavf_avx512_lib.extract_objects('iavf_rxtx_vec_avx512.c')
> diff --git a/drivers/net/intel/iavf/version.map
> b/drivers/net/intel/iavf/version.map
> index 98de64cca2..d18dea64dd 100644
> --- a/drivers/net/intel/iavf/version.map
> +++ b/drivers/net/intel/iavf/version.map
> @@ -17,3 +17,17 @@ EXPERIMENTAL {
>  	# added in 21.11
>  	rte_pmd_ifd_dynflag_proto_xtr_ipsec_crypto_said_mask;
>  };
> +
> +INTERNAL {
> +	global:
> +
> +	iavf_aq_send_msg_to_pf;
> +	iavf_clean_arq_element;
> +	iavf_init_adminq;
> +	iavf_set_mac_type;
> +	iavf_shutdown_adminq;
> +	iavf_vf_parse_hw_config;
> +	iavf_vf_reset;
> +
> +	local: *;
> +};
> diff --git a/drivers/net/intel/ice/meson.build
> b/drivers/net/intel/ice/meson.build
> index beaf21e176..5faf887386 100644
> --- a/drivers/net/intel/ice/meson.build
> +++ b/drivers/net/intel/ice/meson.build
> @@ -18,7 +18,7 @@ sources = files(
> 
>  testpmd_sources = files('ice_testpmd.c')
> 
> -deps += ['hash', 'net', 'common_iavf']
> +deps += ['hash', 'net', 'net_iavf']
>  includes += include_directories('base')
> 
>  if arch_subdir == 'x86'
> @@ -30,7 +30,7 @@ if arch_subdir == 'x86'
> 
>      ice_avx2_lib = static_library('ice_avx2_lib',
>              'ice_rxtx_vec_avx2.c',
> -            dependencies: [static_rte_ethdev, static_rte_kvargs, static_rte_hash],
> +            dependencies: [static_rte_ethdev, static_rte_hash],
>              include_directories: includes,
>              c_args: [cflags, '-mavx2'])
>      objs += ice_avx2_lib.extract_objects('ice_rxtx_vec_avx2.c')
> @@ -43,8 +43,7 @@ if arch_subdir == 'x86'
>          endif
>          ice_avx512_lib = static_library('ice_avx512_lib',
>                  'ice_rxtx_vec_avx512.c',
> -                dependencies: [static_rte_ethdev,
> -                    static_rte_kvargs, static_rte_hash],
> +                dependencies: [static_rte_ethdev, static_rte_hash],
>                  include_directories: includes,
>                  c_args: avx512_args)
>          objs += ice_avx512_lib.extract_objects('ice_rxtx_vec_avx512.c')
> diff --git a/drivers/net/intel/idpf/meson.build
> b/drivers/net/intel/idpf/meson.build
> index 87bc39f76e..d69254484b 100644
> --- a/drivers/net/intel/idpf/meson.build
> +++ b/drivers/net/intel/idpf/meson.build
> @@ -7,7 +7,7 @@ if is_windows
>      subdir_done()
>  endif
> 
> -includes += include_directories('../../../common/iavf')
> +includes += include_directories('../iavf/base')
> 
>  sources = files(
>          'idpf_common_device.c',
> --
> 2.43.0

Patch looks good to me, Bruce.

Acked-by: Ian Stokes <ian.stokes@intel.com>


^ permalink raw reply	[relevance 0%]

* [PATCH v7] net: add thread-safe crc api
  2025-02-10 21:27  4%         ` [PATCH v6] " Arkadiusz Kusztal
  2025-02-11  6:23  0%           ` Stephen Hemminger
@ 2025-02-11  9:02  4%           ` Arkadiusz Kusztal
  1 sibling, 0 replies; 200+ results
From: Arkadiusz Kusztal @ 2025-02-11  9:02 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, kai.ji, brian.dooley, stephen, Arkadiusz Kusztal

The current net CRC API is not thread-safe. This patch solves
this by adding new, thread-safe API functions.
This API is also safe to use across multiple processes,
yet with limitations on max-simd-bitwidth, which will be checked only by
the process that created the CRC context; all other processes
(that did not create the context) will use the highest possible
SIMD extension that was built into the binary, but no higher than the one
requested by the CRC context.

Since the change of the API at this point is an ABI break,
these API symbols are versioned with the _26 suffix.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
v2:
- added multi-process safety
v3:
- made the crc context opaque
- versioned old APIs
v4:
- exported rte_net_crc_free symbol
v5:
- fixed unclear comments in release notes section
- aligned `fall-through` comments
v6:
- fixed typos and code formatting
- added entry to the nullfree.cocci script
- added malloc attributes
- reverted copyright changes
v7:
- made net_crc header self-contained

 app/test/test_crc.c                    | 167 ++++++++++---------------
 devtools/cocci/nullfree.cocci          |   3 +
 doc/guides/rel_notes/release_25_03.rst |   5 +
 drivers/crypto/qat/qat_sym.h           |   6 +-
 drivers/crypto/qat/qat_sym_session.c   |   8 ++
 drivers/crypto/qat/qat_sym_session.h   |   2 +
 lib/net/meson.build                    |   2 +
 lib/net/net_crc.h                      |  16 +++
 lib/net/rte_net_crc.c                  | 127 ++++++++++++++++++-
 lib/net/rte_net_crc.h                  |  39 ++++--
 lib/net/version.map                    |   6 +
 11 files changed, 268 insertions(+), 113 deletions(-)

diff --git a/app/test/test_crc.c b/app/test/test_crc.c
index b85fca35fe..f18eff7217 100644
--- a/app/test/test_crc.c
+++ b/app/test/test_crc.c
@@ -44,131 +44,100 @@ static const uint32_t crc32_vec_res = 0xb491aab4;
 static const uint32_t crc32_vec1_res = 0xac54d294;
 static const uint32_t crc32_vec2_res = 0xefaae02f;
 static const uint32_t crc16_vec_res = 0x6bec;
-static const uint16_t crc16_vec1_res = 0x8cdd;
-static const uint16_t crc16_vec2_res = 0xec5b;
+static const uint32_t crc16_vec1_res = 0x8cdd;
+static const uint32_t crc16_vec2_res = 0xec5b;
 
 static int
-crc_calc(const uint8_t *vec,
-	uint32_t vec_len,
-	enum rte_net_crc_type type)
+crc_all_algs(const char *desc, enum rte_net_crc_type type,
+	const uint8_t *data, int data_len, uint32_t res)
 {
-	/* compute CRC */
-	uint32_t ret = rte_net_crc_calc(vec, vec_len, type);
+	struct rte_net_crc *ctx;
+	uint32_t crc;
+	int ret = TEST_SUCCESS;
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SCALAR, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SCALAR\n", desc);
+		debug_hexdump(stdout, "SCALAR", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SSE42, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SSE42\n", desc);
+		debug_hexdump(stdout, "SSE", &crc, 4);
+		ret = TEST_FAILED;
+	}
 
-	/* dump data on console */
-	debug_hexdump(stdout, NULL, vec, vec_len);
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_AVX512, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s AVX512\n", desc);
+		debug_hexdump(stdout, "AVX512", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_NEON, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s NEON\n", desc);
+		debug_hexdump(stdout, "NEON", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
 
-	return  ret;
+	return ret;
 }
 
 static int
-test_crc_calc(void)
-{
+crc_autotest(void)
+{	uint8_t *test_data;
 	uint32_t i;
-	enum rte_net_crc_type type;
-	uint8_t *test_data;
-	uint32_t result;
-	int error;
+	int ret = TEST_SUCCESS;
 
 	/* 32-bit ethernet CRC: Test 1 */
-	type = RTE_NET_CRC32_ETH;
-
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc32_vec_res)
-		return -1;
+	ret = crc_all_algs("32-bit ethernet CRC: Test 1", RTE_NET_CRC32_ETH, crc_vec,
+		sizeof(crc_vec), crc32_vec_res);
 
 	/* 32-bit ethernet CRC: Test 2 */
 	test_data = rte_zmalloc(NULL, CRC32_VEC_LEN1, 0);
 	if (test_data == NULL)
 		return -7;
-
 	for (i = 0; i < CRC32_VEC_LEN1; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN1, type);
-	if (result != crc32_vec1_res) {
-		error = -2;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 2", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN1, crc32_vec1_res);
 
 	/* 32-bit ethernet CRC: Test 3 */
+	memset(test_data, 0, CRC32_VEC_LEN1);
 	for (i = 0; i < CRC32_VEC_LEN2; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN2, type);
-	if (result != crc32_vec2_res) {
-		error = -3;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 3", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN2, crc32_vec2_res);
 
 	/* 16-bit CCITT CRC:  Test 4 */
-	type = RTE_NET_CRC16_CCITT;
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc16_vec_res) {
-		error = -4;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 5 */
-	result = crc_calc(crc16_vec1, CRC16_VEC_LEN1, type);
-	if (result != crc16_vec1_res) {
-		error = -5;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 6 */
-	result = crc_calc(crc16_vec2, CRC16_VEC_LEN2, type);
-	if (result != crc16_vec2_res) {
-		error = -6;
-		goto fail;
-	}
-
-	rte_free(test_data);
-	return 0;
-
-fail:
-	rte_free(test_data);
-	return error;
-}
-
-static int
-test_crc(void)
-{
-	int ret;
-	/* set CRC scalar mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SCALAR);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (scalar): failed (%d)\n", ret);
-		return ret;
-	}
-	/* set CRC sse4.2 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SSE42);
+	crc_all_algs("16-bit CCITT CRC:  Test 4", RTE_NET_CRC16_CCITT, crc_vec,
+		sizeof(crc_vec), crc16_vec_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (x86_64_SSE4.2): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC avx512 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_AVX512);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (x86_64 AVX512): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC neon mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_NEON);
+	/* 16-bit CCITT CRC:  Test 5 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 5", RTE_NET_CRC16_CCITT, crc16_vec1,
+		CRC16_VEC_LEN1, crc16_vec1_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (arm64 neon pmull): failed (%d)\n", ret);
-		return ret;
-	}
+	/* 16-bit CCITT CRC:  Test 6 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 6", RTE_NET_CRC16_CCITT, crc16_vec2,
+		CRC16_VEC_LEN2, crc16_vec2_res);
 
-	return 0;
+	return ret;
 }
 
-REGISTER_FAST_TEST(crc_autotest, true, true, test_crc);
+REGISTER_FAST_TEST(crc_autotest, true, true, crc_autotest);
diff --git a/devtools/cocci/nullfree.cocci b/devtools/cocci/nullfree.cocci
index c0526a2a3f..e7417b69ff 100644
--- a/devtools/cocci/nullfree.cocci
+++ b/devtools/cocci/nullfree.cocci
@@ -138,4 +138,7 @@ expression E;
 |
 - if (E != NULL) BN_free(E);
 + BN_free(E);
+|
+- if (E != NULL) rte_net_crc_free(E);
++ rte_net_crc_free(E);
 )
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 3ef6f8f427..2693011b8f 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -165,6 +165,11 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* net: Changed the API for CRC calculation to be thread safe.
+  An opaque context argument was introduced to the net CRC API containing
> +  the algorithm type and length. This argument is added
> +  to ``rte_net_crc_calc``, ``rte_net_crc_set_alg`` and freed with ``rte_net_crc_free``.
> +  These functions are versioned to retain binary compatibility until the next LTS release.
 
 ABI Changes
 -----------
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f42336d7ed..849e047615 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -267,8 +267,7 @@ qat_crc_verify(struct qat_sym_session *ctx, struct rte_crypto_op *op)
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				crc_data_ofs);
 
-		crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 
 		if (crc != *(uint32_t *)(crc_data + crc_data_len))
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
@@ -291,8 +290,7 @@ qat_crc_generate(struct qat_sym_session *ctx,
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				sym_op->auth.data.offset);
 		crc = (uint32_t *)(crc_data + crc_data_len);
-		*crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		*crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 	}
 }
 
diff --git a/drivers/crypto/qat/qat_sym_session.c b/drivers/crypto/qat/qat_sym_session.c
index 50d687fd37..7200022adf 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -3174,6 +3174,14 @@ qat_sec_session_set_docsis_parameters(struct rte_cryptodev *dev,
 		ret = qat_sym_session_configure_crc(dev, xform, session);
 		if (ret < 0)
 			return ret;
+	} else {
+		/* Initialize crc algorithm */
+		session->crc = rte_net_crc_set_alg(RTE_NET_CRC_AVX512,
+			RTE_NET_CRC32_ETH);
+		if (session->crc == NULL) {
+			QAT_LOG(ERR, "Cannot initialize CRC context");
+			return -1;
+		}
 	}
 	qat_sym_session_finalize(session);
 
diff --git a/drivers/crypto/qat/qat_sym_session.h b/drivers/crypto/qat/qat_sym_session.h
index 2ca6c8ddf5..2ef2066646 100644
--- a/drivers/crypto/qat/qat_sym_session.h
+++ b/drivers/crypto/qat/qat_sym_session.h
@@ -7,6 +7,7 @@
 #include <rte_crypto.h>
 #include <cryptodev_pmd.h>
 #include <rte_security.h>
+#include <rte_net_crc.h>
 
 #include "qat_common.h"
 #include "icp_qat_hw.h"
@@ -149,6 +150,7 @@ struct qat_sym_session {
 	uint8_t is_zuc256;
 	uint8_t is_wireless;
 	uint32_t slice_types;
+	struct rte_net_crc *crc;
 	enum qat_sym_proto_flag qat_proto_flag;
 	qat_sym_build_request_t build_request[2];
 #ifndef RTE_QAT_OPENSSL
diff --git a/lib/net/meson.build b/lib/net/meson.build
index 8afcc4ed37..b26b377e8e 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2020 Intel Corporation
 
+use_function_versioning=true
+
 headers = files(
         'rte_cksum.h',
         'rte_ip.h',
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 7a74d5406c..a9a6c9542c 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -5,6 +5,22 @@
 #ifndef _NET_CRC_H_
 #define _NET_CRC_H_
 
+#include "rte_net_crc.h"
+
+void
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
+
+struct rte_net_crc *
+rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v25(const void *data,
+	uint32_t data_len, enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 346c285c15..2fb3eec231 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -10,6 +10,8 @@
 #include <rte_net_crc.h>
 #include <rte_log.h>
 #include <rte_vect.h>
+#include <rte_function_versioning.h>
+#include <rte_malloc.h>
 
 #include "net_crc.h"
 
@@ -38,11 +40,20 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
 typedef uint32_t
 (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
 
+struct rte_net_crc {
+	enum rte_net_crc_alg alg;
+	enum rte_net_crc_type type;
+};
+
 static rte_net_crc_handler handlers_default[] = {
 	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
 	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
 };
 
+static struct {
+	rte_net_crc_handler f[RTE_NET_CRC_REQS];
+} handlers_dpdk26[RTE_NET_CRC_AVX512 + 1];
+
 static const rte_net_crc_handler *handlers = handlers_default;
 
 static const rte_net_crc_handler handlers_scalar[] = {
@@ -286,10 +297,56 @@ rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
 	return handlers[RTE_NET_CRC32_ETH](data, data_len);
 }
 
+static void
+handlers_init(enum rte_net_crc_alg alg)
+{
+	handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler;
+	handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_avx512_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_avx512_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+		if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_sse42_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_sse42_handler;
+		}
+#endif
+		break;
+	case RTE_NET_CRC_NEON:
+#ifdef CC_ARM64_NEON_PMULL_SUPPORT
+		if (NEON_PMULL_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_neon_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_neon_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+}
+
 /* Public API */
 
 void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg)
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -316,9 +373,59 @@ rte_net_crc_set_alg(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
+VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+
+struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type)
+{
+	uint16_t max_simd_bitwidth;
+	struct rte_net_crc *crc;
+
+	crc = rte_zmalloc(NULL, sizeof(struct rte_net_crc), 0);
+	if (crc == NULL)
+		return NULL;
+	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
+	crc->type = type;
+	crc->alg = RTE_NET_CRC_SCALAR;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_512) {
+			crc->alg = RTE_NET_CRC_AVX512;
+			return crc;
+		}
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_SSE42;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_NEON:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_NEON;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+	return crc;
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
+MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
+	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
+	rte_net_crc_set_alg_v26);
+
+void rte_net_crc_free(struct rte_net_crc *crc)
+{
+	rte_free(crc);
+}
 
 uint32_t
-rte_net_crc_calc(const void *data,
+rte_net_crc_calc_v25(const void *data,
 	uint32_t data_len,
 	enum rte_net_crc_type type)
 {
@@ -330,6 +437,18 @@ rte_net_crc_calc(const void *data,
 
 	return ret;
 }
+VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len)
+{
+	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
+MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len),
+	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
@@ -338,4 +457,8 @@ RTE_INIT(rte_net_crc_init)
 	sse42_pclmulqdq_init();
 	avx512_vpclmulqdq_init();
 	neon_pmull_init();
+	handlers_init(RTE_NET_CRC_SCALAR);
+	handlers_init(RTE_NET_CRC_NEON);
+	handlers_init(RTE_NET_CRC_SSE42);
+	handlers_init(RTE_NET_CRC_AVX512);
 }
diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h
index 72d3e10ff6..3f350b3001 100644
--- a/lib/net/rte_net_crc.h
+++ b/lib/net/rte_net_crc.h
@@ -6,6 +6,7 @@
 #define _RTE_NET_CRC_H_
 
 #include <stdint.h>
+#include <rte_common.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -26,8 +27,21 @@ enum rte_net_crc_alg {
 	RTE_NET_CRC_AVX512,
 };
 
+/** CRC context (algorithm, type) */
+struct rte_net_crc;
+
 /**
- * This API set the CRC computation algorithm (i.e. scalar version,
+ * Frees the memory space pointed to by the CRC context pointer.
+ * If the pointer is NULL, the function does nothing.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
+ */
+void
+rte_net_crc_free(struct rte_net_crc *crc);
+
+/**
+ * This API set the CRC context (i.e. scalar version,
  * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
  * structure.
  *
@@ -37,27 +51,36 @@ enum rte_net_crc_alg {
  *   - RTE_NET_CRC_SSE42 (Use 64-bit SSE4.2 intrinsic)
  *   - RTE_NET_CRC_NEON (Use ARM Neon intrinsic)
  *   - RTE_NET_CRC_AVX512 (Use 512-bit AVX intrinsic)
+ * @param type
+ *   CRC type (enum rte_net_crc_type)
+ *
+ * @return
+ *   Pointer to the CRC context
  */
-void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg);
+struct rte_net_crc *
+rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type type)
+	__rte_malloc __rte_dealloc(rte_net_crc_free, 1);
 
 /**
  * CRC compute API
  *
+ * Note:
+ * The command line argument --force-max-simd-bitwidth will be ignored
+ * by processes that have not created this CRC context.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
  * @param data
  *   Pointer to the packet data for CRC computation
  * @param data_len
  *   Data length for CRC computation
- * @param type
- *   CRC type (enum rte_net_crc_type)
  *
  * @return
  *   CRC value
  */
 uint32_t
-rte_net_crc_calc(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type);
+rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 
 #ifdef __cplusplus
 }
diff --git a/lib/net/version.map b/lib/net/version.map
index bec4ce23ea..d03f3f6ad0 100644
--- a/lib/net/version.map
+++ b/lib/net/version.map
@@ -5,6 +5,7 @@ DPDK_25 {
 	rte_ether_format_addr;
 	rte_ether_unformat_addr;
 	rte_net_crc_calc;
+	rte_net_crc_free;
 	rte_net_crc_set_alg;
 	rte_net_get_ptype;
 	rte_net_make_rarp_packet;
@@ -12,3 +13,8 @@ DPDK_25 {
 
 	local: *;
 };
+
+DPDK_26 {
+	rte_net_crc_calc;
+	rte_net_crc_set_alg;
+} DPDK_25;
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* RE: [PATCH v6] net: add thread-safe crc api
  2025-02-11  6:23  0%           ` Stephen Hemminger
@ 2025-02-11  8:35  0%             ` Kusztal, ArkadiuszX
  0 siblings, 0 replies; 200+ results
From: Kusztal, ArkadiuszX @ 2025-02-11  8:35 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, ferruh.yigit, Ji, Kai, Dooley, Brian



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, February 11, 2025 7:24 AM
> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>
> Cc: dev@dpdk.org; ferruh.yigit@amd.com; Ji, Kai <kai.ji@intel.com>; Dooley,
> Brian <brian.dooley@intel.com>
> Subject: Re: [PATCH v6] net: add thread-safe crc api
> 
> On Mon, 10 Feb 2025 21:27:10 +0000
> Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com> wrote:
> 
> > The current net CRC API is not thread-safe, this patch solves this by
> > adding another, thread-safe API functions.
> > This API is also safe to use across multiple processes, yet with
> > limitations on max-simd-bitwidth, which will be checked only by the
> > process that created the CRC context; all other processes (that did
> > not create the context) will use the highest possible SIMD extension
> > that was built with the binary, but no higher than the one requested
> > by the CRC context.
> >
> > Since the change of the API at this point is an ABI break, these API
> > symbols are versioned with the _26 suffix.
> >
> > Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> > ---
> 
> Thanks for updating so quick, but the problem is you need to move the
> prototype for rte_net_crc_free() to get it to work.
The prototype of the `rte_net_crc_free` function was moved up in this patch, so it is visible. To me, the problem seems to be that `rte_net_crc.h` is not self-contained (due to the malloc-attribute addition), and the compiler treats the unknown attribute aliases as an actual function body. I will fix that in v7.
> 
> -------------------------------BEGIN LOGS----------------------------
> #################################################################
> ###################
> #### [Begin job log] "ubuntu-22.04-gcc-mini" at step Build and test
> #################################################################
> ###################
>       |         ^~~~~~~~~~~~
> /home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:61:53: error: expected ‘)’
> before numeric constant
>    61 |         __rte_malloc __rte_dealloc(rte_net_crc_free, 1);
>       |                                                     ^~
>       |                                                     )
> /home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:1: error: old-style
> parameter declarations in prototyped function definition
>    60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type
> type)
>       | ^~~~~~~~~~~~~~~~~~~
> buildtools/chkincs/chkincs.p/rte_net_crc.c:3: error: expected ‘{’ at end of input
> In file included from buildtools/chkincs/chkincs.p/rte_net_crc.c:1:
> /home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:42: error: unused
> parameter ‘alg’ [-Werror=unused-parameter]
>    60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type
> type)
>       |                     ~~~~~~~~~~~~~~~~~~~~~^~~
> /home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:69: error: unused
> parameter ‘type’ [-Werror=unused-parameter]
>    60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type
> type)
>       |                                               ~~~~~~~~~~~~~~~~~~~~~~^~~~
> buildtools/chkincs/chkincs.p/rte_net_crc.c:3: error: control reaches end of non-
> void function [-Werror=return-type]
> cc1: all warnings being treated as errors [629/2123] Compiling C object
> buildtools/chkincs/chkincs.p/meson-generated_rte_mpls.c.o
> [630/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-
> generated_rte_arp.c.o
> [631/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-
> generated_rte_ether.c.o
> [632/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-
> generated_rte_net.c.o

^ permalink raw reply	[relevance 0%]

* Re: [PATCH v6] net: add thread-safe crc api
  2025-02-10 21:27  4%         ` [PATCH v6] " Arkadiusz Kusztal
@ 2025-02-11  6:23  0%           ` Stephen Hemminger
  2025-02-11  8:35  0%             ` Kusztal, ArkadiuszX
  2025-02-11  9:02  4%           ` [PATCH v7] " Arkadiusz Kusztal
  1 sibling, 1 reply; 200+ results
From: Stephen Hemminger @ 2025-02-11  6:23 UTC (permalink / raw)
  To: Arkadiusz Kusztal; +Cc: dev, ferruh.yigit, kai.ji, brian.dooley

On Mon, 10 Feb 2025 21:27:10 +0000
Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com> wrote:

> The current net CRC API is not thread-safe; this patch
> solves this by adding new, thread-safe API functions.
> This API is also safe to use across multiple processes,
> yet with limitations on max-simd-bitwidth, which will be checked only by
> the process that created the CRC context; all other processes
> (that did not create the context) will use the highest possible
> SIMD extension that was built with the binary, but no higher than the one
> requested by the CRC context.
> 
> Since the change of the API at this point is an ABI break,
> these API symbols are versioned with the _26 suffix.
> 
> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> ---

Thanks for updating so quickly, but the problem is you need to move
the prototype for rte_net_crc_free() to get it to work.

-------------------------------BEGIN LOGS----------------------------
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-mini" at step Build and test
####################################################################################
      |         ^~~~~~~~~~~~
/home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:61:53: error: expected ‘)’ before numeric constant
   61 |         __rte_malloc __rte_dealloc(rte_net_crc_free, 1);
      |                                                     ^~
      |                                                     )
/home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:1: error: old-style parameter declarations in prototyped function definition
   60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type type)
      | ^~~~~~~~~~~~~~~~~~~
buildtools/chkincs/chkincs.p/rte_net_crc.c:3: error: expected ‘{’ at end of input
In file included from buildtools/chkincs/chkincs.p/rte_net_crc.c:1:
/home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:42: error: unused parameter ‘alg’ [-Werror=unused-parameter]
   60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type type)
      |                     ~~~~~~~~~~~~~~~~~~~~~^~~
/home/runner/work/dpdk/dpdk/lib/net/rte_net_crc.h:60:69: error: unused parameter ‘type’ [-Werror=unused-parameter]
   60 | rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type type)
      |                                               ~~~~~~~~~~~~~~~~~~~~~~^~~~
buildtools/chkincs/chkincs.p/rte_net_crc.c:3: error: control reaches end of non-void function [-Werror=return-type]
cc1: all warnings being treated as errors
[629/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_mpls.c.o
[630/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_arp.c.o
[631/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_ether.c.o
[632/2123] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_net.c.o

^ permalink raw reply	[relevance 0%]

* [PATCH v6] net: add thread-safe crc api
  2025-02-07 18:24  4%       ` [PATCH v5] " Arkadiusz Kusztal
@ 2025-02-10 21:27  4%         ` Arkadiusz Kusztal
  2025-02-11  6:23  0%           ` Stephen Hemminger
  2025-02-11  9:02  4%           ` [PATCH v7] " Arkadiusz Kusztal
  0 siblings, 2 replies; 200+ results
From: Arkadiusz Kusztal @ 2025-02-10 21:27 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, kai.ji, brian.dooley, stephen, Arkadiusz Kusztal

The current net CRC API is not thread-safe; this patch
solves this by adding new, thread-safe API functions.
This API is also safe to use across multiple processes,
yet with limitations on max-simd-bitwidth, which will be checked only by
the process that created the CRC context; all other processes
(that did not create the context) will use the highest possible
SIMD extension that was built with the binary, but no higher than the one
requested by the CRC context.

Since the change of the API at this point is an ABI break,
these API symbols are versioned with the _26 suffix.
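The design change can be illustrated with a minimal self-contained sketch (the `net_crc_ctx`, `net_crc_ctx_new` and `net_crc_calc` names are illustrative stand-ins, not DPDK symbols, and the scalar bit-by-bit CRC below is not the DPDK implementation): all per-call state lives in an immutable, caller-owned context instead of a process-global handler table, which is what makes concurrent calls safe.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/* Immutable per-caller context, analogous to struct rte_net_crc. */
struct net_crc_ctx {
	uint32_t poly;   /* reflected polynomial */
	uint32_t init;
	uint32_t xorout;
};

/* Analogue of rte_net_crc_set_alg(): allocate and fill the context. */
static struct net_crc_ctx *
net_crc_ctx_new(void)
{
	struct net_crc_ctx *c = malloc(sizeof(*c));

	if (c != NULL) {
		c->poly = 0xEDB88320u; /* CRC-32 Ethernet, reflected */
		c->init = 0xFFFFFFFFu;
		c->xorout = 0xFFFFFFFFu;
	}
	return c;
}

/* Analogue of rte_net_crc_calc(): the context is read-only here,
 * so any number of threads may share it without synchronization. */
static uint32_t
net_crc_calc(const struct net_crc_ctx *c, const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = c->init;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ ((crc & 1u) ? c->poly : 0u);
	}
	return crc ^ c->xorout;
}
```

The patched API has the same shape: `rte_net_crc_set_alg(alg, type)` returns the context, `rte_net_crc_calc(ctx, data, len)` computes over it, and `rte_net_crc_free(ctx)` releases it.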

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
v2:
- added multi-process safety
v3:
- made the crc context opaque
- versioned old APIs
v4:
- exported rte_net_crc_free symbol
v5:
- fixed unclear comments in release notes section
- aligned `fall-through` comments
v6:
- fixed typos and code formatting
- added entry to the nullfree.cocci script
- added malloc attributes
- reverted copyright changes

 app/test/test_crc.c                    | 167 ++++++++++---------------
 devtools/cocci/nullfree.cocci          |   3 +
 doc/guides/rel_notes/release_25_03.rst |   5 +
 drivers/crypto/qat/qat_sym.h           |   6 +-
 drivers/crypto/qat/qat_sym_session.c   |   8 ++
 drivers/crypto/qat/qat_sym_session.h   |   2 +
 lib/net/meson.build                    |   2 +
 lib/net/net_crc.h                      |  16 +++
 lib/net/rte_net_crc.c                  | 127 ++++++++++++++++++-
 lib/net/rte_net_crc.h                  |  38 ++++--
 lib/net/version.map                    |   6 +
 11 files changed, 267 insertions(+), 113 deletions(-)

diff --git a/app/test/test_crc.c b/app/test/test_crc.c
index b85fca35fe..f18eff7217 100644
--- a/app/test/test_crc.c
+++ b/app/test/test_crc.c
@@ -44,131 +44,100 @@ static const uint32_t crc32_vec_res = 0xb491aab4;
 static const uint32_t crc32_vec1_res = 0xac54d294;
 static const uint32_t crc32_vec2_res = 0xefaae02f;
 static const uint32_t crc16_vec_res = 0x6bec;
-static const uint16_t crc16_vec1_res = 0x8cdd;
-static const uint16_t crc16_vec2_res = 0xec5b;
+static const uint32_t crc16_vec1_res = 0x8cdd;
+static const uint32_t crc16_vec2_res = 0xec5b;
 
 static int
-crc_calc(const uint8_t *vec,
-	uint32_t vec_len,
-	enum rte_net_crc_type type)
+crc_all_algs(const char *desc, enum rte_net_crc_type type,
+	const uint8_t *data, int data_len, uint32_t res)
 {
-	/* compute CRC */
-	uint32_t ret = rte_net_crc_calc(vec, vec_len, type);
+	struct rte_net_crc *ctx;
+	uint32_t crc;
+	int ret = TEST_SUCCESS;
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SCALAR, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SCALAR\n", desc);
+		debug_hexdump(stdout, "SCALAR", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SSE42, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SSE42\n", desc);
+		debug_hexdump(stdout, "SSE", &crc, 4);
+		ret = TEST_FAILED;
+	}
 
-	/* dump data on console */
-	debug_hexdump(stdout, NULL, vec, vec_len);
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_AVX512, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s AVX512\n", desc);
+		debug_hexdump(stdout, "AVX512", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_NEON, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s NEON\n", desc);
+		debug_hexdump(stdout, "NEON", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
 
-	return  ret;
+	return ret;
 }
 
 static int
-test_crc_calc(void)
-{
+crc_autotest(void)
+{	uint8_t *test_data;
 	uint32_t i;
-	enum rte_net_crc_type type;
-	uint8_t *test_data;
-	uint32_t result;
-	int error;
+	int ret = TEST_SUCCESS;
 
 	/* 32-bit ethernet CRC: Test 1 */
-	type = RTE_NET_CRC32_ETH;
-
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc32_vec_res)
-		return -1;
+	ret = crc_all_algs("32-bit ethernet CRC: Test 1", RTE_NET_CRC32_ETH, crc_vec,
+		sizeof(crc_vec), crc32_vec_res);
 
 	/* 32-bit ethernet CRC: Test 2 */
 	test_data = rte_zmalloc(NULL, CRC32_VEC_LEN1, 0);
 	if (test_data == NULL)
 		return -7;
-
 	for (i = 0; i < CRC32_VEC_LEN1; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN1, type);
-	if (result != crc32_vec1_res) {
-		error = -2;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 2", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN1, crc32_vec1_res);
 
 	/* 32-bit ethernet CRC: Test 3 */
+	memset(test_data, 0, CRC32_VEC_LEN1);
 	for (i = 0; i < CRC32_VEC_LEN2; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN2, type);
-	if (result != crc32_vec2_res) {
-		error = -3;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 3", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN2, crc32_vec2_res);
 
 	/* 16-bit CCITT CRC:  Test 4 */
-	type = RTE_NET_CRC16_CCITT;
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc16_vec_res) {
-		error = -4;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 5 */
-	result = crc_calc(crc16_vec1, CRC16_VEC_LEN1, type);
-	if (result != crc16_vec1_res) {
-		error = -5;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 6 */
-	result = crc_calc(crc16_vec2, CRC16_VEC_LEN2, type);
-	if (result != crc16_vec2_res) {
-		error = -6;
-		goto fail;
-	}
-
-	rte_free(test_data);
-	return 0;
-
-fail:
-	rte_free(test_data);
-	return error;
-}
-
-static int
-test_crc(void)
-{
-	int ret;
-	/* set CRC scalar mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SCALAR);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (scalar): failed (%d)\n", ret);
-		return ret;
-	}
-	/* set CRC sse4.2 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SSE42);
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 4", RTE_NET_CRC16_CCITT, crc_vec,
+		sizeof(crc_vec), crc16_vec_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (x86_64_SSE4.2): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC avx512 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_AVX512);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (x86_64 AVX512): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC neon mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_NEON);
+	/* 16-bit CCITT CRC:  Test 5 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 5", RTE_NET_CRC16_CCITT, crc16_vec1,
+		CRC16_VEC_LEN1, crc16_vec1_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (arm64 neon pmull): failed (%d)\n", ret);
-		return ret;
-	}
+	/* 16-bit CCITT CRC:  Test 6 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 6", RTE_NET_CRC16_CCITT, crc16_vec2,
+		CRC16_VEC_LEN2, crc16_vec2_res);
 
-	return 0;
+	return ret;
 }
 
-REGISTER_FAST_TEST(crc_autotest, true, true, test_crc);
+REGISTER_FAST_TEST(crc_autotest, true, true, crc_autotest);
diff --git a/devtools/cocci/nullfree.cocci b/devtools/cocci/nullfree.cocci
index c0526a2a3f..e7417b69ff 100644
--- a/devtools/cocci/nullfree.cocci
+++ b/devtools/cocci/nullfree.cocci
@@ -138,4 +138,7 @@ expression E;
 |
 - if (E != NULL) BN_free(E);
 + BN_free(E);
+|
+- if (E != NULL) rte_net_crc_free(E);
++ rte_net_crc_free(E);
 )
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 2b139fc35b..424a0252cb 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -161,6 +161,11 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* net: Changed the API for CRC calculation to be thread-safe.
+  An opaque context argument was introduced to the net CRC API containing
+  the algorithm type and length. This argument is added
+  to ``rte_net_crc_calc`` and ``rte_net_crc_set_alg``, and freed with ``rte_net_crc_free``.
+  These functions are versioned to retain binary compatibility until the next LTS release.
 
 ABI Changes
 -----------
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f42336d7ed..849e047615 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -267,8 +267,7 @@ qat_crc_verify(struct qat_sym_session *ctx, struct rte_crypto_op *op)
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				crc_data_ofs);
 
-		crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 
 		if (crc != *(uint32_t *)(crc_data + crc_data_len))
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
@@ -291,8 +290,7 @@ qat_crc_generate(struct qat_sym_session *ctx,
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				sym_op->auth.data.offset);
 		crc = (uint32_t *)(crc_data + crc_data_len);
-		*crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		*crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 	}
 }
 
diff --git a/drivers/crypto/qat/qat_sym_session.c b/drivers/crypto/qat/qat_sym_session.c
index 50d687fd37..7200022adf 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -3174,6 +3174,14 @@ qat_sec_session_set_docsis_parameters(struct rte_cryptodev *dev,
 		ret = qat_sym_session_configure_crc(dev, xform, session);
 		if (ret < 0)
 			return ret;
+	} else {
+		/* Initialize crc algorithm */
+		session->crc = rte_net_crc_set_alg(RTE_NET_CRC_AVX512,
+			RTE_NET_CRC32_ETH);
+		if (session->crc == NULL) {
+			QAT_LOG(ERR, "Cannot initialize CRC context");
+			return -1;
+		}
 	}
 	qat_sym_session_finalize(session);
 
diff --git a/drivers/crypto/qat/qat_sym_session.h b/drivers/crypto/qat/qat_sym_session.h
index 2ca6c8ddf5..2ef2066646 100644
--- a/drivers/crypto/qat/qat_sym_session.h
+++ b/drivers/crypto/qat/qat_sym_session.h
@@ -7,6 +7,7 @@
 #include <rte_crypto.h>
 #include <cryptodev_pmd.h>
 #include <rte_security.h>
+#include <rte_net_crc.h>
 
 #include "qat_common.h"
 #include "icp_qat_hw.h"
@@ -149,6 +150,7 @@ struct qat_sym_session {
 	uint8_t is_zuc256;
 	uint8_t is_wireless;
 	uint32_t slice_types;
+	struct rte_net_crc *crc;
 	enum qat_sym_proto_flag qat_proto_flag;
 	qat_sym_build_request_t build_request[2];
 #ifndef RTE_QAT_OPENSSL
diff --git a/lib/net/meson.build b/lib/net/meson.build
index 8afcc4ed37..b26b377e8e 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2020 Intel Corporation
 
+use_function_versioning=true
+
 headers = files(
         'rte_cksum.h',
         'rte_ip.h',
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 7a74d5406c..a9a6c9542c 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -5,6 +5,22 @@
 #ifndef _NET_CRC_H_
 #define _NET_CRC_H_
 
+#include "rte_net_crc.h"
+
+void
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
+
+struct rte_net_crc *
+rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v25(const void *data,
+	uint32_t data_len, enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 346c285c15..2fb3eec231 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -10,6 +10,8 @@
 #include <rte_net_crc.h>
 #include <rte_log.h>
 #include <rte_vect.h>
+#include <rte_function_versioning.h>
+#include <rte_malloc.h>
 
 #include "net_crc.h"
 
@@ -38,11 +40,20 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
 typedef uint32_t
 (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
 
+struct rte_net_crc {
+	enum rte_net_crc_alg alg;
+	enum rte_net_crc_type type;
+};
+
 static rte_net_crc_handler handlers_default[] = {
 	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
 	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
 };
 
+static struct {
+	rte_net_crc_handler f[RTE_NET_CRC_REQS];
+} handlers_dpdk26[RTE_NET_CRC_AVX512 + 1];
+
 static const rte_net_crc_handler *handlers = handlers_default;
 
 static const rte_net_crc_handler handlers_scalar[] = {
@@ -286,10 +297,56 @@ rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
 	return handlers[RTE_NET_CRC32_ETH](data, data_len);
 }
 
+static void
+handlers_init(enum rte_net_crc_alg alg)
+{
+	handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler;
+	handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_avx512_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_avx512_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+		if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_sse42_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_sse42_handler;
+		}
+#endif
+		break;
+	case RTE_NET_CRC_NEON:
+#ifdef CC_ARM64_NEON_PMULL_SUPPORT
+		if (NEON_PMULL_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_neon_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_neon_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+}
+
 /* Public API */
 
 void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg)
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -316,9 +373,59 @@ rte_net_crc_set_alg(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
+VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+
+struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type)
+{
+	uint16_t max_simd_bitwidth;
+	struct rte_net_crc *crc;
+
+	crc = rte_zmalloc(NULL, sizeof(struct rte_net_crc), 0);
+	if (crc == NULL)
+		return NULL;
+	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
+	crc->type = type;
+	crc->alg = RTE_NET_CRC_SCALAR;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_512) {
+			crc->alg = RTE_NET_CRC_AVX512;
+			return crc;
+		}
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_SSE42;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_NEON:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_NEON;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+	return crc;
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
+MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
+	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
+	rte_net_crc_set_alg_v26);
+
+void rte_net_crc_free(struct rte_net_crc *crc)
+{
+	rte_free(crc);
+}
 
 uint32_t
-rte_net_crc_calc(const void *data,
+rte_net_crc_calc_v25(const void *data,
 	uint32_t data_len,
 	enum rte_net_crc_type type)
 {
@@ -330,6 +437,18 @@ rte_net_crc_calc(const void *data,
 
 	return ret;
 }
+VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len)
+{
+	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
+MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len),
+	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
@@ -338,4 +457,8 @@ RTE_INIT(rte_net_crc_init)
 	sse42_pclmulqdq_init();
 	avx512_vpclmulqdq_init();
 	neon_pmull_init();
+	handlers_init(RTE_NET_CRC_SCALAR);
+	handlers_init(RTE_NET_CRC_NEON);
+	handlers_init(RTE_NET_CRC_SSE42);
+	handlers_init(RTE_NET_CRC_AVX512);
 }
diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h
index 72d3e10ff6..6c14b8ab6c 100644
--- a/lib/net/rte_net_crc.h
+++ b/lib/net/rte_net_crc.h
@@ -26,8 +26,21 @@ enum rte_net_crc_alg {
 	RTE_NET_CRC_AVX512,
 };
 
+/** CRC context (algorithm, type) */
+struct rte_net_crc;
+
 /**
- * This API set the CRC computation algorithm (i.e. scalar version,
+ * Frees the memory space pointed to by the CRC context pointer.
+ * If the pointer is NULL, the function does nothing.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
+ */
+void
+rte_net_crc_free(struct rte_net_crc *crc);
+
+/**
+ * This API sets the CRC context (i.e. scalar version,
  * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
  * structure.
  *
@@ -37,27 +50,36 @@ enum rte_net_crc_alg {
  *   - RTE_NET_CRC_SSE42 (Use 64-bit SSE4.2 intrinsic)
  *   - RTE_NET_CRC_NEON (Use ARM Neon intrinsic)
  *   - RTE_NET_CRC_AVX512 (Use 512-bit AVX intrinsic)
+ * @param type
+ *   CRC type (enum rte_net_crc_type)
+ *
+ * @return
+ *   Pointer to the CRC context
  */
-void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg);
+struct rte_net_crc *
+rte_net_crc_set_alg(enum rte_net_crc_alg alg, enum rte_net_crc_type type)
+	__rte_malloc __rte_dealloc(rte_net_crc_free, 1);
 
 /**
  * CRC compute API
  *
+ * Note:
+ * The command line argument --force-max-simd-bitwidth will be ignored
+ * by processes that have not created this CRC context.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
  * @param data
  *   Pointer to the packet data for CRC computation
  * @param data_len
  *   Data length for CRC computation
- * @param type
- *   CRC type (enum rte_net_crc_type)
  *
  * @return
  *   CRC value
  */
 uint32_t
-rte_net_crc_calc(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type);
+rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 
 #ifdef __cplusplus
 }
diff --git a/lib/net/version.map b/lib/net/version.map
index bec4ce23ea..d03f3f6ad0 100644
--- a/lib/net/version.map
+++ b/lib/net/version.map
@@ -5,6 +5,7 @@ DPDK_25 {
 	rte_ether_format_addr;
 	rte_ether_unformat_addr;
 	rte_net_crc_calc;
+	rte_net_crc_free;
 	rte_net_crc_set_alg;
 	rte_net_get_ptype;
 	rte_net_make_rarp_packet;
@@ -12,3 +13,8 @@ DPDK_25 {
 
 	local: *;
 };
+
+DPDK_26 {
+	rte_net_crc_calc;
+	rte_net_crc_set_alg;
+} DPDK_25;
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* [PATCH v5 3/4] drivers: move iavf common folder to iavf net
    2025-02-10 16:44  3%   ` [PATCH v5 1/4] drivers: merge common and net " Bruce Richardson
@ 2025-02-10 16:44  2%   ` Bruce Richardson
  2025-02-11 14:12  0%     ` Stokes, Ian
  1 sibling, 1 reply; 200+ results
From: Bruce Richardson @ 2025-02-10 16:44 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

The common/iavf driver folder contains the base code for the iavf
driver, which is also linked against by the ice driver and others.
However, there is no need for this to be in common, and we can
move it to net/intel/iavf as a base code driver. This involves
updating dependencies that were on common/iavf to net/iavf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 devtools/libabigail.abignore                       |  1 +
 doc/guides/rel_notes/release_25_03.rst             |  5 ++++-
 drivers/common/iavf/version.map                    | 13 -------------
 drivers/common/meson.build                         |  1 -
 .../{common/iavf => net/intel/iavf/base}/README    |  0
 .../iavf => net/intel/iavf/base}/iavf_adminq.c     |  0
 .../iavf => net/intel/iavf/base}/iavf_adminq.h     |  0
 .../iavf => net/intel/iavf/base}/iavf_adminq_cmd.h |  0
 .../iavf => net/intel/iavf/base}/iavf_alloc.h      |  0
 .../iavf => net/intel/iavf/base}/iavf_common.c     |  0
 .../iavf => net/intel/iavf/base}/iavf_devids.h     |  0
 .../iavf => net/intel/iavf/base}/iavf_impl.c       |  0
 .../iavf => net/intel/iavf/base}/iavf_osdep.h      |  0
 .../iavf => net/intel/iavf/base}/iavf_prototype.h  |  0
 .../iavf => net/intel/iavf/base}/iavf_register.h   |  0
 .../iavf => net/intel/iavf/base}/iavf_status.h     |  0
 .../iavf => net/intel/iavf/base}/iavf_type.h       |  0
 .../iavf => net/intel/iavf/base}/meson.build       |  0
 .../iavf => net/intel/iavf/base}/virtchnl.h        |  0
 .../intel/iavf/base}/virtchnl_inline_ipsec.h       |  0
 drivers/net/intel/iavf/meson.build                 | 13 +++++++++----
 drivers/net/intel/iavf/version.map                 | 14 ++++++++++++++
 drivers/net/intel/ice/meson.build                  |  7 +++----
 drivers/net/intel/idpf/meson.build                 |  2 +-
 24 files changed, 32 insertions(+), 24 deletions(-)
 delete mode 100644 drivers/common/iavf/version.map
 rename drivers/{common/iavf => net/intel/iavf/base}/README (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq.c (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_adminq_cmd.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_alloc.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_common.c (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_devids.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_impl.c (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_osdep.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_prototype.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_register.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_status.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/iavf_type.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/meson.build (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/virtchnl.h (100%)
 rename drivers/{common/iavf => net/intel/iavf/base}/virtchnl_inline_ipsec.h (100%)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index b7daca4841..ce501632b3 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -23,6 +23,7 @@
 ; This is not a libabigail rule (see check-abi.sh).
 ; This is used for driver removal and other special cases like mlx glue libs.
 ;
+; SKIP_LIBRARY=librte_common_iavf
 ; SKIP_LIBRARY=librte_common_idpf
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 2338a97e76..d2e8b03107 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -182,9 +182,12 @@ API Changes
   ``-Denable_drivers=net/intel/e1000``.
 
 * The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
-  This change should have no impact to end applications, but,
+  Similarly, the ``common/iavf`` driver has been merged into the ``net/intel/iavf`` driver.
+  These changes should have no impact to end applications, but,
   when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-Denable_drivers`` option,
   there is no longer any need to also specify the ``common/idpf`` driver.
+  In the same way, when specifying the ``iavf`` or ``ice`` net drivers,
+  there is no need to also specify the ``common/iavf`` driver.
   Note, however, ``net/intel/cpfl`` driver now depends upon the ``net/intel/idpf`` driver.
 
 
diff --git a/drivers/common/iavf/version.map b/drivers/common/iavf/version.map
deleted file mode 100644
index 6c1427cca4..0000000000
--- a/drivers/common/iavf/version.map
+++ /dev/null
@@ -1,13 +0,0 @@
-INTERNAL {
-	global:
-
-	iavf_aq_send_msg_to_pf;
-	iavf_clean_arq_element;
-	iavf_init_adminq;
-	iavf_set_mac_type;
-	iavf_shutdown_adminq;
-	iavf_vf_parse_hw_config;
-	iavf_vf_reset;
-
-	local: *;
-};
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index e1e3149d8f..dc096aab0a 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -5,7 +5,6 @@ std_deps = ['eal']
 drivers = [
         'cpt',
         'dpaax',
-        'iavf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/common/iavf/README b/drivers/net/intel/iavf/base/README
similarity index 100%
rename from drivers/common/iavf/README
rename to drivers/net/intel/iavf/base/README
diff --git a/drivers/common/iavf/iavf_adminq.c b/drivers/net/intel/iavf/base/iavf_adminq.c
similarity index 100%
rename from drivers/common/iavf/iavf_adminq.c
rename to drivers/net/intel/iavf/base/iavf_adminq.c
diff --git a/drivers/common/iavf/iavf_adminq.h b/drivers/net/intel/iavf/base/iavf_adminq.h
similarity index 100%
rename from drivers/common/iavf/iavf_adminq.h
rename to drivers/net/intel/iavf/base/iavf_adminq.h
diff --git a/drivers/common/iavf/iavf_adminq_cmd.h b/drivers/net/intel/iavf/base/iavf_adminq_cmd.h
similarity index 100%
rename from drivers/common/iavf/iavf_adminq_cmd.h
rename to drivers/net/intel/iavf/base/iavf_adminq_cmd.h
diff --git a/drivers/common/iavf/iavf_alloc.h b/drivers/net/intel/iavf/base/iavf_alloc.h
similarity index 100%
rename from drivers/common/iavf/iavf_alloc.h
rename to drivers/net/intel/iavf/base/iavf_alloc.h
diff --git a/drivers/common/iavf/iavf_common.c b/drivers/net/intel/iavf/base/iavf_common.c
similarity index 100%
rename from drivers/common/iavf/iavf_common.c
rename to drivers/net/intel/iavf/base/iavf_common.c
diff --git a/drivers/common/iavf/iavf_devids.h b/drivers/net/intel/iavf/base/iavf_devids.h
similarity index 100%
rename from drivers/common/iavf/iavf_devids.h
rename to drivers/net/intel/iavf/base/iavf_devids.h
diff --git a/drivers/common/iavf/iavf_impl.c b/drivers/net/intel/iavf/base/iavf_impl.c
similarity index 100%
rename from drivers/common/iavf/iavf_impl.c
rename to drivers/net/intel/iavf/base/iavf_impl.c
diff --git a/drivers/common/iavf/iavf_osdep.h b/drivers/net/intel/iavf/base/iavf_osdep.h
similarity index 100%
rename from drivers/common/iavf/iavf_osdep.h
rename to drivers/net/intel/iavf/base/iavf_osdep.h
diff --git a/drivers/common/iavf/iavf_prototype.h b/drivers/net/intel/iavf/base/iavf_prototype.h
similarity index 100%
rename from drivers/common/iavf/iavf_prototype.h
rename to drivers/net/intel/iavf/base/iavf_prototype.h
diff --git a/drivers/common/iavf/iavf_register.h b/drivers/net/intel/iavf/base/iavf_register.h
similarity index 100%
rename from drivers/common/iavf/iavf_register.h
rename to drivers/net/intel/iavf/base/iavf_register.h
diff --git a/drivers/common/iavf/iavf_status.h b/drivers/net/intel/iavf/base/iavf_status.h
similarity index 100%
rename from drivers/common/iavf/iavf_status.h
rename to drivers/net/intel/iavf/base/iavf_status.h
diff --git a/drivers/common/iavf/iavf_type.h b/drivers/net/intel/iavf/base/iavf_type.h
similarity index 100%
rename from drivers/common/iavf/iavf_type.h
rename to drivers/net/intel/iavf/base/iavf_type.h
diff --git a/drivers/common/iavf/meson.build b/drivers/net/intel/iavf/base/meson.build
similarity index 100%
rename from drivers/common/iavf/meson.build
rename to drivers/net/intel/iavf/base/meson.build
diff --git a/drivers/common/iavf/virtchnl.h b/drivers/net/intel/iavf/base/virtchnl.h
similarity index 100%
rename from drivers/common/iavf/virtchnl.h
rename to drivers/net/intel/iavf/base/virtchnl.h
diff --git a/drivers/common/iavf/virtchnl_inline_ipsec.h b/drivers/net/intel/iavf/base/virtchnl_inline_ipsec.h
similarity index 100%
rename from drivers/common/iavf/virtchnl_inline_ipsec.h
rename to drivers/net/intel/iavf/base/virtchnl_inline_ipsec.h
diff --git a/drivers/net/intel/iavf/meson.build b/drivers/net/intel/iavf/meson.build
index d9b605f55a..c823d618e3 100644
--- a/drivers/net/intel/iavf/meson.build
+++ b/drivers/net/intel/iavf/meson.build
@@ -7,9 +7,13 @@ endif
 
 testpmd_sources = files('iavf_testpmd.c')
 
-deps += ['common_iavf', 'security', 'cryptodev']
+deps += ['security', 'cryptodev']
 
 sources = files(
+        'base/iavf_adminq.c',
+        'base/iavf_common.c',
+        'base/iavf_impl.c',
+
         'iavf_ethdev.c',
         'iavf_rxtx.c',
         'iavf_vchnl.c',
@@ -20,8 +24,9 @@ sources = files(
         'iavf_ipsec_crypto.c',
         'iavf_fsub.c',
 )
+includes += include_directories('base')
 
-if arch_subdir == 'x86' and is_variable('static_rte_common_iavf')
+if arch_subdir == 'x86'
     sources += files('iavf_rxtx_vec_sse.c')
 
     if is_windows and cc.get_id() != 'clang'
@@ -30,7 +35,7 @@ if arch_subdir == 'x86' and is_variable('static_rte_common_iavf')
 
     iavf_avx2_lib = static_library('iavf_avx2_lib',
             'iavf_rxtx_vec_avx2.c',
-            dependencies: [static_rte_ethdev, static_rte_common_iavf],
+            dependencies: [static_rte_ethdev],
             include_directories: includes,
             c_args: [cflags, '-mavx2'])
     objs += iavf_avx2_lib.extract_objects('iavf_rxtx_vec_avx2.c')
@@ -43,7 +48,7 @@ if arch_subdir == 'x86' and is_variable('static_rte_common_iavf')
         endif
         iavf_avx512_lib = static_library('iavf_avx512_lib',
                 'iavf_rxtx_vec_avx512.c',
-                dependencies: [static_rte_ethdev, static_rte_common_iavf],
+                dependencies: [static_rte_ethdev],
                 include_directories: includes,
                 c_args: avx512_args)
         objs += iavf_avx512_lib.extract_objects('iavf_rxtx_vec_avx512.c')
diff --git a/drivers/net/intel/iavf/version.map b/drivers/net/intel/iavf/version.map
index 98de64cca2..d18dea64dd 100644
--- a/drivers/net/intel/iavf/version.map
+++ b/drivers/net/intel/iavf/version.map
@@ -17,3 +17,17 @@ EXPERIMENTAL {
 	# added in 21.11
 	rte_pmd_ifd_dynflag_proto_xtr_ipsec_crypto_said_mask;
 };
+
+INTERNAL {
+	global:
+
+	iavf_aq_send_msg_to_pf;
+	iavf_clean_arq_element;
+	iavf_init_adminq;
+	iavf_set_mac_type;
+	iavf_shutdown_adminq;
+	iavf_vf_parse_hw_config;
+	iavf_vf_reset;
+
+	local: *;
+};
diff --git a/drivers/net/intel/ice/meson.build b/drivers/net/intel/ice/meson.build
index beaf21e176..5faf887386 100644
--- a/drivers/net/intel/ice/meson.build
+++ b/drivers/net/intel/ice/meson.build
@@ -18,7 +18,7 @@ sources = files(
 
 testpmd_sources = files('ice_testpmd.c')
 
-deps += ['hash', 'net', 'common_iavf']
+deps += ['hash', 'net', 'net_iavf']
 includes += include_directories('base')
 
 if arch_subdir == 'x86'
@@ -30,7 +30,7 @@ if arch_subdir == 'x86'
 
     ice_avx2_lib = static_library('ice_avx2_lib',
             'ice_rxtx_vec_avx2.c',
-            dependencies: [static_rte_ethdev, static_rte_kvargs, static_rte_hash],
+            dependencies: [static_rte_ethdev, static_rte_hash],
             include_directories: includes,
             c_args: [cflags, '-mavx2'])
     objs += ice_avx2_lib.extract_objects('ice_rxtx_vec_avx2.c')
@@ -43,8 +43,7 @@ if arch_subdir == 'x86'
         endif
         ice_avx512_lib = static_library('ice_avx512_lib',
                 'ice_rxtx_vec_avx512.c',
-                dependencies: [static_rte_ethdev,
-                    static_rte_kvargs, static_rte_hash],
+                dependencies: [static_rte_ethdev, static_rte_hash],
                 include_directories: includes,
                 c_args: avx512_args)
         objs += ice_avx512_lib.extract_objects('ice_rxtx_vec_avx512.c')
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 87bc39f76e..d69254484b 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,7 +7,7 @@ if is_windows
     subdir_done()
 endif
 
-includes += include_directories('../../../common/iavf')
+includes += include_directories('../iavf/base')
 
 sources = files(
         'idpf_common_device.c',
-- 
2.43.0


^ permalink raw reply	[relevance 2%]

* [PATCH v5 1/4] drivers: merge common and net idpf drivers
  @ 2025-02-10 16:44  3%   ` Bruce Richardson
  2025-02-10 16:44  2%   ` [PATCH v5 3/4] drivers: move iavf common folder to iavf net Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2025-02-10 16:44 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Praveen Shetty

Rather than having some of the idpf code split out into the "common"
directory, used by both a net/idpf and a net/cpfl driver, we can
merge all idpf code together under net/idpf and have the cpfl driver
depend on "net/idpf" rather than "common/idpf".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Praveen Shetty <praveen.shetty@intel.com>
---
 devtools/libabigail.abignore                  |  1 +
 doc/guides/rel_notes/release_25_03.rst        |  6 +++
 drivers/common/idpf/meson.build               | 41 -------------------
 drivers/common/meson.build                    |  1 -
 drivers/net/intel/cpfl/meson.build            |  2 +-
 .../{common => net/intel}/idpf/base/README    |  0
 .../intel}/idpf/base/idpf_alloc.h             |  0
 .../intel}/idpf/base/idpf_controlq.c          |  0
 .../intel}/idpf/base/idpf_controlq.h          |  0
 .../intel}/idpf/base/idpf_controlq_api.h      |  0
 .../intel}/idpf/base/idpf_controlq_setup.c    |  0
 .../intel}/idpf/base/idpf_devids.h            |  0
 .../intel}/idpf/base/idpf_lan_pf_regs.h       |  0
 .../intel}/idpf/base/idpf_lan_txrx.h          |  0
 .../intel}/idpf/base/idpf_lan_vf_regs.h       |  0
 .../intel}/idpf/base/idpf_osdep.h             |  0
 .../intel}/idpf/base/idpf_prototype.h         |  0
 .../intel}/idpf/base/idpf_type.h              |  0
 .../intel}/idpf/base/meson.build              |  0
 .../intel}/idpf/base/siov_regs.h              |  0
 .../intel}/idpf/base/virtchnl2.h              |  0
 .../intel}/idpf/base/virtchnl2_lan_desc.h     |  0
 .../intel}/idpf/idpf_common_device.c          |  0
 .../intel}/idpf/idpf_common_device.h          |  0
 .../intel}/idpf/idpf_common_logs.h            |  0
 .../intel}/idpf/idpf_common_rxtx.c            |  0
 .../intel}/idpf/idpf_common_rxtx.h            |  0
 .../intel}/idpf/idpf_common_rxtx_avx2.c       |  0
 .../intel}/idpf/idpf_common_rxtx_avx512.c     |  0
 .../intel}/idpf/idpf_common_virtchnl.c        |  0
 .../intel}/idpf/idpf_common_virtchnl.h        |  0
 drivers/net/intel/idpf/meson.build            | 31 ++++++++++++--
 .../{common => net/intel}/idpf/version.map    |  0
 drivers/net/meson.build                       |  2 +-
 34 files changed, 37 insertions(+), 47 deletions(-)
 delete mode 100644 drivers/common/idpf/meson.build
 rename drivers/{common => net/intel}/idpf/base/README (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_alloc.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_api.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_setup.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_devids.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_pf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_txrx.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_vf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_osdep.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_prototype.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_type.h (100%)
 rename drivers/{common => net/intel}/idpf/base/meson.build (100%)
 rename drivers/{common => net/intel}/idpf/base/siov_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2_lan_desc.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_logs.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx2.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx512.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.h (100%)
 rename drivers/{common => net/intel}/idpf/version.map (100%)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 9ae1a36c3a..b7daca4841 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -23,6 +23,7 @@
 ; This is not a libabigail rule (see check-abi.sh).
 ; This is used for driver removal and other special cases like mlx glue libs.
 ;
+; SKIP_LIBRARY=librte_common_idpf
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
 ; SKIP_LIBRARY=librte_net_igc
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 0125235fbd..2338a97e76 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -181,6 +181,12 @@ API Changes
   has changed from ``-Denable_drivers=net/igc`` to
   ``-Denable_drivers=net/intel/e1000``.
 
+* The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
+  This change should have no impact on end applications, but,
+  when specifying the ``idpf`` or ``cpfl`` net drivers to meson via the ``-Denable_drivers`` option,
+  there is no longer any need to also specify the ``common/idpf`` driver.
+  Note, however, that the ``net/intel/cpfl`` driver now depends upon the ``net/intel/idpf`` driver.
+
 
 ABI Changes
 -----------
diff --git a/drivers/common/idpf/meson.build b/drivers/common/idpf/meson.build
deleted file mode 100644
index 0a30c7e601..0000000000
--- a/drivers/common/idpf/meson.build
+++ /dev/null
@@ -1,41 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2022 Intel Corporation
-
-if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
-    subdir_done()
-endif
-
-includes += include_directories('../iavf')
-
-deps += ['mbuf']
-
-sources = files(
-        'idpf_common_device.c',
-        'idpf_common_rxtx.c',
-        'idpf_common_virtchnl.c',
-)
-
-if arch_subdir == 'x86'
-    idpf_avx2_lib = static_library('idpf_avx2_lib',
-        'idpf_common_rxtx_avx2.c',
-        dependencies: [static_rte_ethdev, static_rte_hash],
-        include_directories: includes,
-        c_args: [cflags, '-mavx2'])
-    objs += idpf_avx2_lib.extract_objects('idpf_common_rxtx_avx2.c')
-
-    if cc_has_avx512
-        cflags += ['-DCC_AVX512_SUPPORT']
-        avx512_args = cflags + cc_avx512_flags
-        if cc.has_argument('-march=skylake-avx512')
-            avx512_args += '-march=skylake-avx512'
-        endif
-        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
-                'idpf_common_rxtx_avx512.c',
-                dependencies: [static_rte_mbuf,],
-                include_directories: includes,
-                c_args: avx512_args)
-        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
-    endif
-endif
-
-subdir('base')
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 8734af36aa..e1e3149d8f 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -6,7 +6,6 @@ drivers = [
         'cpt',
         'dpaax',
         'iavf',
-        'idpf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/net/intel/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
index 87fcfe0bb1..1f0269d50b 100644
--- a/drivers/net/intel/cpfl/meson.build
+++ b/drivers/net/intel/cpfl/meson.build
@@ -11,7 +11,7 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-deps += ['hash', 'common_idpf']
+deps += ['hash', 'net_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
diff --git a/drivers/common/idpf/base/README b/drivers/net/intel/idpf/base/README
similarity index 100%
rename from drivers/common/idpf/base/README
rename to drivers/net/intel/idpf/base/README
diff --git a/drivers/common/idpf/base/idpf_alloc.h b/drivers/net/intel/idpf/base/idpf_alloc.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_alloc.h
rename to drivers/net/intel/idpf/base/idpf_alloc.h
diff --git a/drivers/common/idpf/base/idpf_controlq.c b/drivers/net/intel/idpf/base/idpf_controlq.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.c
rename to drivers/net/intel/idpf/base/idpf_controlq.c
diff --git a/drivers/common/idpf/base/idpf_controlq.h b/drivers/net/intel/idpf/base/idpf_controlq.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.h
rename to drivers/net/intel/idpf/base/idpf_controlq.h
diff --git a/drivers/common/idpf/base/idpf_controlq_api.h b/drivers/net/intel/idpf/base/idpf_controlq_api.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_api.h
rename to drivers/net/intel/idpf/base/idpf_controlq_api.h
diff --git a/drivers/common/idpf/base/idpf_controlq_setup.c b/drivers/net/intel/idpf/base/idpf_controlq_setup.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_setup.c
rename to drivers/net/intel/idpf/base/idpf_controlq_setup.c
diff --git a/drivers/common/idpf/base/idpf_devids.h b/drivers/net/intel/idpf/base/idpf_devids.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_devids.h
rename to drivers/net/intel/idpf/base/idpf_devids.h
diff --git a/drivers/common/idpf/base/idpf_lan_pf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_pf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
diff --git a/drivers/common/idpf/base/idpf_lan_txrx.h b/drivers/net/intel/idpf/base/idpf_lan_txrx.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_txrx.h
rename to drivers/net/intel/idpf/base/idpf_lan_txrx.h
diff --git a/drivers/common/idpf/base/idpf_lan_vf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_vf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/net/intel/idpf/base/idpf_osdep.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_osdep.h
rename to drivers/net/intel/idpf/base/idpf_osdep.h
diff --git a/drivers/common/idpf/base/idpf_prototype.h b/drivers/net/intel/idpf/base/idpf_prototype.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_prototype.h
rename to drivers/net/intel/idpf/base/idpf_prototype.h
diff --git a/drivers/common/idpf/base/idpf_type.h b/drivers/net/intel/idpf/base/idpf_type.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_type.h
rename to drivers/net/intel/idpf/base/idpf_type.h
diff --git a/drivers/common/idpf/base/meson.build b/drivers/net/intel/idpf/base/meson.build
similarity index 100%
rename from drivers/common/idpf/base/meson.build
rename to drivers/net/intel/idpf/base/meson.build
diff --git a/drivers/common/idpf/base/siov_regs.h b/drivers/net/intel/idpf/base/siov_regs.h
similarity index 100%
rename from drivers/common/idpf/base/siov_regs.h
rename to drivers/net/intel/idpf/base/siov_regs.h
diff --git a/drivers/common/idpf/base/virtchnl2.h b/drivers/net/intel/idpf/base/virtchnl2.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2.h
rename to drivers/net/intel/idpf/base/virtchnl2.h
diff --git a/drivers/common/idpf/base/virtchnl2_lan_desc.h b/drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2_lan_desc.h
rename to drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
diff --git a/drivers/common/idpf/idpf_common_device.c b/drivers/net/intel/idpf/idpf_common_device.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.c
rename to drivers/net/intel/idpf/idpf_common_device.c
diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/net/intel/idpf/idpf_common_device.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.h
rename to drivers/net/intel/idpf/idpf_common_device.h
diff --git a/drivers/common/idpf/idpf_common_logs.h b/drivers/net/intel/idpf/idpf_common_logs.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_logs.h
rename to drivers/net/intel/idpf/idpf_common_logs.h
diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/net/intel/idpf/idpf_common_rxtx.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.c
rename to drivers/net/intel/idpf/idpf_common_rxtx.c
diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/net/intel/idpf/idpf_common_rxtx.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.h
rename to drivers/net/intel/idpf/idpf_common_rxtx.h
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx2.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx2.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx2.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx2.c
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx512.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.c b/drivers/net/intel/idpf/idpf_common_virtchnl.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.c
rename to drivers/net/intel/idpf/idpf_common_virtchnl.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.h b/drivers/net/intel/idpf/idpf_common_virtchnl.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.h
rename to drivers/net/intel/idpf/idpf_common_virtchnl.h
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 34cbdc4da0..87bc39f76e 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,13 +7,38 @@ if is_windows
     subdir_done()
 endif
 
-deps += ['common_idpf']
+includes += include_directories('../../../common/iavf')
 
 sources = files(
+        'idpf_common_device.c',
+        'idpf_common_rxtx.c',
+        'idpf_common_virtchnl.c',
+
         'idpf_ethdev.c',
         'idpf_rxtx.c',
 )
 
-if arch_subdir == 'x86'and cc_has_avx512
-    cflags += ['-DCC_AVX512_SUPPORT']
+if arch_subdir == 'x86'
+    idpf_avx2_lib = static_library('idpf_avx2_lib',
+        'idpf_common_rxtx_avx2.c',
+        dependencies: [static_rte_ethdev, static_rte_hash],
+        include_directories: includes,
+        c_args: [cflags, '-mavx2'])
+    objs += idpf_avx2_lib.extract_objects('idpf_common_rxtx_avx2.c')
+
+    if cc_has_avx512
+        cflags += ['-DCC_AVX512_SUPPORT']
+        avx512_args = cflags + cc_avx512_flags
+        if cc.has_argument('-march=skylake-avx512')
+            avx512_args += '-march=skylake-avx512'
+        endif
+        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
+                'idpf_common_rxtx_avx512.c',
+                dependencies: static_rte_mbuf,
+                include_directories: includes,
+                c_args: avx512_args)
+        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
+    endif
 endif
+
+subdir('base')
diff --git a/drivers/common/idpf/version.map b/drivers/net/intel/idpf/version.map
similarity index 100%
rename from drivers/common/idpf/version.map
rename to drivers/net/intel/idpf/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index cdc3e7e664..460eb69e5b 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,7 +24,6 @@ drivers = [
         'gve',
         'hinic',
         'hns3',
-        'intel/cpfl',
         'intel/e1000',
         'intel/fm10k',
         'intel/i40e',
@@ -33,6 +32,7 @@ drivers = [
         'intel/idpf',
         'intel/ipn3ke',
         'intel/ixgbe',
+        'intel/cpfl',  # depends on idpf, so must come after it
         'ionic',
         'mana',
         'memif',
-- 
2.43.0


^ permalink raw reply	[relevance 3%]

* [PATCH v5] net: add thread-safe crc api
  2025-02-07  6:37  4%     ` [PATCH v4] " Arkadiusz Kusztal
@ 2025-02-07 18:24  4%       ` Arkadiusz Kusztal
  2025-02-10 21:27  4%         ` [PATCH v6] " Arkadiusz Kusztal
  0 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2025-02-07 18:24 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, kai.ji, brian.dooley, stephen, Arkadiusz Kusztal

The current net CRC API is not thread-safe; this patch
solves this by adding new, thread-safe API functions.
This API is also safe to use across multiple processes,
yet with limitations on max-simd-bitwidth, which will be checked only by
the process that created the CRC context; all other processes
(that did not create the context) will use the highest possible
SIMD extension that was built with the binary, but no higher than the one
requested by the CRC context.

Since the change of the API at this point is an ABI break,
these API symbols are versioned with the _26 suffix.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
v2:
- added multi-process safety
v3:
- made the crc context opaque
- versioned old APIs
v4:
- exported rte_net_crc_free symbol
v5:
- fixed unclear comments in release notes section
- aligned `fall-through` comments

 app/test/test_crc.c                    | 169 ++++++++++---------------
 doc/guides/rel_notes/release_25_03.rst |   5 +
 drivers/crypto/qat/qat_sym.h           |   6 +-
 drivers/crypto/qat/qat_sym_session.c   |   8 ++
 drivers/crypto/qat/qat_sym_session.h   |   2 +
 lib/net/meson.build                    |   2 +
 lib/net/net_crc.h                      |  18 ++-
 lib/net/rte_net_crc.c                  | 130 ++++++++++++++++++-
 lib/net/rte_net_crc.h                  |  39 ++++--
 lib/net/version.map                    |   6 +
 10 files changed, 268 insertions(+), 117 deletions(-)

diff --git a/app/test/test_crc.c b/app/test/test_crc.c
index b85fca35fe..d7a11e8025 100644
--- a/app/test/test_crc.c
+++ b/app/test/test_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include "test.h"
@@ -44,131 +44,100 @@ static const uint32_t crc32_vec_res = 0xb491aab4;
 static const uint32_t crc32_vec1_res = 0xac54d294;
 static const uint32_t crc32_vec2_res = 0xefaae02f;
 static const uint32_t crc16_vec_res = 0x6bec;
-static const uint16_t crc16_vec1_res = 0x8cdd;
-static const uint16_t crc16_vec2_res = 0xec5b;
+static const uint32_t crc16_vec1_res = 0x8cdd;
+static const uint32_t crc16_vec2_res = 0xec5b;
 
 static int
-crc_calc(const uint8_t *vec,
-	uint32_t vec_len,
-	enum rte_net_crc_type type)
+crc_all_algs(const char *desc, enum rte_net_crc_type type,
+	const uint8_t *data, int data_len, uint32_t res)
 {
-	/* compute CRC */
-	uint32_t ret = rte_net_crc_calc(vec, vec_len, type);
+	struct rte_net_crc *ctx;
+	uint32_t crc;
+	int ret = TEST_SUCCESS;
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SCALAR, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SCALAR\n", desc);
+		debug_hexdump(stdout, "SCALAR", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SSE42, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SSE42\n", desc);
+		debug_hexdump(stdout, "SSE", &crc, 4);
+		ret = TEST_FAILED;
+	}
 
-	/* dump data on console */
-	debug_hexdump(stdout, NULL, vec, vec_len);
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_AVX512, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s AVX512\n", desc);
+		debug_hexdump(stdout, "AVX512", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_NEON, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s NEON\n", desc);
+		debug_hexdump(stdout, "NEON", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
 
-	return  ret;
+	return ret;
 }
 
 static int
-test_crc_calc(void)
-{
+crc_autotest(void)
+{	uint8_t *test_data;
 	uint32_t i;
-	enum rte_net_crc_type type;
-	uint8_t *test_data;
-	uint32_t result;
-	int error;
+	int ret = TEST_SUCCESS;
 
 	/* 32-bit ethernet CRC: Test 1 */
-	type = RTE_NET_CRC32_ETH;
-
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc32_vec_res)
-		return -1;
+	ret = crc_all_algs("32-bit ethernet CRC: Test 1", RTE_NET_CRC32_ETH, crc_vec,
+		sizeof(crc_vec), crc32_vec_res);
 
 	/* 32-bit ethernet CRC: Test 2 */
 	test_data = rte_zmalloc(NULL, CRC32_VEC_LEN1, 0);
 	if (test_data == NULL)
 		return -7;
-
 	for (i = 0; i < CRC32_VEC_LEN1; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN1, type);
-	if (result != crc32_vec1_res) {
-		error = -2;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 2", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN1, crc32_vec1_res);
 
 	/* 32-bit ethernet CRC: Test 3 */
+	memset(test_data, 0, CRC32_VEC_LEN1);
 	for (i = 0; i < CRC32_VEC_LEN2; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN2, type);
-	if (result != crc32_vec2_res) {
-		error = -3;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 3", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN2, crc32_vec2_res);
 
 	/* 16-bit CCITT CRC:  Test 4 */
-	type = RTE_NET_CRC16_CCITT;
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc16_vec_res) {
-		error = -4;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 5 */
-	result = crc_calc(crc16_vec1, CRC16_VEC_LEN1, type);
-	if (result != crc16_vec1_res) {
-		error = -5;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 6 */
-	result = crc_calc(crc16_vec2, CRC16_VEC_LEN2, type);
-	if (result != crc16_vec2_res) {
-		error = -6;
-		goto fail;
-	}
-
-	rte_free(test_data);
-	return 0;
-
-fail:
-	rte_free(test_data);
-	return error;
-}
-
-static int
-test_crc(void)
-{
-	int ret;
-	/* set CRC scalar mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SCALAR);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (scalar): failed (%d)\n", ret);
-		return ret;
-	}
-	/* set CRC sse4.2 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SSE42);
+	crc_all_algs("16-bit CCITT CRC:  Test 4", RTE_NET_CRC16_CCITT, crc_vec,
+		sizeof(crc_vec), crc16_vec_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (x86_64_SSE4.2): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC avx512 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_AVX512);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (x86_64 AVX512): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC neon mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_NEON);
+	/* 16-bit CCITT CRC:  Test 5 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 5", RTE_NET_CRC16_CCITT, crc16_vec1,
+		CRC16_VEC_LEN1, crc16_vec1_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (arm64 neon pmull): failed (%d)\n", ret);
-		return ret;
-	}
+	/* 16-bit CCITT CRC:  Test 6 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 6", RTE_NET_CRC16_CCITT, crc16_vec2,
+		CRC16_VEC_LEN2, crc16_vec2_res);
 
-	return 0;
+	return ret;
 }
 
-REGISTER_FAST_TEST(crc_autotest, true, true, test_crc);
+REGISTER_FAST_TEST(crc_autotest, true, true, crc_autotest);
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 2b139fc35b..1b79470077 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -161,6 +161,11 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* net: Changed the API for CRC calculation to be thread safe.
+  An opaque context argument was introduced to the net CRC API containing
+  the algorithm type and length. This argument is added
+  to ``rte_net_crc_calc``, ``rte_net_crc_set_alg`` and freed with ``rte_net_crc_free``.
+  These functions are versioned to retain binary compatibility until the next LTS release.
 
 ABI Changes
 -----------
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f42336d7ed..849e047615 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -267,8 +267,7 @@ qat_crc_verify(struct qat_sym_session *ctx, struct rte_crypto_op *op)
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				crc_data_ofs);
 
-		crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 
 		if (crc != *(uint32_t *)(crc_data + crc_data_len))
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
@@ -291,8 +290,7 @@ qat_crc_generate(struct qat_sym_session *ctx,
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				sym_op->auth.data.offset);
 		crc = (uint32_t *)(crc_data + crc_data_len);
-		*crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		*crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 	}
 }
 
diff --git a/drivers/crypto/qat/qat_sym_session.c b/drivers/crypto/qat/qat_sym_session.c
index 50d687fd37..7200022adf 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -3174,6 +3174,14 @@ qat_sec_session_set_docsis_parameters(struct rte_cryptodev *dev,
 		ret = qat_sym_session_configure_crc(dev, xform, session);
 		if (ret < 0)
 			return ret;
+	} else {
+		/* Initialize crc algorithm */
+		session->crc = rte_net_crc_set_alg(RTE_NET_CRC_AVX512,
+			RTE_NET_CRC32_ETH);
+		if (session->crc == NULL) {
+			QAT_LOG(ERR, "Cannot initialize CRC context");
+			return -1;
+		}
 	}
 	qat_sym_session_finalize(session);
 
diff --git a/drivers/crypto/qat/qat_sym_session.h b/drivers/crypto/qat/qat_sym_session.h
index 2ca6c8ddf5..2ef2066646 100644
--- a/drivers/crypto/qat/qat_sym_session.h
+++ b/drivers/crypto/qat/qat_sym_session.h
@@ -7,6 +7,7 @@
 #include <rte_crypto.h>
 #include <cryptodev_pmd.h>
 #include <rte_security.h>
+#include <rte_net_crc.h>
 
 #include "qat_common.h"
 #include "icp_qat_hw.h"
@@ -149,6 +150,7 @@ struct qat_sym_session {
 	uint8_t is_zuc256;
 	uint8_t is_wireless;
 	uint32_t slice_types;
+	struct rte_net_crc *crc;
 	enum qat_sym_proto_flag qat_proto_flag;
 	qat_sym_build_request_t build_request[2];
 #ifndef RTE_QAT_OPENSSL
diff --git a/lib/net/meson.build b/lib/net/meson.build
index 8afcc4ed37..b26b377e8e 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2020 Intel Corporation
 
+use_function_versioning=true
+
 headers = files(
         'rte_cksum.h',
         'rte_ip.h',
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 7a74d5406c..563ea809a9 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -1,10 +1,26 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020-2025 Intel Corporation
  */
 
 #ifndef _NET_CRC_H_
 #define _NET_CRC_H_
 
+#include "rte_net_crc.h"
+
+void
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
+
+struct rte_net_crc *
+rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v25(const void *data,
+	uint32_t data_len, enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 346c285c15..3a41df6eb9 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include <stddef.h>
@@ -10,6 +10,8 @@
 #include <rte_net_crc.h>
 #include <rte_log.h>
 #include <rte_vect.h>
+#include <rte_function_versioning.h>
+#include <rte_malloc.h>
 
 #include "net_crc.h"
 
@@ -38,11 +40,21 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
 typedef uint32_t
 (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
 
+struct rte_net_crc {
+	enum rte_net_crc_alg alg;
+	enum rte_net_crc_type type;
+};
+
 static rte_net_crc_handler handlers_default[] = {
 	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
 	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
 };
 
+static struct
+{
+	rte_net_crc_handler f[RTE_NET_CRC_REQS];
+} handlers_dpdk26[RTE_NET_CRC_AVX512 + 1];
+
 static const rte_net_crc_handler *handlers = handlers_default;
 
 static const rte_net_crc_handler handlers_scalar[] = {
@@ -286,10 +298,56 @@ rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
 	return handlers[RTE_NET_CRC32_ETH](data, data_len);
 }
 
+static void
+handlers_init(enum rte_net_crc_alg alg)
+{
+	handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler;
+	handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_avx512_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_avx512_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+		if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_sse42_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_sse42_handler;
+		}
+#endif
+		break;
+	case RTE_NET_CRC_NEON:
+#ifdef CC_ARM64_NEON_PMULL_SUPPORT
+		if (NEON_PMULL_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_neon_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_neon_handler;
+			break;
+		}
+#endif
+		/* fall-through */
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+}
+
 /* Public API */
 
 void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg)
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -316,9 +374,59 @@ rte_net_crc_set_alg(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
+VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+
+struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type)
+{
+	uint16_t max_simd_bitwidth;
+	struct rte_net_crc *crc;
+
+	crc = rte_zmalloc(NULL, sizeof(struct rte_net_crc), 0);
+	if (crc == NULL)
+		return NULL;
+	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
+	crc->type = type;
+	crc->alg = RTE_NET_CRC_SCALAR;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_512) {
+			crc->alg = RTE_NET_CRC_AVX512;
+			return crc;
+		}
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_SSE42;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_NEON:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_NEON;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+	return crc;
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
+MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
+	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
+	rte_net_crc_set_alg_v26);
+
+void rte_net_crc_free(struct rte_net_crc *crc)
+{
+	rte_free(crc);
+}
 
 uint32_t
-rte_net_crc_calc(const void *data,
+rte_net_crc_calc_v25(const void *data,
 	uint32_t data_len,
 	enum rte_net_crc_type type)
 {
@@ -330,6 +438,18 @@ rte_net_crc_calc(const void *data,
 
 	return ret;
 }
+VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len)
+{
+	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
+MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len),
+	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
@@ -338,4 +458,8 @@ RTE_INIT(rte_net_crc_init)
 	sse42_pclmulqdq_init();
 	avx512_vpclmulqdq_init();
 	neon_pmull_init();
+	handlers_init(RTE_NET_CRC_SCALAR);
+	handlers_init(RTE_NET_CRC_NEON);
+	handlers_init(RTE_NET_CRC_SSE42);
+	handlers_init(RTE_NET_CRC_AVX512);
 }
diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h
index 72d3e10ff6..ffac8c2f1f 100644
--- a/lib/net/rte_net_crc.h
+++ b/lib/net/rte_net_crc.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #ifndef _RTE_NET_CRC_H_
@@ -26,8 +26,11 @@ enum rte_net_crc_alg {
 	RTE_NET_CRC_AVX512,
 };
 
+/** CRC context (algorithm, type) */
+struct rte_net_crc;
+
 /**
- * This API set the CRC computation algorithm (i.e. scalar version,
+ * This API sets the CRC context (i.e. scalar version,
  * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
  * structure.
  *
@@ -37,27 +40,45 @@ enum rte_net_crc_alg {
  *   - RTE_NET_CRC_SSE42 (Use 64-bit SSE4.2 intrinsic)
  *   - RTE_NET_CRC_NEON (Use ARM Neon intrinsic)
  *   - RTE_NET_CRC_AVX512 (Use 512-bit AVX intrinsic)
+ * @param type
+ *   CRC type (enum rte_net_crc_type)
+ *
+ * @return
+ *   Pointer to the CRC context
  */
-void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg);
+struct rte_net_crc *
+rte_net_crc_set_alg(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
 
 /**
  * CRC compute API
  *
+ * Note:
+ * The command line argument --force-max-simd-bitwidth will be ignored
+ * by processes that have not created this CRC context.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
  * @param data
  *   Pointer to the packet data for CRC computation
  * @param data_len
  *   Data length for CRC computation
- * @param type
- *   CRC type (enum rte_net_crc_type)
  *
  * @return
  *   CRC value
  */
 uint32_t
-rte_net_crc_calc(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type);
+rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
+/**
+ * Frees the memory space pointed to by the CRC context pointer.
+ * If the pointer is NULL, the function does nothing.
+ *
+ * @param crc
+ *   Pointer to the CRC context
+ */
+void
+rte_net_crc_free(struct rte_net_crc *crc);
 
 #ifdef __cplusplus
 }
diff --git a/lib/net/version.map b/lib/net/version.map
index bec4ce23ea..d03f3f6ad0 100644
--- a/lib/net/version.map
+++ b/lib/net/version.map
@@ -5,6 +5,7 @@ DPDK_25 {
 	rte_ether_format_addr;
 	rte_ether_unformat_addr;
 	rte_net_crc_calc;
+	rte_net_crc_free;
 	rte_net_crc_set_alg;
 	rte_net_get_ptype;
 	rte_net_make_rarp_packet;
@@ -12,3 +13,8 @@ DPDK_25 {
 
 	local: *;
 };
+
+DPDK_26 {
+	rte_net_crc_calc;
+	rte_net_crc_set_alg;
+} DPDK_25;
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* RE: [PATCH v22 00/27] remove use of VLAs for Windows
  2025-02-06 20:44  4%   ` David Marchand
@ 2025-02-07 14:23  3%     ` Konstantin Ananyev
  2025-02-18 14:22  0%       ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Konstantin Ananyev @ 2025-02-07 14:23 UTC (permalink / raw)
  To: David Marchand, Andre Muezerie; +Cc: dev, thomas, honnappa.nagarahalli

Hi David,

> > As per guidance technical board meeting 2024/04/17. This series
> > removes the use of VLAs from code built for Windows for all 3
> > toolchains. If there are additional opportunities to convert VLAs
> > to regular C arrays please provide the details for incorporation
> > into the series.
> >
> > MSVC does not support VLAs, replace VLAs with standard C arrays
> > or alloca(). alloca() is available for all toolchain/platform
> > combinations officially supported by DPDK.
> 
> - I have one concern wrt patch 7.
> This changes the API/ABI of the RCU library.
> ABI can't be broken in the 25.03 release.
> 
> Since MSVC builds do not include RCU yet, I skipped this change and
> adjusted this libray meson.build.
> 
> Konstantin, do you think patch 7 could be rewritten to make use of
> alloca() and avoid an API change?
> https://patchwork.dpdk.org/project/dpdk/patch/1738805610-17507-8-git-send-email-andremue@linux.microsoft.com/

I am not a big fan of the alloca() approach, but yes, it is surely possible.
BTW, why is it considered an API/ABI change?
Because we introduce an extra limit on the max allowable size?
If that would help somehow, we can make it even bigger: 1K or so. 

> 
> - There is also some VLA in examples/l2fwd-cat, so I had to adjust
> this example meson.build accordingly.
> 
> Series applied, thanks André.
> 
> 
> --
> David Marchand
> 


^ permalink raw reply	[relevance 3%]

* [PATCH v3 02/36] net/igc: merge with net/e1000
  @ 2025-02-07 12:44  1%   ` Anatoly Burakov
  0 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2025-02-07 12:44 UTC (permalink / raw)
  To: dev, Thomas Monjalon

IGC and E1000 drivers are derived from the same base code. Now that e1000
code has enabled support for i225 devices, move IGC ethdev code to e1000
directory (renaming references to base code from igc_* to e1000_*).

This patch also disables build of igc driver, as it is no longer able to
be built because the ethdev part was moved to e1000. It is effectively
removed from DPDK at this point.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 MAINTAINERS                                   |   8 +-
 devtools/libabigail.abignore                  |   1 +
 doc/guides/nics/igc.rst                       |   2 +-
 doc/guides/rel_notes/release_25_03.rst        |  13 +-
 drivers/net/intel/e1000/base/README           |   4 +-
 drivers/net/intel/{igc => e1000}/igc_ethdev.c | 910 +++++++++---------
 drivers/net/intel/{igc => e1000}/igc_ethdev.h |  32 +-
 drivers/net/intel/{igc => e1000}/igc_filter.c |  84 +-
 drivers/net/intel/{igc => e1000}/igc_filter.h |   0
 drivers/net/intel/{igc => e1000}/igc_flow.c   |   2 +-
 drivers/net/intel/{igc => e1000}/igc_flow.h   |   0
 drivers/net/intel/{igc => e1000}/igc_logs.c   |   2 +-
 drivers/net/intel/{igc => e1000}/igc_txrx.c   | 376 ++++----
 drivers/net/intel/{igc => e1000}/igc_txrx.h   |   6 +-
 drivers/net/intel/e1000/meson.build           |  11 +
 drivers/net/meson.build                       |   1 -
 16 files changed, 736 insertions(+), 716 deletions(-)
 rename drivers/net/intel/{igc => e1000}/igc_ethdev.c (73%)
 rename drivers/net/intel/{igc => e1000}/igc_ethdev.h (91%)
 rename drivers/net/intel/{igc => e1000}/igc_filter.c (81%)
 rename drivers/net/intel/{igc => e1000}/igc_filter.h (100%)
 rename drivers/net/intel/{igc => e1000}/igc_flow.c (99%)
 rename drivers/net/intel/{igc => e1000}/igc_flow.h (100%)
 rename drivers/net/intel/{igc => e1000}/igc_logs.c (90%)
 rename drivers/net/intel/{igc => e1000}/igc_txrx.c (87%)
 rename drivers/net/intel/{igc => e1000}/igc_txrx.h (97%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 766d9f1979..4faf7ce537 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -789,6 +789,8 @@ F: doc/guides/nics/e1000em.rst
 F: doc/guides/nics/intel_vf.rst
 F: doc/guides/nics/features/e1000.ini
 F: doc/guides/nics/features/igb*.ini
+F: doc/guides/nics/features/igc.ini
+F: doc/guides/nics/igc.rst
 
 Intel ixgbe
 M: Anatoly Burakov <anatoly.burakov@intel.com>
@@ -846,12 +848,6 @@ F: drivers/net/intel/cpfl/
 F: doc/guides/nics/cpfl.rst
 F: doc/guides/nics/features/cpfl.ini
 
-Intel igc
-T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/intel/igc/
-F: doc/guides/nics/igc.rst
-F: doc/guides/nics/features/igc.ini
-
 Intel ipn3ke
 M: Rosen Xu <rosen.xu@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..9ae1a36c3a 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -25,6 +25,7 @@
 ;
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
+; SKIP_LIBRARY=librte_net_igc
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Experimental APIs exceptions ;
diff --git a/doc/guides/nics/igc.rst b/doc/guides/nics/igc.rst
index c5af806b7b..c267431c5f 100644
--- a/doc/guides/nics/igc.rst
+++ b/doc/guides/nics/igc.rst
@@ -4,7 +4,7 @@
 IGC Poll Mode Driver
 ======================
 
-The IGC PMD (**librte_net_igc**) provides poll mode driver support for Foxville
+The IGC PMD (**librte_net_e1000**) provides poll mode driver support for Foxville
 I225 and I226 Series Network Adapters.
 
 - For information about I225, please refer to: `Intel® Ethernet Controller I225 Series
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 269ab6f68a..341fdb9a37 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -93,6 +93,9 @@ New Features
 
   Added network driver for the Yunsilicon metaScale serials NICs.
 
+* **Updated Intel e1000 driver.**
+
+  * Added support for the Intel i225-series NICs (previously handled by net/igc).
 
 Removed Items
 -------------
@@ -126,12 +129,20 @@ API Changes
   ``__rte_packed_begin`` / ``__rte_packed_end``.
 
 * build: The Intel networking drivers:
-  cpfl, e1000, fm10k, i40e, iavf, ice, idpf, igc, ipn3ke and ixgbe,
+  cpfl, e1000, fm10k, i40e, iavf, ice, idpf, ipn3ke and ixgbe,
   have been moved from ``drivers/net`` to a new ``drivers/net/intel`` directory.
   The resulting build output, including the driver filenames, is the same,
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* build: The Intel IGC networking driver was merged with e1000 driver and is no
+  longer provided as a separate driver. The resulting build output will not have
+  the ``librte_net_igc.*`` driver files any more, but the ``librte_net_e1000.*``
+  driver files will provide support for all of the devices and features of the old
+  driver. In addition, to enable/disable the driver via Meson option, the path
+  has changed from ``-Denable_drivers=net/igc`` to
+  ``-Denable_drivers=net/intel/e1000``.
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/intel/e1000/base/README b/drivers/net/intel/e1000/base/README
index b84ee5ad6e..5d083a0e48 100644
--- a/drivers/net/intel/e1000/base/README
+++ b/drivers/net/intel/e1000/base/README
@@ -2,7 +2,7 @@
  * Copyright(c) 2010-2020 Intel Corporation
  */
 
-This directory contains source code of FreeBSD em & igb drivers of version
+This directory contains source code of FreeBSD em, igb, and igc drivers of version
 cid-gigabit.2020.06.05.tar.gz released by ND. The sub-directory of base/
 contains the original source package.
 
@@ -24,6 +24,8 @@ This driver is valid for the product(s) listed below
 * Intel® Ethernet Controller I350 Series
 * Intel® Ethernet Controller I210 Series
 * Intel® Ethernet Controller I211
+* Intel® Ethernet Controller I225
+* Intel® Ethernet Controller I226
 * Intel® Ethernet Controller I354 Series
 * Intel® Ethernet Controller DH89XXCC Series
 
diff --git a/drivers/net/intel/igc/igc_ethdev.c b/drivers/net/intel/e1000/igc_ethdev.c
similarity index 73%
rename from drivers/net/intel/igc/igc_ethdev.c
rename to drivers/net/intel/e1000/igc_ethdev.c
index 87d7f7caa0..5563cee09c 100644
--- a/drivers/net/intel/igc/igc_ethdev.c
+++ b/drivers/net/intel/e1000/igc_ethdev.c
@@ -13,7 +13,7 @@
 #include <rte_malloc.h>
 #include <rte_alarm.h>
 
-#include "igc_logs.h"
+#include "e1000_logs.h"
 #include "igc_txrx.h"
 #include "igc_filter.h"
 #include "igc_flow.h"
@@ -106,18 +106,18 @@ static const struct rte_eth_desc_lim tx_desc_lim = {
 };
 
 static const struct rte_pci_id pci_id_igc_map[] = {
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_LM) },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_LMVP) },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_V)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_I)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_IT)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I225_K)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_K)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_LMVP)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_LM)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_V)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_IT)  },
-	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, IGC_DEV_ID_I226_BLANK_NVM)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_LM) },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_LMVP) },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_V)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_I)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_IT)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I225_K)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_K)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_LMVP)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_LM)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_V)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_IT)  },
+	{ RTE_PCI_DEVICE(IGC_INTEL_VENDOR_ID, E1000_DEV_ID_I226_BLANK_NVM)  },
 	{ .vendor_id = 0, /* sentinel */ },
 };
 
@@ -128,64 +128,64 @@ struct rte_igc_xstats_name_off {
 };
 
 static const struct rte_igc_xstats_name_off rte_igc_stats_strings[] = {
-	{"rx_crc_errors", offsetof(struct igc_hw_stats, crcerrs)},
-	{"rx_align_errors", offsetof(struct igc_hw_stats, algnerrc)},
-	{"rx_errors", offsetof(struct igc_hw_stats, rxerrc)},
-	{"rx_missed_packets", offsetof(struct igc_hw_stats, mpc)},
-	{"tx_single_collision_packets", offsetof(struct igc_hw_stats, scc)},
-	{"tx_multiple_collision_packets", offsetof(struct igc_hw_stats, mcc)},
-	{"tx_excessive_collision_packets", offsetof(struct igc_hw_stats,
+	{"rx_crc_errors", offsetof(struct e1000_hw_stats, crcerrs)},
+	{"rx_align_errors", offsetof(struct e1000_hw_stats, algnerrc)},
+	{"rx_errors", offsetof(struct e1000_hw_stats, rxerrc)},
+	{"rx_missed_packets", offsetof(struct e1000_hw_stats, mpc)},
+	{"tx_single_collision_packets", offsetof(struct e1000_hw_stats, scc)},
+	{"tx_multiple_collision_packets", offsetof(struct e1000_hw_stats, mcc)},
+	{"tx_excessive_collision_packets", offsetof(struct e1000_hw_stats,
 		ecol)},
-	{"tx_late_collisions", offsetof(struct igc_hw_stats, latecol)},
-	{"tx_total_collisions", offsetof(struct igc_hw_stats, colc)},
-	{"tx_deferred_packets", offsetof(struct igc_hw_stats, dc)},
-	{"tx_no_carrier_sense_packets", offsetof(struct igc_hw_stats, tncrs)},
-	{"tx_discarded_packets", offsetof(struct igc_hw_stats, htdpmc)},
-	{"rx_length_errors", offsetof(struct igc_hw_stats, rlec)},
-	{"rx_xon_packets", offsetof(struct igc_hw_stats, xonrxc)},
-	{"tx_xon_packets", offsetof(struct igc_hw_stats, xontxc)},
-	{"rx_xoff_packets", offsetof(struct igc_hw_stats, xoffrxc)},
-	{"tx_xoff_packets", offsetof(struct igc_hw_stats, xofftxc)},
-	{"rx_flow_control_unsupported_packets", offsetof(struct igc_hw_stats,
+	{"tx_late_collisions", offsetof(struct e1000_hw_stats, latecol)},
+	{"tx_total_collisions", offsetof(struct e1000_hw_stats, colc)},
+	{"tx_deferred_packets", offsetof(struct e1000_hw_stats, dc)},
+	{"tx_no_carrier_sense_packets", offsetof(struct e1000_hw_stats, tncrs)},
+	{"tx_discarded_packets", offsetof(struct e1000_hw_stats, htdpmc)},
+	{"rx_length_errors", offsetof(struct e1000_hw_stats, rlec)},
+	{"rx_xon_packets", offsetof(struct e1000_hw_stats, xonrxc)},
+	{"tx_xon_packets", offsetof(struct e1000_hw_stats, xontxc)},
+	{"rx_xoff_packets", offsetof(struct e1000_hw_stats, xoffrxc)},
+	{"tx_xoff_packets", offsetof(struct e1000_hw_stats, xofftxc)},
+	{"rx_flow_control_unsupported_packets", offsetof(struct e1000_hw_stats,
 		fcruc)},
-	{"rx_size_64_packets", offsetof(struct igc_hw_stats, prc64)},
-	{"rx_size_65_to_127_packets", offsetof(struct igc_hw_stats, prc127)},
-	{"rx_size_128_to_255_packets", offsetof(struct igc_hw_stats, prc255)},
-	{"rx_size_256_to_511_packets", offsetof(struct igc_hw_stats, prc511)},
-	{"rx_size_512_to_1023_packets", offsetof(struct igc_hw_stats,
+	{"rx_size_64_packets", offsetof(struct e1000_hw_stats, prc64)},
+	{"rx_size_65_to_127_packets", offsetof(struct e1000_hw_stats, prc127)},
+	{"rx_size_128_to_255_packets", offsetof(struct e1000_hw_stats, prc255)},
+	{"rx_size_256_to_511_packets", offsetof(struct e1000_hw_stats, prc511)},
+	{"rx_size_512_to_1023_packets", offsetof(struct e1000_hw_stats,
 		prc1023)},
-	{"rx_size_1024_to_max_packets", offsetof(struct igc_hw_stats,
+	{"rx_size_1024_to_max_packets", offsetof(struct e1000_hw_stats,
 		prc1522)},
-	{"rx_broadcast_packets", offsetof(struct igc_hw_stats, bprc)},
-	{"rx_multicast_packets", offsetof(struct igc_hw_stats, mprc)},
-	{"rx_undersize_errors", offsetof(struct igc_hw_stats, ruc)},
-	{"rx_fragment_errors", offsetof(struct igc_hw_stats, rfc)},
-	{"rx_oversize_errors", offsetof(struct igc_hw_stats, roc)},
-	{"rx_jabber_errors", offsetof(struct igc_hw_stats, rjc)},
-	{"rx_no_buffers", offsetof(struct igc_hw_stats, rnbc)},
-	{"rx_management_packets", offsetof(struct igc_hw_stats, mgprc)},
-	{"rx_management_dropped", offsetof(struct igc_hw_stats, mgpdc)},
-	{"tx_management_packets", offsetof(struct igc_hw_stats, mgptc)},
-	{"rx_total_packets", offsetof(struct igc_hw_stats, tpr)},
-	{"tx_total_packets", offsetof(struct igc_hw_stats, tpt)},
-	{"rx_total_bytes", offsetof(struct igc_hw_stats, tor)},
-	{"tx_total_bytes", offsetof(struct igc_hw_stats, tot)},
-	{"tx_size_64_packets", offsetof(struct igc_hw_stats, ptc64)},
-	{"tx_size_65_to_127_packets", offsetof(struct igc_hw_stats, ptc127)},
-	{"tx_size_128_to_255_packets", offsetof(struct igc_hw_stats, ptc255)},
-	{"tx_size_256_to_511_packets", offsetof(struct igc_hw_stats, ptc511)},
-	{"tx_size_512_to_1023_packets", offsetof(struct igc_hw_stats,
+	{"rx_broadcast_packets", offsetof(struct e1000_hw_stats, bprc)},
+	{"rx_multicast_packets", offsetof(struct e1000_hw_stats, mprc)},
+	{"rx_undersize_errors", offsetof(struct e1000_hw_stats, ruc)},
+	{"rx_fragment_errors", offsetof(struct e1000_hw_stats, rfc)},
+	{"rx_oversize_errors", offsetof(struct e1000_hw_stats, roc)},
+	{"rx_jabber_errors", offsetof(struct e1000_hw_stats, rjc)},
+	{"rx_no_buffers", offsetof(struct e1000_hw_stats, rnbc)},
+	{"rx_management_packets", offsetof(struct e1000_hw_stats, mgprc)},
+	{"rx_management_dropped", offsetof(struct e1000_hw_stats, mgpdc)},
+	{"tx_management_packets", offsetof(struct e1000_hw_stats, mgptc)},
+	{"rx_total_packets", offsetof(struct e1000_hw_stats, tpr)},
+	{"tx_total_packets", offsetof(struct e1000_hw_stats, tpt)},
+	{"rx_total_bytes", offsetof(struct e1000_hw_stats, tor)},
+	{"tx_total_bytes", offsetof(struct e1000_hw_stats, tot)},
+	{"tx_size_64_packets", offsetof(struct e1000_hw_stats, ptc64)},
+	{"tx_size_65_to_127_packets", offsetof(struct e1000_hw_stats, ptc127)},
+	{"tx_size_128_to_255_packets", offsetof(struct e1000_hw_stats, ptc255)},
+	{"tx_size_256_to_511_packets", offsetof(struct e1000_hw_stats, ptc511)},
+	{"tx_size_512_to_1023_packets", offsetof(struct e1000_hw_stats,
 		ptc1023)},
-	{"tx_size_1023_to_max_packets", offsetof(struct igc_hw_stats,
+	{"tx_size_1023_to_max_packets", offsetof(struct e1000_hw_stats,
 		ptc1522)},
-	{"tx_multicast_packets", offsetof(struct igc_hw_stats, mptc)},
-	{"tx_broadcast_packets", offsetof(struct igc_hw_stats, bptc)},
-	{"tx_tso_packets", offsetof(struct igc_hw_stats, tsctc)},
-	{"rx_sent_to_host_packets", offsetof(struct igc_hw_stats, rpthc)},
-	{"tx_sent_by_host_packets", offsetof(struct igc_hw_stats, hgptc)},
-	{"interrupt_assert_count", offsetof(struct igc_hw_stats, iac)},
+	{"tx_multicast_packets", offsetof(struct e1000_hw_stats, mptc)},
+	{"tx_broadcast_packets", offsetof(struct e1000_hw_stats, bptc)},
+	{"tx_tso_packets", offsetof(struct e1000_hw_stats, tsctc)},
+	{"rx_sent_to_host_packets", offsetof(struct e1000_hw_stats, rpthc)},
+	{"tx_sent_by_host_packets", offsetof(struct e1000_hw_stats, hgptc)},
+	{"interrupt_assert_count", offsetof(struct e1000_hw_stats, iac)},
 	{"rx_descriptor_lower_threshold",
-		offsetof(struct igc_hw_stats, icrxdmtc)},
+		offsetof(struct e1000_hw_stats, icrxdmtc)},
 };
 
 #define IGC_NB_XSTATS (sizeof(rte_igc_stats_strings) / \
@@ -391,24 +391,24 @@ eth_igc_configure(struct rte_eth_dev *dev)
 static int
 eth_igc_set_link_up(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	if (hw->phy.media_type == igc_media_type_copper)
-		igc_power_up_phy(hw);
+	if (hw->phy.media_type == e1000_media_type_copper)
+		e1000_power_up_phy(hw);
 	else
-		igc_power_up_fiber_serdes_link(hw);
+		e1000_power_up_fiber_serdes_link(hw);
 	return 0;
 }
 
 static int
 eth_igc_set_link_down(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	if (hw->phy.media_type == igc_media_type_copper)
-		igc_power_down_phy(hw);
+	if (hw->phy.media_type == e1000_media_type_copper)
+		e1000_power_down_phy(hw);
 	else
-		igc_shutdown_fiber_serdes_link(hw);
+		e1000_shutdown_fiber_serdes_link(hw);
 	return 0;
 }
 
@@ -418,17 +418,17 @@ eth_igc_set_link_down(struct rte_eth_dev *dev)
 static void
 igc_intr_other_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 
 	if (rte_intr_allow_others(intr_handle) &&
 		dev->data->dev_conf.intr_conf.lsc) {
-		IGC_WRITE_REG(hw, IGC_EIMC, 1u << IGC_MSIX_OTHER_INTR_VEC);
+		E1000_WRITE_REG(hw, E1000_EIMC, 1u << IGC_MSIX_OTHER_INTR_VEC);
 	}
 
-	IGC_WRITE_REG(hw, IGC_IMC, ~0);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_IMC, ~0);
+	E1000_WRITE_FLUSH(hw);
 }
 
 /*
@@ -438,17 +438,17 @@ static inline void
 igc_intr_other_enable(struct rte_eth_dev *dev)
 {
 	struct igc_interrupt *intr = IGC_DEV_PRIVATE_INTR(dev);
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 
 	if (rte_intr_allow_others(intr_handle) &&
 		dev->data->dev_conf.intr_conf.lsc) {
-		IGC_WRITE_REG(hw, IGC_EIMS, 1u << IGC_MSIX_OTHER_INTR_VEC);
+		E1000_WRITE_REG(hw, E1000_EIMS, 1u << IGC_MSIX_OTHER_INTR_VEC);
 	}
 
-	IGC_WRITE_REG(hw, IGC_IMS, intr->mask);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_IMS, intr->mask);
+	E1000_WRITE_FLUSH(hw);
 }
 
 /*
@@ -459,14 +459,14 @@ static void
 eth_igc_interrupt_get_status(struct rte_eth_dev *dev)
 {
 	uint32_t icr;
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_interrupt *intr = IGC_DEV_PRIVATE_INTR(dev);
 
 	/* read-on-clear nic registers here */
-	icr = IGC_READ_REG(hw, IGC_ICR);
+	icr = E1000_READ_REG(hw, E1000_ICR);
 
 	intr->flags = 0;
-	if (icr & IGC_ICR_LSC)
+	if (icr & E1000_ICR_LSC)
 		intr->flags |= IGC_FLAG_NEED_LINK_UPDATE;
 }
 
@@ -474,7 +474,7 @@ eth_igc_interrupt_get_status(struct rte_eth_dev *dev)
 static int
 eth_igc_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_eth_link link;
 	int link_check, count;
 
@@ -485,20 +485,20 @@ eth_igc_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 	for (count = 0; count < IGC_LINK_UPDATE_CHECK_TIMEOUT; count++) {
 		/* Read the real link status */
 		switch (hw->phy.media_type) {
-		case igc_media_type_copper:
+		case e1000_media_type_copper:
 			/* Do the work to read phy */
-			igc_check_for_link(hw);
+			e1000_check_for_link(hw);
 			link_check = !hw->mac.get_link_status;
 			break;
 
-		case igc_media_type_fiber:
-			igc_check_for_link(hw);
-			link_check = (IGC_READ_REG(hw, IGC_STATUS) &
-				      IGC_STATUS_LU);
+		case e1000_media_type_fiber:
+			e1000_check_for_link(hw);
+			link_check = (E1000_READ_REG(hw, E1000_STATUS) &
+				      E1000_STATUS_LU);
 			break;
 
-		case igc_media_type_internal_serdes:
-			igc_check_for_link(hw);
+		case e1000_media_type_internal_serdes:
+			e1000_check_for_link(hw);
 			link_check = hw->mac.serdes_has_link;
 			break;
 
@@ -524,11 +524,11 @@ eth_igc_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 				RTE_ETH_LINK_SPEED_FIXED);
 
 		if (speed == SPEED_2500) {
-			uint32_t tipg = IGC_READ_REG(hw, IGC_TIPG);
-			if ((tipg & IGC_TIPG_IPGT_MASK) != 0x0b) {
-				tipg &= ~IGC_TIPG_IPGT_MASK;
+			uint32_t tipg = E1000_READ_REG(hw, E1000_TIPG);
+			if ((tipg & E1000_TIPG_IPGT_MASK) != 0x0b) {
+				tipg &= ~E1000_TIPG_IPGT_MASK;
 				tipg |= 0x0b;
-				IGC_WRITE_REG(hw, IGC_TIPG, tipg);
+				E1000_WRITE_REG(hw, E1000_TIPG, tipg);
 			}
 		}
 	} else {
@@ -622,24 +622,24 @@ igc_update_queue_stats_handler(void *param)
 static void
 eth_igc_rxtx_control(struct rte_eth_dev *dev, bool enable)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t tctl, rctl;
 
-	tctl = IGC_READ_REG(hw, IGC_TCTL);
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
+	tctl = E1000_READ_REG(hw, E1000_TCTL);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
 
 	if (enable) {
 		/* enable Tx/Rx */
-		tctl |= IGC_TCTL_EN;
-		rctl |= IGC_RCTL_EN;
+		tctl |= E1000_TCTL_EN;
+		rctl |= E1000_RCTL_EN;
 	} else {
 		/* disable Tx/Rx */
-		tctl &= ~IGC_TCTL_EN;
-		rctl &= ~IGC_RCTL_EN;
+		tctl &= ~E1000_TCTL_EN;
+		rctl &= ~E1000_RCTL_EN;
 	}
-	IGC_WRITE_REG(hw, IGC_TCTL, tctl);
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_TCTL, tctl);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
+	E1000_WRITE_FLUSH(hw);
 }
 
 /*
@@ -650,7 +650,7 @@ static int
 eth_igc_stop(struct rte_eth_dev *dev)
 {
 	struct igc_adapter *adapter = IGC_DEV_PRIVATE(dev);
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 	struct rte_eth_link link;
@@ -662,11 +662,11 @@ eth_igc_stop(struct rte_eth_dev *dev)
 	eth_igc_rxtx_control(dev, false);
 
 	/* disable all MSI-X interrupts */
-	IGC_WRITE_REG(hw, IGC_EIMC, 0x1f);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_EIMC, 0x1f);
+	E1000_WRITE_FLUSH(hw);
 
 	/* clear all MSI-X interrupts */
-	IGC_WRITE_REG(hw, IGC_EICR, 0x1f);
+	E1000_WRITE_REG(hw, E1000_EICR, 0x1f);
 
 	igc_intr_other_disable(dev);
 
@@ -675,17 +675,17 @@ eth_igc_stop(struct rte_eth_dev *dev)
 	/* disable intr eventfd mapping */
 	rte_intr_disable(intr_handle);
 
-	igc_reset_hw(hw);
+	e1000_reset_hw(hw);
 
 	/* disable all wake up */
-	IGC_WRITE_REG(hw, IGC_WUC, 0);
+	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
 	/* disable checking EEE operation in MAC loopback mode */
-	igc_read_reg_check_clear_bits(hw, IGC_EEER, IGC_EEER_EEE_FRC_AN);
+	igc_read_reg_check_clear_bits(hw, E1000_EEER, IGC_EEER_EEE_FRC_AN);
 
 	/* Set bit for Go Link disconnect */
-	igc_read_reg_check_set_bits(hw, IGC_82580_PHY_POWER_MGMT,
-			IGC_82580_PM_GO_LINKD);
+	igc_read_reg_check_set_bits(hw, E1000_82580_PHY_POWER_MGMT,
+			E1000_82580_PM_GO_LINKD);
 
 	/* Power down the phy. Needed to make the link go Down */
 	eth_igc_set_link_down(dev);
@@ -721,7 +721,7 @@ eth_igc_stop(struct rte_eth_dev *dev)
  *  msix-vector, valid 0,1,2,3,4
  */
 static void
-igc_write_ivar(struct igc_hw *hw, uint8_t queue_index,
+igc_write_ivar(struct e1000_hw *hw, uint8_t queue_index,
 		bool tx, uint8_t msix_vector)
 {
 	uint8_t offset = 0;
@@ -744,15 +744,15 @@ igc_write_ivar(struct igc_hw *hw, uint8_t queue_index,
 	if (queue_index & 1)
 		offset += 16;
 
-	val = IGC_READ_REG_ARRAY(hw, IGC_IVAR0, reg_index);
+	val = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, reg_index);
 
 	/* clear bits */
 	val &= ~((uint32_t)0xFF << offset);
 
 	/* write vector and valid bit */
-	val |= (uint32_t)(msix_vector | IGC_IVAR_VALID) << offset;
+	val |= (uint32_t)(msix_vector | E1000_IVAR_VALID) << offset;
 
-	IGC_WRITE_REG_ARRAY(hw, IGC_IVAR0, reg_index, val);
+	E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, reg_index, val);
 }
 
 /* Sets up the hardware to generate MSI-X interrupts properly
@@ -762,7 +762,7 @@ igc_write_ivar(struct igc_hw *hw, uint8_t queue_index,
 static void
 igc_configure_msix_intr(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 
@@ -785,9 +785,9 @@ igc_configure_msix_intr(struct rte_eth_dev *dev)
 	}
 
 	/* turn on MSI-X capability first */
-	IGC_WRITE_REG(hw, IGC_GPIE, IGC_GPIE_MSIX_MODE |
-				IGC_GPIE_PBA | IGC_GPIE_EIAME |
-				IGC_GPIE_NSICR);
+	E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE |
+				E1000_GPIE_PBA | E1000_GPIE_EIAME |
+				E1000_GPIE_NSICR);
 
 	nb_efd = rte_intr_nb_efd_get(intr_handle);
 	if (nb_efd < 0)
@@ -799,14 +799,14 @@ igc_configure_msix_intr(struct rte_eth_dev *dev)
 		intr_mask |= (1u << IGC_MSIX_OTHER_INTR_VEC);
 
 	/* enable msix auto-clear */
-	igc_read_reg_check_set_bits(hw, IGC_EIAC, intr_mask);
+	igc_read_reg_check_set_bits(hw, E1000_EIAC, intr_mask);
 
 	/* set other cause interrupt vector */
-	igc_read_reg_check_set_bits(hw, IGC_IVAR_MISC,
-		(uint32_t)(IGC_MSIX_OTHER_INTR_VEC | IGC_IVAR_VALID) << 8);
+	igc_read_reg_check_set_bits(hw, E1000_IVAR_MISC,
+		(uint32_t)(IGC_MSIX_OTHER_INTR_VEC | E1000_IVAR_VALID) << 8);
 
 	/* enable auto-mask */
-	igc_read_reg_check_set_bits(hw, IGC_EIAM, intr_mask);
+	igc_read_reg_check_set_bits(hw, E1000_EIAM, intr_mask);
 
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
 		igc_write_ivar(hw, i, 0, vec);
@@ -815,7 +815,7 @@ igc_configure_msix_intr(struct rte_eth_dev *dev)
 			vec++;
 	}
 
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_FLUSH(hw);
 }
 
 /**
@@ -832,9 +832,9 @@ igc_lsc_interrupt_setup(struct rte_eth_dev *dev, uint8_t on)
 	struct igc_interrupt *intr = IGC_DEV_PRIVATE_INTR(dev);
 
 	if (on)
-		intr->mask |= IGC_ICR_LSC;
+		intr->mask |= E1000_ICR_LSC;
 	else
-		intr->mask &= ~IGC_ICR_LSC;
+		intr->mask &= ~E1000_ICR_LSC;
 }
 
 /*
@@ -845,7 +845,7 @@ static void
 igc_rxq_interrupt_setup(struct rte_eth_dev *dev)
 {
 	uint32_t mask;
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 	int misc_shift = rte_intr_allow_others(intr_handle) ? 1 : 0;
@@ -862,16 +862,16 @@ igc_rxq_interrupt_setup(struct rte_eth_dev *dev)
 		return;
 
 	mask = RTE_LEN2MASK(nb_efd, uint32_t) << misc_shift;
-	IGC_WRITE_REG(hw, IGC_EIMS, mask);
+	E1000_WRITE_REG(hw, E1000_EIMS, mask);
 }
 
 /*
  *  Get hardware rx-buffer size.
  */
 static inline int
-igc_get_rx_buffer_size(struct igc_hw *hw)
+igc_get_rx_buffer_size(struct e1000_hw *hw)
 {
-	return (IGC_READ_REG(hw, IGC_RXPBS) & 0x3f) << 10;
+	return (E1000_READ_REG(hw, E1000_RXPBS) & 0x3f) << 10;
 }
 
 /*
@@ -880,13 +880,13 @@ igc_get_rx_buffer_size(struct igc_hw *hw)
  * that the driver is loaded.
  */
 static void
-igc_hw_control_acquire(struct igc_hw *hw)
+igc_hw_control_acquire(struct e1000_hw *hw)
 {
 	uint32_t ctrl_ext;
 
 	/* Let firmware know the driver has taken over */
-	ctrl_ext = IGC_READ_REG(hw, IGC_CTRL_EXT);
-	IGC_WRITE_REG(hw, IGC_CTRL_EXT, ctrl_ext | IGC_CTRL_EXT_DRV_LOAD);
+	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
+	E1000_WRITE_REG(hw, E1000_CTRL_EXT, ctrl_ext | E1000_CTRL_EXT_DRV_LOAD);
 }
 
 /*
@@ -895,18 +895,18 @@ igc_hw_control_acquire(struct igc_hw *hw)
  * driver is no longer loaded.
  */
 static void
-igc_hw_control_release(struct igc_hw *hw)
+igc_hw_control_release(struct e1000_hw *hw)
 {
 	uint32_t ctrl_ext;
 
 	/* Let firmware taken over control of h/w */
-	ctrl_ext = IGC_READ_REG(hw, IGC_CTRL_EXT);
-	IGC_WRITE_REG(hw, IGC_CTRL_EXT,
-			ctrl_ext & ~IGC_CTRL_EXT_DRV_LOAD);
+	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
+	E1000_WRITE_REG(hw, E1000_CTRL_EXT,
+			ctrl_ext & ~E1000_CTRL_EXT_DRV_LOAD);
 }
 
 static int
-igc_hardware_init(struct igc_hw *hw)
+igc_hardware_init(struct e1000_hw *hw)
 {
 	uint32_t rx_buf_size;
 	int diag;
@@ -915,10 +915,10 @@ igc_hardware_init(struct igc_hw *hw)
 	igc_hw_control_acquire(hw);
 
 	/* Issue a global reset */
-	igc_reset_hw(hw);
+	e1000_reset_hw(hw);
 
 	/* disable all wake up */
-	IGC_WRITE_REG(hw, IGC_WUC, 0);
+	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
 	/*
 	 * Hardware flow control
@@ -937,14 +937,14 @@ igc_hardware_init(struct igc_hw *hw)
 	hw->fc.low_water = hw->fc.high_water - 1500;
 	hw->fc.pause_time = IGC_FC_PAUSE_TIME;
 	hw->fc.send_xon = 1;
-	hw->fc.requested_mode = igc_fc_full;
+	hw->fc.requested_mode = e1000_fc_full;
 
-	diag = igc_init_hw(hw);
+	diag = e1000_init_hw(hw);
 	if (diag < 0)
 		return diag;
 
-	igc_get_phy_info(hw);
-	igc_check_for_link(hw);
+	e1000_get_phy_info(hw);
+	e1000_check_for_link(hw);
 
 	return 0;
 }
@@ -952,7 +952,7 @@ igc_hardware_init(struct igc_hw *hw)
 static int
 eth_igc_start(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *adapter = IGC_DEV_PRIVATE(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
@@ -967,11 +967,11 @@ eth_igc_start(struct rte_eth_dev *dev)
 	PMD_INIT_FUNC_TRACE();
 
 	/* disable all MSI-X interrupts */
-	IGC_WRITE_REG(hw, IGC_EIMC, 0x1f);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_EIMC, 0x1f);
+	E1000_WRITE_FLUSH(hw);
 
 	/* clear all MSI-X interrupts */
-	IGC_WRITE_REG(hw, IGC_EICR, 0x1f);
+	E1000_WRITE_REG(hw, E1000_EICR, 0x1f);
 
 	/* disable uio/vfio intr/eventfd mapping */
 	if (!adapter->stopped)
@@ -981,7 +981,7 @@ eth_igc_start(struct rte_eth_dev *dev)
 	eth_igc_set_link_up(dev);
 
 	/* Put the address into the Receive Address Array */
-	igc_rar_set(hw, hw->mac.addr, 0);
+	e1000_rar_set(hw, hw->mac.addr, 0);
 
 	/* Initialize the hardware */
 	if (igc_hardware_init(hw)) {
@@ -1025,36 +1025,36 @@ eth_igc_start(struct rte_eth_dev *dev)
 		adapter->base_time = 0;
 		adapter->cycle_time = NSEC_PER_SEC;
 
-		IGC_WRITE_REG(hw, IGC_TSSDP, 0);
-		IGC_WRITE_REG(hw, IGC_TSIM, TSINTR_TXTS);
-		IGC_WRITE_REG(hw, IGC_IMS, IGC_ICR_TS);
+		E1000_WRITE_REG(hw, E1000_TSSDP, 0);
+		E1000_WRITE_REG(hw, E1000_TSIM, TSINTR_TXTS);
+		E1000_WRITE_REG(hw, E1000_IMS, E1000_ICR_TS);
 
-		IGC_WRITE_REG(hw, IGC_TSAUXC, 0);
-		IGC_WRITE_REG(hw, IGC_I350_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
-		IGC_WRITE_REG(hw, IGC_TXPBS, IGC_TXPBSIZE_TSN);
+		E1000_WRITE_REG(hw, E1000_TSAUXC, 0);
+		E1000_WRITE_REG(hw, E1000_I350_DTXMXPKTSZ, E1000_DTXMXPKTSZ_TSN);
+		E1000_WRITE_REG(hw, E1000_TXPBS, E1000_TXPBSIZE_TSN);
 
-		tqavctrl = IGC_READ_REG(hw, IGC_I210_TQAVCTRL);
-		tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
-			    IGC_TQAVCTRL_ENHANCED_QAV;
-		IGC_WRITE_REG(hw, IGC_I210_TQAVCTRL, tqavctrl);
+		tqavctrl = E1000_READ_REG(hw, E1000_I210_TQAVCTRL);
+		tqavctrl |= E1000_TQAVCTRL_TRANSMIT_MODE_TSN |
+			    E1000_TQAVCTRL_ENHANCED_QAV;
+		E1000_WRITE_REG(hw, E1000_I210_TQAVCTRL, tqavctrl);
 
-		IGC_WRITE_REG(hw, IGC_QBVCYCLET_S, adapter->cycle_time);
-		IGC_WRITE_REG(hw, IGC_QBVCYCLET, adapter->cycle_time);
+		E1000_WRITE_REG(hw, E1000_QBVCYCLET_S, adapter->cycle_time);
+		E1000_WRITE_REG(hw, E1000_QBVCYCLET, adapter->cycle_time);
 
 		for (i = 0; i < dev->data->nb_tx_queues; i++) {
-			IGC_WRITE_REG(hw, IGC_STQT(i), 0);
-			IGC_WRITE_REG(hw, IGC_ENDQT(i), NSEC_PER_SEC);
+			E1000_WRITE_REG(hw, E1000_STQT(i), 0);
+			E1000_WRITE_REG(hw, E1000_ENDQT(i), NSEC_PER_SEC);
 
-			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
-			IGC_WRITE_REG(hw, IGC_TXQCTL(i), txqctl);
+			txqctl |= E1000_TXQCTL_QUEUE_MODE_LAUNCHT;
+			E1000_WRITE_REG(hw, E1000_TXQCTL(i), txqctl);
 		}
 
 		clock_gettime(CLOCK_REALTIME, &system_time);
-		IGC_WRITE_REG(hw, IGC_SYSTIML, system_time.tv_nsec);
-		IGC_WRITE_REG(hw, IGC_SYSTIMH, system_time.tv_sec);
+		E1000_WRITE_REG(hw, E1000_SYSTIML, system_time.tv_nsec);
+		E1000_WRITE_REG(hw, E1000_SYSTIMH, system_time.tv_sec);
 
-		nsec = IGC_READ_REG(hw, IGC_SYSTIML);
-		sec = IGC_READ_REG(hw, IGC_SYSTIMH);
+		nsec = E1000_READ_REG(hw, E1000_SYSTIML);
+		sec = E1000_READ_REG(hw, E1000_SYSTIMH);
 		systime = (int64_t)sec * NSEC_PER_SEC + (int64_t)nsec;
 
 		if (systime > adapter->base_time) {
@@ -1066,11 +1066,11 @@ eth_igc_start(struct rte_eth_dev *dev)
 
 		baset_h = adapter->base_time / NSEC_PER_SEC;
 		baset_l = adapter->base_time % NSEC_PER_SEC;
-		IGC_WRITE_REG(hw, IGC_BASET_H, baset_h);
-		IGC_WRITE_REG(hw, IGC_BASET_L, baset_l);
+		E1000_WRITE_REG(hw, E1000_BASET_H, baset_h);
+		E1000_WRITE_REG(hw, E1000_BASET_L, baset_l);
 	}
 
-	igc_clear_hw_cntrs_base_generic(hw);
+	e1000_clear_hw_cntrs_base_generic(hw);
 
 	/* VLAN Offload Settings */
 	eth_igc_vlan_offload_set(dev,
@@ -1080,7 +1080,7 @@ eth_igc_start(struct rte_eth_dev *dev)
 	/* Setup link speed and duplex */
 	speeds = &dev->data->dev_conf.link_speeds;
 	if (*speeds == RTE_ETH_LINK_SPEED_AUTONEG) {
-		hw->phy.autoneg_advertised = IGC_ALL_SPEED_DUPLEX_2500;
+		hw->phy.autoneg_advertised = E1000_ALL_SPEED_DUPLEX_2500;
 		hw->mac.autoneg = 1;
 	} else {
 		int num_speeds = 0;
@@ -1129,7 +1129,7 @@ eth_igc_start(struct rte_eth_dev *dev)
 			goto error_invalid_config;
 	}
 
-	igc_setup_link(hw);
+	e1000_setup_link(hw);
 
 	if (rte_intr_allow_others(intr_handle)) {
 		/* check if lsc interrupt is enabled */
@@ -1167,13 +1167,13 @@ eth_igc_start(struct rte_eth_dev *dev)
 	if (dev->data->dev_conf.lpbk_mode == 1) {
 		uint32_t reg_val;
 
-		reg_val = IGC_READ_REG(hw, IGC_CTRL);
+		reg_val = E1000_READ_REG(hw, E1000_CTRL);
 		reg_val &= ~IGC_CTRL_SPEED_MASK;
-		reg_val |= IGC_CTRL_SLU | IGC_CTRL_FRCSPD |
-			IGC_CTRL_FRCDPX | IGC_CTRL_FD | IGC_CTRL_SPEED_2500;
-		IGC_WRITE_REG(hw, IGC_CTRL, reg_val);
+		reg_val |= E1000_CTRL_SLU | E1000_CTRL_FRCSPD |
+			E1000_CTRL_FRCDPX | E1000_CTRL_FD | IGC_CTRL_SPEED_2500;
+		E1000_WRITE_REG(hw, E1000_CTRL, reg_val);
 
-		igc_read_reg_check_set_bits(hw, IGC_EEER, IGC_EEER_EEE_FRC_AN);
+		igc_read_reg_check_set_bits(hw, E1000_EEER, IGC_EEER_EEE_FRC_AN);
 	}
 
 	return 0;
@@ -1186,7 +1186,7 @@ eth_igc_start(struct rte_eth_dev *dev)
 }
 
 static int
-igc_reset_swfw_lock(struct igc_hw *hw)
+igc_reset_swfw_lock(struct e1000_hw *hw)
 {
 	int ret_val;
 
@@ -1194,7 +1194,7 @@ igc_reset_swfw_lock(struct igc_hw *hw)
 	 * Do mac ops initialization manually here, since we will need
 	 * some function pointers set by this call.
 	 */
-	ret_val = igc_init_mac_params(hw);
+	ret_val = e1000_init_mac_params(hw);
 	if (ret_val)
 		return ret_val;
 
@@ -1203,10 +1203,10 @@ igc_reset_swfw_lock(struct igc_hw *hw)
 	 * it is due to an improper exit of the application.
 	 * So force the release of the faulty lock.
 	 */
-	if (igc_get_hw_semaphore_generic(hw) < 0)
+	if (e1000_get_hw_semaphore_generic(hw) < 0)
 		PMD_DRV_LOG(DEBUG, "SMBI lock released");
 
-	igc_put_hw_semaphore_generic(hw);
+	e1000_put_hw_semaphore_generic(hw);
 
 	if (hw->mac.ops.acquire_swfw_sync != NULL) {
 		uint16_t mask;
@@ -1216,7 +1216,7 @@ igc_reset_swfw_lock(struct igc_hw *hw)
 		 * If this is the case, it is due to an improper exit of the
 		 * application. So force the release of the faulty lock.
 		 */
-		mask = IGC_SWFW_PHY0_SM;
+		mask = E1000_SWFW_PHY0_SM;
 		if (hw->mac.ops.acquire_swfw_sync(hw, mask) < 0) {
 			PMD_DRV_LOG(DEBUG, "SWFW phy%d lock released",
 				    hw->bus.func);
@@ -1229,14 +1229,14 @@ igc_reset_swfw_lock(struct igc_hw *hw)
 		 * that if lock can not be taken it is due to an improper lock
 		 * of the semaphore.
 		 */
-		mask = IGC_SWFW_EEP_SM;
+		mask = E1000_SWFW_EEP_SM;
 		if (hw->mac.ops.acquire_swfw_sync(hw, mask) < 0)
 			PMD_DRV_LOG(DEBUG, "SWFW common locks released");
 
 		hw->mac.ops.release_swfw_sync(hw, mask);
 	}
 
-	return IGC_SUCCESS;
+	return E1000_SUCCESS;
 }
 
 /*
@@ -1265,7 +1265,7 @@ eth_igc_close(struct rte_eth_dev *dev)
 {
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *adapter = IGC_DEV_PRIVATE(dev);
 	int retry = 0;
 	int ret = 0;
@@ -1291,7 +1291,7 @@ eth_igc_close(struct rte_eth_dev *dev)
 		DELAY(200 * 1000); /* delay 200ms */
 	} while (retry++ < 5);
 
-	igc_phy_hw_reset(hw);
+	e1000_phy_hw_reset(hw);
 	igc_hw_control_release(hw);
 	igc_dev_free_queues(dev);
 
@@ -1304,7 +1304,7 @@ eth_igc_close(struct rte_eth_dev *dev)
 static void
 igc_identify_hardware(struct rte_eth_dev *dev, struct rte_pci_device *pci_dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
 	hw->vendor_id = pci_dev->id.vendor_id;
 	hw->device_id = pci_dev->id.device_id;
@@ -1317,7 +1317,7 @@ eth_igc_dev_init(struct rte_eth_dev *dev)
 {
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	int i, error = 0;
 
 	PMD_INIT_FUNC_TRACE();
@@ -1348,50 +1348,50 @@ eth_igc_dev_init(struct rte_eth_dev *dev)
 	hw->hw_addr = (void *)pci_dev->mem_resource[0].addr;
 
 	igc_identify_hardware(dev, pci_dev);
-	if (igc_setup_init_funcs(hw, false) != IGC_SUCCESS) {
+	if (e1000_setup_init_funcs(hw, false) != E1000_SUCCESS) {
 		error = -EIO;
 		goto err_late;
 	}
 
-	igc_get_bus_info(hw);
+	e1000_get_bus_info(hw);
 
 	/* Reset any pending lock */
-	if (igc_reset_swfw_lock(hw) != IGC_SUCCESS) {
+	if (igc_reset_swfw_lock(hw) != E1000_SUCCESS) {
 		error = -EIO;
 		goto err_late;
 	}
 
 	/* Finish initialization */
-	if (igc_setup_init_funcs(hw, true) != IGC_SUCCESS) {
+	if (e1000_setup_init_funcs(hw, true) != E1000_SUCCESS) {
 		error = -EIO;
 		goto err_late;
 	}
 
 	hw->mac.autoneg = 1;
 	hw->phy.autoneg_wait_to_complete = 0;
-	hw->phy.autoneg_advertised = IGC_ALL_SPEED_DUPLEX_2500;
+	hw->phy.autoneg_advertised = E1000_ALL_SPEED_DUPLEX_2500;
 
 	/* Copper options */
-	if (hw->phy.media_type == igc_media_type_copper) {
+	if (hw->phy.media_type == e1000_media_type_copper) {
 		hw->phy.mdix = 0; /* AUTO_ALL_MODES */
 		hw->phy.disable_polarity_correction = 0;
-		hw->phy.ms_type = igc_ms_hw_default;
+		hw->phy.ms_type = e1000_ms_hw_default;
 	}
 
 	/*
 	 * Start from a known state, this is important in reading the nvm
 	 * and mac from that.
 	 */
-	igc_reset_hw(hw);
+	e1000_reset_hw(hw);
 
 	/* Make sure we have a good EEPROM before we read from it */
-	if (igc_validate_nvm_checksum(hw) < 0) {
+	if (e1000_validate_nvm_checksum(hw) < 0) {
 		/*
 		 * Some PCI-E parts fail the first check due to
 		 * the link being in sleep state, call it again,
 		 * if it fails a second time its a real issue.
 		 */
-		if (igc_validate_nvm_checksum(hw) < 0) {
+		if (e1000_validate_nvm_checksum(hw) < 0) {
 			PMD_INIT_LOG(ERR, "EEPROM checksum invalid");
 			error = -EIO;
 			goto err_late;
@@ -1399,7 +1399,7 @@ eth_igc_dev_init(struct rte_eth_dev *dev)
 	}
 
 	/* Read the permanent MAC address out of the EEPROM */
-	if (igc_read_mac_addr(hw) != 0) {
+	if (e1000_read_mac_addr(hw) != 0) {
 		PMD_INIT_LOG(ERR, "EEPROM error while reading MAC address");
 		error = -EIO;
 		goto err_late;
@@ -1432,7 +1432,7 @@ eth_igc_dev_init(struct rte_eth_dev *dev)
 	igc->stopped = 0;
 
 	/* Indicate SOL/IDER usage */
-	if (igc_check_reset_block(hw) < 0)
+	if (e1000_check_reset_block(hw) < 0)
 		PMD_INIT_LOG(ERR,
 			"PHY reset is blocked due to SOL/IDER session.");
 
@@ -1489,55 +1489,55 @@ eth_igc_reset(struct rte_eth_dev *dev)
 static int
 eth_igc_promiscuous_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t rctl;
 
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
-	rctl |= (IGC_RCTL_UPE | IGC_RCTL_MPE);
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
+	rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 	return 0;
 }
 
 static int
 eth_igc_promiscuous_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t rctl;
 
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
-	rctl &= (~IGC_RCTL_UPE);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
+	rctl &= (~E1000_RCTL_UPE);
 	if (dev->data->all_multicast == 1)
-		rctl |= IGC_RCTL_MPE;
+		rctl |= E1000_RCTL_MPE;
 	else
-		rctl &= (~IGC_RCTL_MPE);
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+		rctl &= (~E1000_RCTL_MPE);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 	return 0;
 }
 
 static int
 eth_igc_allmulticast_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t rctl;
 
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
-	rctl |= IGC_RCTL_MPE;
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
+	rctl |= E1000_RCTL_MPE;
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 	return 0;
 }
 
 static int
 eth_igc_allmulticast_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t rctl;
 
 	if (dev->data->promiscuous == 1)
 		return 0;	/* must remain in all_multicast mode */
 
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
-	rctl &= (~IGC_RCTL_MPE);
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
+	rctl &= (~E1000_RCTL_MPE);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 	return 0;
 }
 
@@ -1545,11 +1545,11 @@ static int
 eth_igc_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
 		       size_t fw_size)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	struct igc_fw_version fw;
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_fw_version fw;
 	int ret;
 
-	igc_get_fw_version(hw, &fw);
+	e1000_get_fw_version(hw, &fw);
 
 	/* if option rom is valid, display its version too */
 	if (fw.or_valid) {
@@ -1584,7 +1584,7 @@ eth_igc_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
 static int
 eth_igc_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
 	dev_info->min_rx_bufsize = 256; /* See BSIZE field of RCTL register. */
 	dev_info->max_rx_pktlen = MAX_RX_JUMBO_FRAME_SIZE;
@@ -1637,17 +1637,17 @@ eth_igc_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 static int
 eth_igc_led_on(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	return igc_led_on(hw) == IGC_SUCCESS ? 0 : -ENOTSUP;
+	return e1000_led_on(hw) == E1000_SUCCESS ? 0 : -ENOTSUP;
 }
 
 static int
 eth_igc_led_off(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	return igc_led_off(hw) == IGC_SUCCESS ? 0 : -ENOTSUP;
+	return e1000_led_off(hw) == E1000_SUCCESS ? 0 : -ENOTSUP;
 }
 
 static const uint32_t *
@@ -1678,12 +1678,12 @@ eth_igc_supported_ptypes_get(__rte_unused struct rte_eth_dev *dev,
 static int
 eth_igc_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t frame_size = mtu + IGC_ETH_OVERHEAD;
 	uint32_t rctl;
 
 	/* if extend vlan has been enabled */
-	if (IGC_READ_REG(hw, IGC_CTRL_EXT) & IGC_CTRL_EXT_EXT_VLAN)
+	if (E1000_READ_REG(hw, E1000_CTRL_EXT) & IGC_CTRL_EXT_EXT_VLAN)
 		frame_size += VLAN_TAG_SIZE;
 
 	/*
@@ -1696,14 +1696,14 @@ eth_igc_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
 		return -EINVAL;
 	}
 
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
 	if (mtu > RTE_ETHER_MTU)
-		rctl |= IGC_RCTL_LPE;
+		rctl |= E1000_RCTL_LPE;
 	else
-		rctl &= ~IGC_RCTL_LPE;
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+		rctl &= ~E1000_RCTL_LPE;
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 
-	IGC_WRITE_REG(hw, IGC_RLPML, frame_size);
+	E1000_WRITE_REG(hw, E1000_RLPML, frame_size);
 
 	return 0;
 }
@@ -1712,9 +1712,9 @@ static int
 eth_igc_rar_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr,
 		uint32_t index, uint32_t pool)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	igc_rar_set(hw, mac_addr->addr_bytes, index);
+	e1000_rar_set(hw, mac_addr->addr_bytes, index);
 	RTE_SET_USED(pool);
 	return 0;
 }
@@ -1723,18 +1723,18 @@ static void
 eth_igc_rar_clear(struct rte_eth_dev *dev, uint32_t index)
 {
 	uint8_t addr[RTE_ETHER_ADDR_LEN];
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
 	memset(addr, 0, sizeof(addr));
-	igc_rar_set(hw, addr, index);
+	e1000_rar_set(hw, addr, index);
 }
 
 static int
 eth_igc_default_mac_addr_set(struct rte_eth_dev *dev,
 			struct rte_ether_addr *addr)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	igc_rar_set(hw, addr->addr_bytes, 0);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	e1000_rar_set(hw, addr->addr_bytes, 0);
 	return 0;
 }
 
@@ -1743,8 +1743,8 @@ eth_igc_set_mc_addr_list(struct rte_eth_dev *dev,
 			 struct rte_ether_addr *mc_addr_set,
 			 uint32_t nb_mc_addr)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	igc_update_mc_addr_list(hw, (u8 *)mc_addr_set, nb_mc_addr);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	e1000_update_mc_addr_list(hw, (u8 *)mc_addr_set, nb_mc_addr);
 	return 0;
 }
 
@@ -1752,7 +1752,7 @@ eth_igc_set_mc_addr_list(struct rte_eth_dev *dev,
  * Read hardware registers
  */
 static void
-igc_read_stats_registers(struct igc_hw *hw, struct igc_hw_stats *stats)
+igc_read_stats_registers(struct e1000_hw *hw, struct e1000_hw_stats *stats)
 {
 	int pause_frames;
 
@@ -1763,119 +1763,119 @@ igc_read_stats_registers(struct igc_hw *hw, struct igc_hw_stats *stats)
 	uint64_t old_rpthc = stats->rpthc;
 	uint64_t old_hgptc = stats->hgptc;
 
-	stats->crcerrs += IGC_READ_REG(hw, IGC_CRCERRS);
-	stats->algnerrc += IGC_READ_REG(hw, IGC_ALGNERRC);
-	stats->rxerrc += IGC_READ_REG(hw, IGC_RXERRC);
-	stats->mpc += IGC_READ_REG(hw, IGC_MPC);
-	stats->scc += IGC_READ_REG(hw, IGC_SCC);
-	stats->ecol += IGC_READ_REG(hw, IGC_ECOL);
+	stats->crcerrs += E1000_READ_REG(hw, E1000_CRCERRS);
+	stats->algnerrc += E1000_READ_REG(hw, E1000_ALGNERRC);
+	stats->rxerrc += E1000_READ_REG(hw, E1000_RXERRC);
+	stats->mpc += E1000_READ_REG(hw, E1000_MPC);
+	stats->scc += E1000_READ_REG(hw, E1000_SCC);
+	stats->ecol += E1000_READ_REG(hw, E1000_ECOL);
 
-	stats->mcc += IGC_READ_REG(hw, IGC_MCC);
-	stats->latecol += IGC_READ_REG(hw, IGC_LATECOL);
-	stats->colc += IGC_READ_REG(hw, IGC_COLC);
+	stats->mcc += E1000_READ_REG(hw, E1000_MCC);
+	stats->latecol += E1000_READ_REG(hw, E1000_LATECOL);
+	stats->colc += E1000_READ_REG(hw, E1000_COLC);
 
-	stats->dc += IGC_READ_REG(hw, IGC_DC);
-	stats->tncrs += IGC_READ_REG(hw, IGC_TNCRS);
-	stats->htdpmc += IGC_READ_REG(hw, IGC_HTDPMC);
-	stats->rlec += IGC_READ_REG(hw, IGC_RLEC);
-	stats->xonrxc += IGC_READ_REG(hw, IGC_XONRXC);
-	stats->xontxc += IGC_READ_REG(hw, IGC_XONTXC);
+	stats->dc += E1000_READ_REG(hw, E1000_DC);
+	stats->tncrs += E1000_READ_REG(hw, E1000_TNCRS);
+	stats->htdpmc += E1000_READ_REG(hw, E1000_HTDPMC);
+	stats->rlec += E1000_READ_REG(hw, E1000_RLEC);
+	stats->xonrxc += E1000_READ_REG(hw, E1000_XONRXC);
+	stats->xontxc += E1000_READ_REG(hw, E1000_XONTXC);
 
 	/*
 	 * For watchdog management we need to know if we have been
 	 * paused during the last interval, so capture that here.
 	 */
-	pause_frames = IGC_READ_REG(hw, IGC_XOFFRXC);
+	pause_frames = E1000_READ_REG(hw, E1000_XOFFRXC);
 	stats->xoffrxc += pause_frames;
-	stats->xofftxc += IGC_READ_REG(hw, IGC_XOFFTXC);
-	stats->fcruc += IGC_READ_REG(hw, IGC_FCRUC);
-	stats->prc64 += IGC_READ_REG(hw, IGC_PRC64);
-	stats->prc127 += IGC_READ_REG(hw, IGC_PRC127);
-	stats->prc255 += IGC_READ_REG(hw, IGC_PRC255);
-	stats->prc511 += IGC_READ_REG(hw, IGC_PRC511);
-	stats->prc1023 += IGC_READ_REG(hw, IGC_PRC1023);
-	stats->prc1522 += IGC_READ_REG(hw, IGC_PRC1522);
-	stats->gprc += IGC_READ_REG(hw, IGC_GPRC);
-	stats->bprc += IGC_READ_REG(hw, IGC_BPRC);
-	stats->mprc += IGC_READ_REG(hw, IGC_MPRC);
-	stats->gptc += IGC_READ_REG(hw, IGC_GPTC);
+	stats->xofftxc += E1000_READ_REG(hw, E1000_XOFFTXC);
+	stats->fcruc += E1000_READ_REG(hw, E1000_FCRUC);
+	stats->prc64 += E1000_READ_REG(hw, E1000_PRC64);
+	stats->prc127 += E1000_READ_REG(hw, E1000_PRC127);
+	stats->prc255 += E1000_READ_REG(hw, E1000_PRC255);
+	stats->prc511 += E1000_READ_REG(hw, E1000_PRC511);
+	stats->prc1023 += E1000_READ_REG(hw, E1000_PRC1023);
+	stats->prc1522 += E1000_READ_REG(hw, E1000_PRC1522);
+	stats->gprc += E1000_READ_REG(hw, E1000_GPRC);
+	stats->bprc += E1000_READ_REG(hw, E1000_BPRC);
+	stats->mprc += E1000_READ_REG(hw, E1000_MPRC);
+	stats->gptc += E1000_READ_REG(hw, E1000_GPTC);
 
 	/* For the 64-bit byte counters the low dword must be read first. */
 	/* Both registers clear on the read of the high dword */
 
 	/* Workaround CRC bytes included in size, take away 4 bytes/packet */
-	stats->gorc += IGC_READ_REG(hw, IGC_GORCL);
-	stats->gorc += ((uint64_t)IGC_READ_REG(hw, IGC_GORCH) << 32);
+	stats->gorc += E1000_READ_REG(hw, E1000_GORCL);
+	stats->gorc += ((uint64_t)E1000_READ_REG(hw, E1000_GORCH) << 32);
 	stats->gorc -= (stats->gprc - old_gprc) * RTE_ETHER_CRC_LEN;
-	stats->gotc += IGC_READ_REG(hw, IGC_GOTCL);
-	stats->gotc += ((uint64_t)IGC_READ_REG(hw, IGC_GOTCH) << 32);
+	stats->gotc += E1000_READ_REG(hw, E1000_GOTCL);
+	stats->gotc += ((uint64_t)E1000_READ_REG(hw, E1000_GOTCH) << 32);
 	stats->gotc -= (stats->gptc - old_gptc) * RTE_ETHER_CRC_LEN;
 
-	stats->rnbc += IGC_READ_REG(hw, IGC_RNBC);
-	stats->ruc += IGC_READ_REG(hw, IGC_RUC);
-	stats->rfc += IGC_READ_REG(hw, IGC_RFC);
-	stats->roc += IGC_READ_REG(hw, IGC_ROC);
-	stats->rjc += IGC_READ_REG(hw, IGC_RJC);
+	stats->rnbc += E1000_READ_REG(hw, E1000_RNBC);
+	stats->ruc += E1000_READ_REG(hw, E1000_RUC);
+	stats->rfc += E1000_READ_REG(hw, E1000_RFC);
+	stats->roc += E1000_READ_REG(hw, E1000_ROC);
+	stats->rjc += E1000_READ_REG(hw, E1000_RJC);
 
-	stats->mgprc += IGC_READ_REG(hw, IGC_MGTPRC);
-	stats->mgpdc += IGC_READ_REG(hw, IGC_MGTPDC);
-	stats->mgptc += IGC_READ_REG(hw, IGC_MGTPTC);
-	stats->b2ospc += IGC_READ_REG(hw, IGC_B2OSPC);
-	stats->b2ogprc += IGC_READ_REG(hw, IGC_B2OGPRC);
-	stats->o2bgptc += IGC_READ_REG(hw, IGC_O2BGPTC);
-	stats->o2bspc += IGC_READ_REG(hw, IGC_O2BSPC);
+	stats->mgprc += E1000_READ_REG(hw, E1000_MGTPRC);
+	stats->mgpdc += E1000_READ_REG(hw, E1000_MGTPDC);
+	stats->mgptc += E1000_READ_REG(hw, E1000_MGTPTC);
+	stats->b2ospc += E1000_READ_REG(hw, E1000_B2OSPC);
+	stats->b2ogprc += E1000_READ_REG(hw, E1000_B2OGPRC);
+	stats->o2bgptc += E1000_READ_REG(hw, E1000_O2BGPTC);
+	stats->o2bspc += E1000_READ_REG(hw, E1000_O2BSPC);
 
-	stats->tpr += IGC_READ_REG(hw, IGC_TPR);
-	stats->tpt += IGC_READ_REG(hw, IGC_TPT);
+	stats->tpr += E1000_READ_REG(hw, E1000_TPR);
+	stats->tpt += E1000_READ_REG(hw, E1000_TPT);
 
-	stats->tor += IGC_READ_REG(hw, IGC_TORL);
-	stats->tor += ((uint64_t)IGC_READ_REG(hw, IGC_TORH) << 32);
+	stats->tor += E1000_READ_REG(hw, E1000_TORL);
+	stats->tor += ((uint64_t)E1000_READ_REG(hw, E1000_TORH) << 32);
 	stats->tor -= (stats->tpr - old_tpr) * RTE_ETHER_CRC_LEN;
-	stats->tot += IGC_READ_REG(hw, IGC_TOTL);
-	stats->tot += ((uint64_t)IGC_READ_REG(hw, IGC_TOTH) << 32);
+	stats->tot += E1000_READ_REG(hw, E1000_TOTL);
+	stats->tot += ((uint64_t)E1000_READ_REG(hw, E1000_TOTH) << 32);
 	stats->tot -= (stats->tpt - old_tpt) * RTE_ETHER_CRC_LEN;
 
-	stats->ptc64 += IGC_READ_REG(hw, IGC_PTC64);
-	stats->ptc127 += IGC_READ_REG(hw, IGC_PTC127);
-	stats->ptc255 += IGC_READ_REG(hw, IGC_PTC255);
-	stats->ptc511 += IGC_READ_REG(hw, IGC_PTC511);
-	stats->ptc1023 += IGC_READ_REG(hw, IGC_PTC1023);
-	stats->ptc1522 += IGC_READ_REG(hw, IGC_PTC1522);
-	stats->mptc += IGC_READ_REG(hw, IGC_MPTC);
-	stats->bptc += IGC_READ_REG(hw, IGC_BPTC);
-	stats->tsctc += IGC_READ_REG(hw, IGC_TSCTC);
+	stats->ptc64 += E1000_READ_REG(hw, E1000_PTC64);
+	stats->ptc127 += E1000_READ_REG(hw, E1000_PTC127);
+	stats->ptc255 += E1000_READ_REG(hw, E1000_PTC255);
+	stats->ptc511 += E1000_READ_REG(hw, E1000_PTC511);
+	stats->ptc1023 += E1000_READ_REG(hw, E1000_PTC1023);
+	stats->ptc1522 += E1000_READ_REG(hw, E1000_PTC1522);
+	stats->mptc += E1000_READ_REG(hw, E1000_MPTC);
+	stats->bptc += E1000_READ_REG(hw, E1000_BPTC);
+	stats->tsctc += E1000_READ_REG(hw, E1000_TSCTC);
 
-	stats->iac += IGC_READ_REG(hw, IGC_IAC);
-	stats->rpthc += IGC_READ_REG(hw, IGC_RPTHC);
-	stats->hgptc += IGC_READ_REG(hw, IGC_HGPTC);
-	stats->icrxdmtc += IGC_READ_REG(hw, IGC_ICRXDMTC);
+	stats->iac += E1000_READ_REG(hw, E1000_IAC);
+	stats->rpthc += E1000_READ_REG(hw, E1000_RPTHC);
+	stats->hgptc += E1000_READ_REG(hw, E1000_HGPTC);
+	stats->icrxdmtc += E1000_READ_REG(hw, E1000_ICRXDMTC);
 
 	/* Host to Card Statistics */
-	stats->hgorc += IGC_READ_REG(hw, IGC_HGORCL);
-	stats->hgorc += ((uint64_t)IGC_READ_REG(hw, IGC_HGORCH) << 32);
+	stats->hgorc += E1000_READ_REG(hw, E1000_HGORCL);
+	stats->hgorc += ((uint64_t)E1000_READ_REG(hw, E1000_HGORCH) << 32);
 	stats->hgorc -= (stats->rpthc - old_rpthc) * RTE_ETHER_CRC_LEN;
-	stats->hgotc += IGC_READ_REG(hw, IGC_HGOTCL);
-	stats->hgotc += ((uint64_t)IGC_READ_REG(hw, IGC_HGOTCH) << 32);
+	stats->hgotc += E1000_READ_REG(hw, E1000_HGOTCL);
+	stats->hgotc += ((uint64_t)E1000_READ_REG(hw, E1000_HGOTCH) << 32);
 	stats->hgotc -= (stats->hgptc - old_hgptc) * RTE_ETHER_CRC_LEN;
-	stats->lenerrs += IGC_READ_REG(hw, IGC_LENERRS);
+	stats->lenerrs += E1000_READ_REG(hw, E1000_LENERRS);
 }
 
 /*
  * Write 0 to all queue status registers
  */
 static void
-igc_reset_queue_stats_register(struct igc_hw *hw)
+igc_reset_queue_stats_register(struct e1000_hw *hw)
 {
 	int i;
 
 	for (i = 0; i < IGC_QUEUE_PAIRS_NUM; i++) {
-		IGC_WRITE_REG(hw, IGC_PQGPRC(i), 0);
-		IGC_WRITE_REG(hw, IGC_PQGPTC(i), 0);
-		IGC_WRITE_REG(hw, IGC_PQGORC(i), 0);
-		IGC_WRITE_REG(hw, IGC_PQGOTC(i), 0);
-		IGC_WRITE_REG(hw, IGC_PQMPRC(i), 0);
-		IGC_WRITE_REG(hw, IGC_RQDPC(i), 0);
-		IGC_WRITE_REG(hw, IGC_TQDPC(i), 0);
+		E1000_WRITE_REG(hw, IGC_PQGPRC(i), 0);
+		E1000_WRITE_REG(hw, E1000_PQGPTC(i), 0);
+		E1000_WRITE_REG(hw, IGC_PQGORC(i), 0);
+		E1000_WRITE_REG(hw, IGC_PQGOTC(i), 0);
+		E1000_WRITE_REG(hw, IGC_PQMPRC(i), 0);
+		E1000_WRITE_REG(hw, E1000_RQDPC(i), 0);
+		E1000_WRITE_REG(hw, IGC_TQDPC(i), 0);
 	}
 }
 
@@ -1885,7 +1885,7 @@ igc_reset_queue_stats_register(struct igc_hw *hw)
 static void
 igc_read_queue_stats_register(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_hw_queue_stats *queue_stats =
 				IGC_DEV_PRIVATE_QUEUE_STATS(dev);
 	int i;
@@ -1908,49 +1908,49 @@ igc_read_queue_stats_register(struct rte_eth_dev *dev)
 		 * then we add the high 4 bytes by 1 and replace the low 4
 		 * bytes by the new value.
 		 */
-		tmp = IGC_READ_REG(hw, IGC_PQGPRC(i));
+		tmp = E1000_READ_REG(hw, IGC_PQGPRC(i));
 		value.ddword = queue_stats->pqgprc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->pqgprc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_PQGPTC(i));
+		tmp = E1000_READ_REG(hw, E1000_PQGPTC(i));
 		value.ddword = queue_stats->pqgptc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->pqgptc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_PQGORC(i));
+		tmp = E1000_READ_REG(hw, IGC_PQGORC(i));
 		value.ddword = queue_stats->pqgorc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->pqgorc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_PQGOTC(i));
+		tmp = E1000_READ_REG(hw, IGC_PQGOTC(i));
 		value.ddword = queue_stats->pqgotc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->pqgotc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_PQMPRC(i));
+		tmp = E1000_READ_REG(hw, IGC_PQMPRC(i));
 		value.ddword = queue_stats->pqmprc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->pqmprc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_RQDPC(i));
+		tmp = E1000_READ_REG(hw, E1000_RQDPC(i));
 		value.ddword = queue_stats->rqdpc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
 		value.dword[U32_0_IN_U64] = tmp;
 		queue_stats->rqdpc[i] = value.ddword;
 
-		tmp = IGC_READ_REG(hw, IGC_TQDPC(i));
+		tmp = E1000_READ_REG(hw, IGC_TQDPC(i));
 		value.ddword = queue_stats->tqdpc[i];
 		if (value.dword[U32_0_IN_U64] > tmp)
 			value.dword[U32_1_IN_U64]++;
@@ -1963,8 +1963,8 @@ static int
 eth_igc_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *rte_stats)
 {
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	struct igc_hw_stats *stats = IGC_DEV_PRIVATE_STATS(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw_stats *stats = IGC_DEV_PRIVATE_STATS(dev);
 	struct igc_hw_queue_stats *queue_stats =
 			IGC_DEV_PRIVATE_QUEUE_STATS(dev);
 	int i;
@@ -2025,8 +2025,8 @@ static int
 eth_igc_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 		   unsigned int n)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	struct igc_hw_stats *hw_stats =
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw_stats *hw_stats =
 			IGC_DEV_PRIVATE_STATS(dev);
 	unsigned int i;
 
@@ -2054,8 +2054,8 @@ eth_igc_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 static int
 eth_igc_xstats_reset(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	struct igc_hw_stats *hw_stats = IGC_DEV_PRIVATE_STATS(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw_stats *hw_stats = IGC_DEV_PRIVATE_STATS(dev);
 	struct igc_hw_queue_stats *queue_stats =
 			IGC_DEV_PRIVATE_QUEUE_STATS(dev);
 
@@ -2124,8 +2124,8 @@ static int
 eth_igc_xstats_get_by_id(struct rte_eth_dev *dev, const uint64_t *ids,
 		uint64_t *values, unsigned int n)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	struct igc_hw_stats *hw_stats = IGC_DEV_PRIVATE_STATS(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw_stats *hw_stats = IGC_DEV_PRIVATE_STATS(dev);
 	unsigned int i;
 
 	igc_read_stats_registers(hw, hw_stats);
@@ -2185,7 +2185,7 @@ eth_igc_queue_stats_mapping_set(struct rte_eth_dev *dev,
 static int
 eth_igc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 	uint32_t vec = IGC_MISC_VEC_ID;
@@ -2195,8 +2195,8 @@ eth_igc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
 
 	uint32_t mask = 1u << (queue_id + vec);
 
-	IGC_WRITE_REG(hw, IGC_EIMC, mask);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_EIMC, mask);
+	E1000_WRITE_FLUSH(hw);
 
 	return 0;
 }
@@ -2204,7 +2204,7 @@ eth_igc_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
 static int
 eth_igc_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 	uint32_t vec = IGC_MISC_VEC_ID;
@@ -2214,8 +2214,8 @@ eth_igc_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 
 	uint32_t mask = 1u << (queue_id + vec);
 
-	IGC_WRITE_REG(hw, IGC_EIMS, mask);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_EIMS, mask);
+	E1000_WRITE_FLUSH(hw);
 
 	rte_intr_enable(intr_handle);
 
@@ -2225,7 +2225,7 @@ eth_igc_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
 static int
 eth_igc_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t ctrl;
 	int tx_pause;
 	int rx_pause;
@@ -2240,13 +2240,13 @@ eth_igc_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	 * Return rx_pause and tx_pause status according to actual setting of
 	 * the TFCE and RFCE bits in the CTRL register.
 	 */
-	ctrl = IGC_READ_REG(hw, IGC_CTRL);
-	if (ctrl & IGC_CTRL_TFCE)
+	ctrl = E1000_READ_REG(hw, E1000_CTRL);
+	if (ctrl & E1000_CTRL_TFCE)
 		tx_pause = 1;
 	else
 		tx_pause = 0;
 
-	if (ctrl & IGC_CTRL_RFCE)
+	if (ctrl & E1000_CTRL_RFCE)
 		rx_pause = 1;
 	else
 		rx_pause = 0;
@@ -2266,7 +2266,7 @@ eth_igc_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 static int
 eth_igc_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t rx_buf_size;
 	uint32_t max_high_water;
 	uint32_t rctl;
@@ -2291,16 +2291,16 @@ eth_igc_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 
 	switch (fc_conf->mode) {
 	case RTE_ETH_FC_NONE:
-		hw->fc.requested_mode = igc_fc_none;
+		hw->fc.requested_mode = e1000_fc_none;
 		break;
 	case RTE_ETH_FC_RX_PAUSE:
-		hw->fc.requested_mode = igc_fc_rx_pause;
+		hw->fc.requested_mode = e1000_fc_rx_pause;
 		break;
 	case RTE_ETH_FC_TX_PAUSE:
-		hw->fc.requested_mode = igc_fc_tx_pause;
+		hw->fc.requested_mode = e1000_fc_tx_pause;
 		break;
 	case RTE_ETH_FC_FULL:
-		hw->fc.requested_mode = igc_fc_full;
+		hw->fc.requested_mode = e1000_fc_full;
 		break;
 	default:
 		PMD_DRV_LOG(ERR, "unsupported fc mode: %u", fc_conf->mode);
@@ -2312,23 +2312,23 @@ eth_igc_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	hw->fc.low_water      = fc_conf->low_water;
 	hw->fc.send_xon	      = fc_conf->send_xon;
 
-	err = igc_setup_link_generic(hw);
-	if (err == IGC_SUCCESS) {
+	err = e1000_setup_link_generic(hw);
+	if (err == E1000_SUCCESS) {
 		/**
 		 * check if we want to forward MAC frames - driver doesn't have
 		 * native capability to do that, so we'll write the registers
 		 * ourselves
 		 **/
-		rctl = IGC_READ_REG(hw, IGC_RCTL);
+		rctl = E1000_READ_REG(hw, E1000_RCTL);
 
 		/* set or clear MFLCN.PMCF bit depending on configuration */
 		if (fc_conf->mac_ctrl_frame_fwd != 0)
-			rctl |= IGC_RCTL_PMCF;
+			rctl |= E1000_RCTL_PMCF;
 		else
-			rctl &= ~IGC_RCTL_PMCF;
+			rctl &= ~E1000_RCTL_PMCF;
 
-		IGC_WRITE_REG(hw, IGC_RCTL, rctl);
-		IGC_WRITE_FLUSH(hw);
+		E1000_WRITE_REG(hw, E1000_RCTL, rctl);
+		E1000_WRITE_FLUSH(hw);
 
 		return 0;
 	}
@@ -2342,7 +2342,7 @@ eth_igc_rss_reta_update(struct rte_eth_dev *dev,
 			struct rte_eth_rss_reta_entry64 *reta_conf,
 			uint16_t reta_size)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint16_t i;
 
 	if (reta_size != RTE_ETH_RSS_RETA_SIZE_128) {
@@ -2374,8 +2374,8 @@ eth_igc_rss_reta_update(struct rte_eth_dev *dev,
 		if (mask == IGC_RSS_RDT_REG_SIZE_MASK)
 			reg.dword = 0;
 		else
-			reg.dword = IGC_READ_REG_LE_VALUE(hw,
-					IGC_RETA(i / IGC_RSS_RDT_REG_SIZE));
+			reg.dword = E1000_READ_REG_LE_VALUE(hw,
+					E1000_RETA(i / IGC_RSS_RDT_REG_SIZE));
 
 		/* update the register */
 		RTE_BUILD_BUG_ON(sizeof(reta.bytes) != IGC_RSS_RDT_REG_SIZE);
@@ -2386,8 +2386,8 @@ eth_igc_rss_reta_update(struct rte_eth_dev *dev,
 			else
 				reta.bytes[j] = reg.bytes[j];
 		}
-		IGC_WRITE_REG_LE_VALUE(hw,
-			IGC_RETA(i / IGC_RSS_RDT_REG_SIZE), reta.dword);
+		E1000_WRITE_REG_LE_VALUE(hw,
+			E1000_RETA(i / IGC_RSS_RDT_REG_SIZE), reta.dword);
 	}
 
 	return 0;
@@ -2398,7 +2398,7 @@ eth_igc_rss_reta_query(struct rte_eth_dev *dev,
 		       struct rte_eth_rss_reta_entry64 *reta_conf,
 		       uint16_t reta_size)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint16_t i;
 
 	if (reta_size != RTE_ETH_RSS_RETA_SIZE_128) {
@@ -2428,8 +2428,8 @@ eth_igc_rss_reta_query(struct rte_eth_dev *dev,
 
 		/* read register and get the queue index */
 		RTE_BUILD_BUG_ON(sizeof(reta.bytes) != IGC_RSS_RDT_REG_SIZE);
-		reta.dword = IGC_READ_REG_LE_VALUE(hw,
-				IGC_RETA(i / IGC_RSS_RDT_REG_SIZE));
+		reta.dword = E1000_READ_REG_LE_VALUE(hw,
+				E1000_RETA(i / IGC_RSS_RDT_REG_SIZE));
 		for (j = 0; j < IGC_RSS_RDT_REG_SIZE; j++) {
 			if (mask & (1u << j))
 				reta_conf[idx].reta[shift + j] = reta.bytes[j];
@@ -2443,7 +2443,7 @@ static int
 eth_igc_rss_hash_update(struct rte_eth_dev *dev,
 			struct rte_eth_rss_conf *rss_conf)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	igc_hw_rss_hash_set(hw, rss_conf);
 	return 0;
 }
@@ -2452,7 +2452,7 @@ static int
 eth_igc_rss_hash_conf_get(struct rte_eth_dev *dev,
 			struct rte_eth_rss_conf *rss_conf)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t *hash_key = (uint32_t *)rss_conf->rss_key;
 	uint32_t mrqc;
 	uint64_t rss_hf;
@@ -2470,32 +2470,32 @@ eth_igc_rss_hash_conf_get(struct rte_eth_dev *dev,
 
 		/* read RSS key from register */
 		for (i = 0; i < IGC_HKEY_MAX_INDEX; i++)
-			hash_key[i] = IGC_READ_REG_LE_VALUE(hw, IGC_RSSRK(i));
+			hash_key[i] = E1000_READ_REG_LE_VALUE(hw, E1000_RSSRK(i));
 	}
 
 	/* get RSS functions configured in MRQC register */
-	mrqc = IGC_READ_REG(hw, IGC_MRQC);
-	if ((mrqc & IGC_MRQC_ENABLE_RSS_4Q) == 0)
+	mrqc = E1000_READ_REG(hw, E1000_MRQC);
+	if ((mrqc & E1000_MRQC_ENABLE_RSS_4Q) == 0)
 		return 0;
 
 	rss_hf = 0;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV4)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV4)
 		rss_hf |= RTE_ETH_RSS_IPV4;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV4_TCP)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV4_TCP)
 		rss_hf |= RTE_ETH_RSS_NONFRAG_IPV4_TCP;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6)
 		rss_hf |= RTE_ETH_RSS_IPV6;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6_EX)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6_EX)
 		rss_hf |= RTE_ETH_RSS_IPV6_EX;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6_TCP)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6_TCP)
 		rss_hf |= RTE_ETH_RSS_NONFRAG_IPV6_TCP;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6_TCP_EX)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6_TCP_EX)
 		rss_hf |= RTE_ETH_RSS_IPV6_TCP_EX;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV4_UDP)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV4_UDP)
 		rss_hf |= RTE_ETH_RSS_NONFRAG_IPV4_UDP;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6_UDP)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6_UDP)
 		rss_hf |= RTE_ETH_RSS_NONFRAG_IPV6_UDP;
-	if (mrqc & IGC_MRQC_RSS_FIELD_IPV6_UDP_EX)
+	if (mrqc & E1000_MRQC_RSS_FIELD_IPV6_UDP_EX)
 		rss_hf |= RTE_ETH_RSS_IPV6_UDP_EX;
 
 	rss_conf->rss_hf |= rss_hf;
@@ -2505,20 +2505,20 @@ eth_igc_rss_hash_conf_get(struct rte_eth_dev *dev,
 static int
 eth_igc_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_vfta *shadow_vfta = IGC_DEV_PRIVATE_VFTA(dev);
 	uint32_t vfta;
 	uint32_t vid_idx;
 	uint32_t vid_bit;
 
-	vid_idx = (vlan_id >> IGC_VFTA_ENTRY_SHIFT) & IGC_VFTA_ENTRY_MASK;
-	vid_bit = 1u << (vlan_id & IGC_VFTA_ENTRY_BIT_SHIFT_MASK);
+	vid_idx = (vlan_id >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK;
+	vid_bit = 1u << (vlan_id & E1000_VFTA_ENTRY_BIT_SHIFT_MASK);
 	vfta = shadow_vfta->vfta[vid_idx];
 	if (on)
 		vfta |= vid_bit;
 	else
 		vfta &= ~vid_bit;
-	IGC_WRITE_REG_ARRAY(hw, IGC_VFTA, vid_idx, vfta);
+	E1000_WRITE_REG_ARRAY(hw, E1000_VFTA, vid_idx, vfta);
 
 	/* update local VFTA copy */
 	shadow_vfta->vfta[vid_idx] = vfta;
@@ -2529,54 +2529,54 @@ eth_igc_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 static void
 igc_vlan_hw_filter_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
-	igc_read_reg_check_clear_bits(hw, IGC_RCTL,
-			IGC_RCTL_CFIEN | IGC_RCTL_VFE);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	igc_read_reg_check_clear_bits(hw, E1000_RCTL,
+			E1000_RCTL_CFIEN | E1000_RCTL_VFE);
 }
 
 static void
 igc_vlan_hw_filter_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_vfta *shadow_vfta = IGC_DEV_PRIVATE_VFTA(dev);
 	uint32_t reg_val;
 	int i;
 
 	/* Filter Table Enable, CFI not used for packet acceptance */
-	reg_val = IGC_READ_REG(hw, IGC_RCTL);
-	reg_val &= ~IGC_RCTL_CFIEN;
-	reg_val |= IGC_RCTL_VFE;
-	IGC_WRITE_REG(hw, IGC_RCTL, reg_val);
+	reg_val = E1000_READ_REG(hw, E1000_RCTL);
+	reg_val &= ~E1000_RCTL_CFIEN;
+	reg_val |= E1000_RCTL_VFE;
+	E1000_WRITE_REG(hw, E1000_RCTL, reg_val);
 
 	/* restore VFTA table */
 	for (i = 0; i < IGC_VFTA_SIZE; i++)
-		IGC_WRITE_REG_ARRAY(hw, IGC_VFTA, i, shadow_vfta->vfta[i]);
+		E1000_WRITE_REG_ARRAY(hw, E1000_VFTA, i, shadow_vfta->vfta[i]);
 }
 
 static void
 igc_vlan_hw_strip_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	igc_read_reg_check_clear_bits(hw, IGC_CTRL, IGC_CTRL_VME);
+	igc_read_reg_check_clear_bits(hw, E1000_CTRL, E1000_CTRL_VME);
 }
 
 static void
 igc_vlan_hw_strip_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	igc_read_reg_check_set_bits(hw, IGC_CTRL, IGC_CTRL_VME);
+	igc_read_reg_check_set_bits(hw, E1000_CTRL, E1000_CTRL_VME);
 }
 
 static int
 igc_vlan_hw_extend_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t frame_size = dev->data->mtu + IGC_ETH_OVERHEAD;
 	uint32_t ctrl_ext;
 
-	ctrl_ext = IGC_READ_REG(hw, IGC_CTRL_EXT);
+	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
 
 	/* if extend vlan hasn't been enabled */
 	if ((ctrl_ext & IGC_CTRL_EXT_EXT_VLAN) == 0)
@@ -2588,20 +2588,20 @@ igc_vlan_hw_extend_disable(struct rte_eth_dev *dev)
 			frame_size, VLAN_TAG_SIZE + RTE_ETHER_MIN_MTU);
 		return -EINVAL;
 	}
-	IGC_WRITE_REG(hw, IGC_RLPML, frame_size - VLAN_TAG_SIZE);
+	E1000_WRITE_REG(hw, E1000_RLPML, frame_size - VLAN_TAG_SIZE);
 
-	IGC_WRITE_REG(hw, IGC_CTRL_EXT, ctrl_ext & ~IGC_CTRL_EXT_EXT_VLAN);
+	E1000_WRITE_REG(hw, E1000_CTRL_EXT, ctrl_ext & ~IGC_CTRL_EXT_EXT_VLAN);
 	return 0;
 }
 
 static int
 igc_vlan_hw_extend_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t frame_size = dev->data->mtu + IGC_ETH_OVERHEAD;
 	uint32_t ctrl_ext;
 
-	ctrl_ext = IGC_READ_REG(hw, IGC_CTRL_EXT);
+	ctrl_ext = E1000_READ_REG(hw, E1000_CTRL_EXT);
 
 	/* if extend vlan has been enabled */
 	if (ctrl_ext & IGC_CTRL_EXT_EXT_VLAN)
@@ -2613,9 +2613,9 @@ igc_vlan_hw_extend_enable(struct rte_eth_dev *dev)
 			frame_size, MAX_RX_JUMBO_FRAME_SIZE);
 		return -EINVAL;
 	}
-	IGC_WRITE_REG(hw, IGC_RLPML, frame_size);
+	E1000_WRITE_REG(hw, E1000_RLPML, frame_size);
 
-	IGC_WRITE_REG(hw, IGC_CTRL_EXT, ctrl_ext | IGC_CTRL_EXT_EXT_VLAN);
+	E1000_WRITE_REG(hw, E1000_CTRL_EXT, ctrl_ext | IGC_CTRL_EXT_EXT_VLAN);
 	return 0;
 }
 
@@ -2654,15 +2654,15 @@ eth_igc_vlan_tpid_set(struct rte_eth_dev *dev,
 		      enum rte_vlan_type vlan_type,
 		      uint16_t tpid)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t reg_val;
 
 	/* only outer TPID of double VLAN can be configured*/
 	if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER) {
-		reg_val = IGC_READ_REG(hw, IGC_VET);
+		reg_val = E1000_READ_REG(hw, E1000_VET);
 		reg_val = (reg_val & (~IGC_VET_EXT)) |
 			((uint32_t)tpid << IGC_VET_EXT_SHIFT);
-		IGC_WRITE_REG(hw, IGC_VET, reg_val);
+		E1000_WRITE_REG(hw, E1000_VET, reg_val);
 
 		return 0;
 	}
@@ -2675,42 +2675,42 @@ eth_igc_vlan_tpid_set(struct rte_eth_dev *dev,
 static int
 eth_igc_timesync_enable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct timespec system_time;
 	struct igc_rx_queue *rxq;
 	uint32_t val;
 	uint16_t i;
 
-	IGC_WRITE_REG(hw, IGC_TSAUXC, 0x0);
+	E1000_WRITE_REG(hw, E1000_TSAUXC, 0x0);
 
 	clock_gettime(CLOCK_REALTIME, &system_time);
-	IGC_WRITE_REG(hw, IGC_SYSTIML, system_time.tv_nsec);
-	IGC_WRITE_REG(hw, IGC_SYSTIMH, system_time.tv_sec);
+	E1000_WRITE_REG(hw, E1000_SYSTIML, system_time.tv_nsec);
+	E1000_WRITE_REG(hw, E1000_SYSTIMH, system_time.tv_sec);
 
 	/* Enable timestamping of received PTP packets. */
-	val = IGC_READ_REG(hw, IGC_RXPBS);
-	val |= IGC_RXPBS_CFG_TS_EN;
-	IGC_WRITE_REG(hw, IGC_RXPBS, val);
+	val = E1000_READ_REG(hw, E1000_RXPBS);
+	val |= E1000_RXPBS_CFG_TS_EN;
+	E1000_WRITE_REG(hw, E1000_RXPBS, val);
 
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
-		val = IGC_READ_REG(hw, IGC_SRRCTL(i));
+		val = E1000_READ_REG(hw, E1000_SRRCTL(i));
 		/* For now, only support retrieving Rx timestamp from timer0. */
-		val |= IGC_SRRCTL_TIMER1SEL(0) | IGC_SRRCTL_TIMER0SEL(0) |
-		       IGC_SRRCTL_TIMESTAMP;
-		IGC_WRITE_REG(hw, IGC_SRRCTL(i), val);
+		val |= E1000_SRRCTL_TIMER1SEL(0) | E1000_SRRCTL_TIMER0SEL(0) |
+		       E1000_SRRCTL_TIMESTAMP;
+		E1000_WRITE_REG(hw, E1000_SRRCTL(i), val);
 	}
 
-	val = IGC_TSYNCRXCTL_ENABLED | IGC_TSYNCRXCTL_TYPE_ALL |
-	      IGC_TSYNCRXCTL_RXSYNSIG;
-	IGC_WRITE_REG(hw, IGC_TSYNCRXCTL, val);
+	val = E1000_TSYNCRXCTL_ENABLED | E1000_TSYNCRXCTL_TYPE_ALL |
+	      E1000_TSYNCRXCTL_RXSYNSIG;
+	E1000_WRITE_REG(hw, E1000_TSYNCRXCTL, val);
 
 	/* Enable Timestamping of transmitted PTP packets. */
-	IGC_WRITE_REG(hw, IGC_TSYNCTXCTL, IGC_TSYNCTXCTL_ENABLED |
-		      IGC_TSYNCTXCTL_TXSYNSIG);
+	E1000_WRITE_REG(hw, E1000_TSYNCTXCTL, E1000_TSYNCTXCTL_ENABLED |
+		      E1000_TSYNCTXCTL_TXSYNSIG);
 
 	/* Read TXSTMP registers to discard any timestamp previously stored. */
-	IGC_READ_REG(hw, IGC_TXSTMPL);
-	IGC_READ_REG(hw, IGC_TXSTMPH);
+	E1000_READ_REG(hw, E1000_TXSTMPL);
+	E1000_READ_REG(hw, E1000_TXSTMPH);
 
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
 		rxq = dev->data->rx_queues[i];
@@ -2723,10 +2723,10 @@ eth_igc_timesync_enable(struct rte_eth_dev *dev)
 static int
 eth_igc_timesync_read_time(struct rte_eth_dev *dev, struct timespec *ts)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	ts->tv_nsec = IGC_READ_REG(hw, IGC_SYSTIML);
-	ts->tv_sec = IGC_READ_REG(hw, IGC_SYSTIMH);
+	ts->tv_nsec = E1000_READ_REG(hw, E1000_SYSTIML);
+	ts->tv_sec = E1000_READ_REG(hw, E1000_SYSTIMH);
 
 	return 0;
 }
@@ -2734,10 +2734,10 @@ eth_igc_timesync_read_time(struct rte_eth_dev *dev, struct timespec *ts)
 static int
 eth_igc_timesync_write_time(struct rte_eth_dev *dev, const struct timespec *ts)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	IGC_WRITE_REG(hw, IGC_SYSTIML, ts->tv_nsec);
-	IGC_WRITE_REG(hw, IGC_SYSTIMH, ts->tv_sec);
+	E1000_WRITE_REG(hw, E1000_SYSTIML, ts->tv_nsec);
+	E1000_WRITE_REG(hw, E1000_SYSTIMH, ts->tv_sec);
 
 	return 0;
 }
@@ -2745,20 +2745,20 @@ eth_igc_timesync_write_time(struct rte_eth_dev *dev, const struct timespec *ts)
 static int
 eth_igc_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t nsec, sec;
 	uint64_t systime, ns;
 	struct timespec ts;
 
-	nsec = (uint64_t)IGC_READ_REG(hw, IGC_SYSTIML);
-	sec = (uint64_t)IGC_READ_REG(hw, IGC_SYSTIMH);
+	nsec = (uint64_t)E1000_READ_REG(hw, E1000_SYSTIML);
+	sec = (uint64_t)E1000_READ_REG(hw, E1000_SYSTIMH);
 	systime = sec * NSEC_PER_SEC + nsec;
 
 	ns = systime + delta;
 	ts = rte_ns_to_timespec(ns);
 
-	IGC_WRITE_REG(hw, IGC_SYSTIML, ts.tv_nsec);
-	IGC_WRITE_REG(hw, IGC_SYSTIMH, ts.tv_sec);
+	E1000_WRITE_REG(hw, E1000_SYSTIML, ts.tv_nsec);
+	E1000_WRITE_REG(hw, E1000_SYSTIMH, ts.tv_sec);
 
 	return 0;
 }
@@ -2803,18 +2803,18 @@ static int
 eth_igc_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 			       struct timespec *timestamp)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct rte_eth_link link;
 	uint32_t val, nsec, sec;
 	uint64_t tx_timestamp;
 	int adjust = 0;
 
-	val = IGC_READ_REG(hw, IGC_TSYNCTXCTL);
-	if (!(val & IGC_TSYNCTXCTL_VALID))
+	val = E1000_READ_REG(hw, E1000_TSYNCTXCTL);
+	if (!(val & E1000_TSYNCTXCTL_VALID))
 		return -EINVAL;
 
-	nsec = (uint64_t)IGC_READ_REG(hw, IGC_TXSTMPL);
-	sec = (uint64_t)IGC_READ_REG(hw, IGC_TXSTMPH);
+	nsec = (uint64_t)E1000_READ_REG(hw, E1000_TXSTMPL);
+	sec = (uint64_t)E1000_READ_REG(hw, E1000_TXSTMPH);
 	tx_timestamp = sec * NSEC_PER_SEC + nsec;
 
 	/* Get current link speed. */
@@ -2845,22 +2845,22 @@ eth_igc_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 static int
 eth_igc_timesync_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t val;
 
 	/* Disable timestamping of transmitted PTP packets. */
-	IGC_WRITE_REG(hw, IGC_TSYNCTXCTL, 0);
+	E1000_WRITE_REG(hw, E1000_TSYNCTXCTL, 0);
 
 	/* Disable timestamping of received PTP packets. */
-	IGC_WRITE_REG(hw, IGC_TSYNCRXCTL, 0);
+	E1000_WRITE_REG(hw, E1000_TSYNCRXCTL, 0);
 
-	val = IGC_READ_REG(hw, IGC_RXPBS);
-	val &= ~IGC_RXPBS_CFG_TS_EN;
-	IGC_WRITE_REG(hw, IGC_RXPBS, val);
+	val = E1000_READ_REG(hw, E1000_RXPBS);
+	val &= ~E1000_RXPBS_CFG_TS_EN;
+	E1000_WRITE_REG(hw, E1000_RXPBS, val);
 
-	val = IGC_READ_REG(hw, IGC_SRRCTL(0));
-	val &= ~IGC_SRRCTL_TIMESTAMP;
-	IGC_WRITE_REG(hw, IGC_SRRCTL(0), val);
+	val = E1000_READ_REG(hw, E1000_SRRCTL(0));
+	val &= ~E1000_SRRCTL_TIMESTAMP;
+	E1000_WRITE_REG(hw, E1000_SRRCTL(0), val);
 
 	return 0;
 }
diff --git a/drivers/net/intel/igc/igc_ethdev.h b/drivers/net/intel/e1000/igc_ethdev.h
similarity index 91%
rename from drivers/net/intel/igc/igc_ethdev.h
rename to drivers/net/intel/e1000/igc_ethdev.h
index d3d3ddd6f6..7fa7877adf 100644
--- a/drivers/net/intel/igc/igc_ethdev.h
+++ b/drivers/net/intel/e1000/igc_ethdev.h
@@ -9,10 +9,10 @@
 #include <rte_flow.h>
 #include <rte_time.h>
 
-#include "base/igc_osdep.h"
-#include "base/igc_hw.h"
-#include "base/igc_i225.h"
-#include "base/igc_api.h"
+#include "base/e1000_osdep.h"
+#include "base/e1000_hw.h"
+#include "base/e1000_i225.h"
+#include "base/e1000_api.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -55,13 +55,13 @@ extern "C" {
 #define IGC_RX_DESCRIPTOR_MULTIPLE	8
 
 #define IGC_RXD_ALIGN	((uint16_t)(IGC_ALIGN / \
-		sizeof(union igc_adv_rx_desc)))
+		sizeof(union e1000_adv_rx_desc)))
 #define IGC_TXD_ALIGN	((uint16_t)(IGC_ALIGN / \
-		sizeof(union igc_adv_tx_desc)))
+		sizeof(union e1000_adv_tx_desc)))
 #define IGC_MIN_TXD	IGC_TX_DESCRIPTOR_MULTIPLE
-#define IGC_MAX_TXD	((uint16_t)(0x80000 / sizeof(union igc_adv_tx_desc)))
+#define IGC_MAX_TXD	((uint16_t)(0x80000 / sizeof(union e1000_adv_tx_desc)))
 #define IGC_MIN_RXD	IGC_RX_DESCRIPTOR_MULTIPLE
-#define IGC_MAX_RXD	((uint16_t)(0x80000 / sizeof(union igc_adv_rx_desc)))
+#define IGC_MAX_RXD	((uint16_t)(0x80000 / sizeof(union e1000_adv_rx_desc)))
 
 #define IGC_TX_MAX_SEG		UINT8_MAX
 #define IGC_TX_MAX_MTU_SEG	UINT8_MAX
@@ -224,8 +224,8 @@ TAILQ_HEAD(igc_flow_list, rte_flow);
  * Structure to store private data for each driver instance (for each port).
  */
 struct igc_adapter {
-	struct igc_hw		hw;
-	struct igc_hw_stats	stats;
+	struct e1000_hw		hw;
+	struct e1000_hw_stats	stats;
 	struct igc_hw_queue_stats queue_stats;
 	int16_t txq_stats_map[IGC_QUEUE_PAIRS_NUM];
 	int16_t rxq_stats_map[IGC_QUEUE_PAIRS_NUM];
@@ -268,27 +268,27 @@ struct igc_adapter {
 	(&((struct igc_adapter *)(_dev)->data->dev_private)->flow_list)
 
 static inline void
-igc_read_reg_check_set_bits(struct igc_hw *hw, uint32_t reg, uint32_t bits)
+igc_read_reg_check_set_bits(struct e1000_hw *hw, uint32_t reg, uint32_t bits)
 {
-	uint32_t reg_val = IGC_READ_REG(hw, reg);
+	uint32_t reg_val = E1000_READ_REG(hw, reg);
 
 	bits |= reg_val;
 	if (bits == reg_val)
 		return;	/* no need to write back */
 
-	IGC_WRITE_REG(hw, reg, bits);
+	E1000_WRITE_REG(hw, reg, bits);
 }
 
 static inline void
-igc_read_reg_check_clear_bits(struct igc_hw *hw, uint32_t reg, uint32_t bits)
+igc_read_reg_check_clear_bits(struct e1000_hw *hw, uint32_t reg, uint32_t bits)
 {
-	uint32_t reg_val = IGC_READ_REG(hw, reg);
+	uint32_t reg_val = E1000_READ_REG(hw, reg);
 
 	bits = reg_val & ~bits;
 	if (bits == reg_val)
 		return;	/* no need to write back */
 
-	IGC_WRITE_REG(hw, reg, bits);
+	E1000_WRITE_REG(hw, reg, bits);
 }
 
 #ifdef __cplusplus
diff --git a/drivers/net/intel/igc/igc_filter.c b/drivers/net/intel/e1000/igc_filter.c
similarity index 81%
rename from drivers/net/intel/igc/igc_filter.c
rename to drivers/net/intel/e1000/igc_filter.c
index bff98df200..3df7183dbb 100644
--- a/drivers/net/intel/igc/igc_filter.c
+++ b/drivers/net/intel/e1000/igc_filter.c
@@ -3,7 +3,7 @@
  */
 
 #include "rte_malloc.h"
-#include "igc_logs.h"
+#include "e1000_logs.h"
 #include "igc_txrx.h"
 #include "igc_filter.h"
 #include "igc_flow.h"
@@ -57,7 +57,7 @@ int
 igc_del_ethertype_filter(struct rte_eth_dev *dev,
 			const struct igc_ethertype_filter *filter)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
 	int ret;
 
@@ -77,8 +77,8 @@ igc_del_ethertype_filter(struct rte_eth_dev *dev,
 
 	igc->ethertype_filters[ret].ether_type = 0;
 
-	IGC_WRITE_REG(hw, IGC_ETQF(ret), 0);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_ETQF(ret), 0);
+	E1000_WRITE_FLUSH(hw);
 	return 0;
 }
 
@@ -86,7 +86,7 @@ int
 igc_add_ethertype_filter(struct rte_eth_dev *dev,
 			const struct igc_ethertype_filter *filter)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
 	uint32_t etqf;
 	int ret, empty;
@@ -114,13 +114,13 @@ igc_add_ethertype_filter(struct rte_eth_dev *dev,
 	ret = empty;
 
 	etqf = filter->ether_type;
-	etqf |= IGC_ETQF_FILTER_ENABLE | IGC_ETQF_QUEUE_ENABLE;
+	etqf |= E1000_ETQF_FILTER_ENABLE | E1000_ETQF_QUEUE_ENABLE;
 	etqf |= (uint32_t)filter->queue << IGC_ETQF_QUEUE_SHIFT;
 
 	memcpy(&igc->ethertype_filters[ret], filter, sizeof(*filter));
 
-	IGC_WRITE_REG(hw, IGC_ETQF(ret), etqf);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_ETQF(ret), etqf);
+	E1000_WRITE_FLUSH(hw);
 	return 0;
 }
 
@@ -128,13 +128,13 @@ igc_add_ethertype_filter(struct rte_eth_dev *dev,
 static void
 igc_clear_all_ethertype_filter(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
 	int i;
 
 	for (i = 0; i < IGC_MAX_ETQF_FILTERS; i++)
-		IGC_WRITE_REG(hw, IGC_ETQF(i), 0);
-	IGC_WRITE_FLUSH(hw);
+		E1000_WRITE_REG(hw, E1000_ETQF(i), 0);
+	E1000_WRITE_FLUSH(hw);
 
 	memset(&igc->ethertype_filters, 0, sizeof(igc->ethertype_filters));
 }
@@ -196,59 +196,59 @@ static void
 igc_enable_tuple_filter(struct rte_eth_dev *dev,
 			const struct igc_adapter *igc, uint8_t index)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	const struct igc_ntuple_filter *filter = &igc->ntuple_filters[index];
 	const struct igc_ntuple_info *info = &filter->tuple_info;
-	uint32_t ttqf, imir, imir_ext = IGC_IMIREXT_SIZE_BP;
+	uint32_t ttqf, imir, imir_ext = E1000_IMIREXT_SIZE_BP;
 
 	imir = info->dst_port;
-	imir |= (uint32_t)info->priority << IGC_IMIR_PRIORITY_SHIFT;
+	imir |= (uint32_t)info->priority << E1000_IMIR_PRIORITY_SHIFT;
 
 	/* 0b means not compare. */
 	if (info->dst_port_mask == 0)
-		imir |= IGC_IMIR_PORT_BP;
+		imir |= E1000_IMIR_PORT_BP;
 
-	ttqf = IGC_TTQF_DISABLE_MASK | IGC_TTQF_QUEUE_ENABLE;
-	ttqf |= (uint32_t)filter->queue << IGC_TTQF_QUEUE_SHIFT;
+	ttqf = E1000_TTQF_DISABLE_MASK | E1000_TTQF_QUEUE_ENABLE;
+	ttqf |= (uint32_t)filter->queue << E1000_TTQF_QUEUE_SHIFT;
 	ttqf |= info->proto;
 
 	if (info->proto_mask)
-		ttqf &= ~IGC_TTQF_MASK_ENABLE;
+		ttqf &= ~E1000_TTQF_MASK_ENABLE;
 
 	/* TCP flags bits setting. */
 	if (info->tcp_flags & RTE_NTUPLE_TCP_FLAGS_MASK) {
 		if (info->tcp_flags & RTE_TCP_URG_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_URG;
+			imir_ext |= E1000_IMIREXT_CTRL_URG;
 		if (info->tcp_flags & RTE_TCP_ACK_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_ACK;
+			imir_ext |= E1000_IMIREXT_CTRL_ACK;
 		if (info->tcp_flags & RTE_TCP_PSH_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_PSH;
+			imir_ext |= E1000_IMIREXT_CTRL_PSH;
 		if (info->tcp_flags & RTE_TCP_RST_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_RST;
+			imir_ext |= E1000_IMIREXT_CTRL_RST;
 		if (info->tcp_flags & RTE_TCP_SYN_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_SYN;
+			imir_ext |= E1000_IMIREXT_CTRL_SYN;
 		if (info->tcp_flags & RTE_TCP_FIN_FLAG)
-			imir_ext |= IGC_IMIREXT_CTRL_FIN;
+			imir_ext |= E1000_IMIREXT_CTRL_FIN;
 	} else {
-		imir_ext |= IGC_IMIREXT_CTRL_BP;
+		imir_ext |= E1000_IMIREXT_CTRL_BP;
 	}
 
-	IGC_WRITE_REG(hw, IGC_IMIR(index), imir);
-	IGC_WRITE_REG(hw, IGC_TTQF(index), ttqf);
-	IGC_WRITE_REG(hw, IGC_IMIREXT(index), imir_ext);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_IMIR(index), imir);
+	E1000_WRITE_REG(hw, E1000_TTQF(index), ttqf);
+	E1000_WRITE_REG(hw, E1000_IMIREXT(index), imir_ext);
+	E1000_WRITE_FLUSH(hw);
 }
 
 /* Reset hardware register values */
 static void
 igc_disable_tuple_filter(struct rte_eth_dev *dev, uint8_t index)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 
-	IGC_WRITE_REG(hw, IGC_TTQF(index), IGC_TTQF_DISABLE_MASK);
-	IGC_WRITE_REG(hw, IGC_IMIR(index), 0);
-	IGC_WRITE_REG(hw, IGC_IMIREXT(index), 0);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_TTQF(index), E1000_TTQF_DISABLE_MASK);
+	E1000_WRITE_REG(hw, E1000_IMIR(index), 0);
+	E1000_WRITE_REG(hw, E1000_IMIREXT(index), 0);
+	E1000_WRITE_FLUSH(hw);
 }
 
 int
@@ -310,7 +310,7 @@ int
 igc_set_syn_filter(struct rte_eth_dev *dev,
 		const struct igc_syn_filter *filter)
 {
-	struct igc_hw *hw;
+	struct e1000_hw *hw;
 	struct igc_adapter *igc;
 	uint32_t synqf, rfctl;
 
@@ -331,7 +331,7 @@ igc_set_syn_filter(struct rte_eth_dev *dev,
 	synqf = (uint32_t)filter->queue << IGC_SYN_FILTER_QUEUE_SHIFT;
 	synqf |= IGC_SYN_FILTER_ENABLE;
 
-	rfctl = IGC_READ_REG(hw, IGC_RFCTL);
+	rfctl = E1000_READ_REG(hw, E1000_RFCTL);
 	if (filter->hig_pri)
 		rfctl |= IGC_RFCTL_SYNQFP;
 	else
@@ -340,9 +340,9 @@ igc_set_syn_filter(struct rte_eth_dev *dev,
 	memcpy(&igc->syn_filter, filter, sizeof(igc->syn_filter));
 	igc->syn_filter.enable = 1;
 
-	IGC_WRITE_REG(hw, IGC_RFCTL, rfctl);
-	IGC_WRITE_REG(hw, IGC_SYNQF(0), synqf);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_RFCTL, rfctl);
+	E1000_WRITE_REG(hw, E1000_SYNQF(0), synqf);
+	E1000_WRITE_FLUSH(hw);
 	return 0;
 }
 
@@ -350,11 +350,11 @@ igc_set_syn_filter(struct rte_eth_dev *dev,
 void
 igc_clear_syn_filter(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_adapter *igc = IGC_DEV_PRIVATE(dev);
 
-	IGC_WRITE_REG(hw, IGC_SYNQF(0), 0);
-	IGC_WRITE_FLUSH(hw);
+	E1000_WRITE_REG(hw, E1000_SYNQF(0), 0);
+	E1000_WRITE_FLUSH(hw);
 
 	memset(&igc->syn_filter, 0, sizeof(igc->syn_filter));
 }
diff --git a/drivers/net/intel/igc/igc_filter.h b/drivers/net/intel/e1000/igc_filter.h
similarity index 100%
rename from drivers/net/intel/igc/igc_filter.h
rename to drivers/net/intel/e1000/igc_filter.h
diff --git a/drivers/net/intel/igc/igc_flow.c b/drivers/net/intel/e1000/igc_flow.c
similarity index 99%
rename from drivers/net/intel/igc/igc_flow.c
rename to drivers/net/intel/e1000/igc_flow.c
index b778ac2613..b947cf8b08 100644
--- a/drivers/net/intel/igc/igc_flow.c
+++ b/drivers/net/intel/e1000/igc_flow.c
@@ -3,7 +3,7 @@
  */
 
 #include "rte_malloc.h"
-#include "igc_logs.h"
+#include "e1000_logs.h"
 #include "igc_txrx.h"
 #include "igc_filter.h"
 #include "igc_flow.h"
diff --git a/drivers/net/intel/igc/igc_flow.h b/drivers/net/intel/e1000/igc_flow.h
similarity index 100%
rename from drivers/net/intel/igc/igc_flow.h
rename to drivers/net/intel/e1000/igc_flow.h
diff --git a/drivers/net/intel/igc/igc_logs.c b/drivers/net/intel/e1000/igc_logs.c
similarity index 90%
rename from drivers/net/intel/igc/igc_logs.c
rename to drivers/net/intel/e1000/igc_logs.c
index 9cb8da527e..df91d173dd 100644
--- a/drivers/net/intel/igc/igc_logs.c
+++ b/drivers/net/intel/e1000/igc_logs.c
@@ -4,7 +4,7 @@
 
 #include <rte_common.h>
 
-#include "igc_logs.h"
+#include "e1000_logs.h"
 
 RTE_LOG_REGISTER_SUFFIX(igc_logtype_init, init, INFO);
 RTE_LOG_REGISTER_SUFFIX(igc_logtype_driver, driver, INFO);
diff --git a/drivers/net/intel/igc/igc_txrx.c b/drivers/net/intel/e1000/igc_txrx.c
similarity index 87%
rename from drivers/net/intel/igc/igc_txrx.c
rename to drivers/net/intel/e1000/igc_txrx.c
index fabab5b1a3..9b2eb343ef 100644
--- a/drivers/net/intel/igc/igc_txrx.c
+++ b/drivers/net/intel/e1000/igc_txrx.c
@@ -8,7 +8,7 @@
 #include <ethdev_driver.h>
 #include <rte_net.h>
 
-#include "igc_logs.h"
+#include "e1000_logs.h"
 #include "igc_txrx.h"
 
 #ifdef RTE_PMD_USE_PREFETCH
@@ -24,16 +24,16 @@
 #endif
 
 /* Multicast / Unicast table offset mask. */
-#define IGC_RCTL_MO_MSK			(3u << IGC_RCTL_MO_SHIFT)
+#define E1000_RCTL_MO_MSK			(3u << E1000_RCTL_MO_SHIFT)
 
 /* Loopback mode. */
-#define IGC_RCTL_LBM_SHIFT		6
-#define IGC_RCTL_LBM_MSK		(3u << IGC_RCTL_LBM_SHIFT)
+#define E1000_RCTL_LBM_SHIFT		6
+#define E1000_RCTL_LBM_MSK		(3u << E1000_RCTL_LBM_SHIFT)
 
 /* Hash select for MTA */
-#define IGC_RCTL_HSEL_SHIFT		8
-#define IGC_RCTL_HSEL_MSK		(3u << IGC_RCTL_HSEL_SHIFT)
-#define IGC_RCTL_PSP			(1u << 21)
+#define E1000_RCTL_HSEL_SHIFT		8
+#define E1000_RCTL_HSEL_MSK		(3u << E1000_RCTL_HSEL_SHIFT)
+#define E1000_RCTL_PSP			(1u << 21)
 
 /* Receive buffer size for header buffer */
 #define IGC_SRRCTL_BSIZEHEADER_SHIFT	8
@@ -109,14 +109,14 @@ rx_desc_statuserr_to_pkt_flags(uint32_t statuserr)
 	uint64_t pkt_flags = 0;
 	uint32_t tmp;
 
-	if (statuserr & IGC_RXD_STAT_VP)
+	if (statuserr & E1000_RXD_STAT_VP)
 		pkt_flags |= RTE_MBUF_F_RX_VLAN_STRIPPED;
 
-	tmp = !!(statuserr & (IGC_RXD_STAT_L4CS | IGC_RXD_STAT_UDPCS));
+	tmp = !!(statuserr & (IGC_RXD_STAT_L4CS | E1000_RXD_STAT_UDPCS));
 	tmp = (tmp << 1) | (uint32_t)!!(statuserr & IGC_RXD_EXT_ERR_L4E);
 	pkt_flags |= l4_chksum_flags[tmp];
 
-	tmp = !!(statuserr & IGC_RXD_STAT_IPCS);
+	tmp = !!(statuserr & E1000_RXD_STAT_IPCS);
 	tmp = (tmp << 1) | (uint32_t)!!(statuserr & IGC_RXD_EXT_ERR_IPE);
 	pkt_flags |= l3_chksum_flags[tmp];
 
@@ -193,7 +193,7 @@ rx_desc_pkt_info_to_pkt_type(uint32_t pkt_info)
 		[IGC_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
 			RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_SCTP,
 	};
-	if (unlikely(pkt_info & IGC_RXDADV_PKTTYPE_ETQF))
+	if (unlikely(pkt_info & E1000_RXDADV_PKTTYPE_ETQF))
 		return RTE_PTYPE_UNKNOWN;
 
 	pkt_info = (pkt_info >> IGC_PACKET_TYPE_SHIFT) & IGC_PACKET_TYPE_MASK;
@@ -203,7 +203,7 @@ rx_desc_pkt_info_to_pkt_type(uint32_t pkt_info)
 
 static inline void
 rx_desc_get_pkt_info(struct igc_rx_queue *rxq, struct rte_mbuf *rxm,
-		union igc_adv_rx_desc *rxd, uint32_t staterr)
+		union e1000_adv_rx_desc *rxd, uint32_t staterr)
 {
 	uint64_t pkt_flags;
 	uint32_t hlen_type_rss;
@@ -237,18 +237,18 @@ uint16_t
 igc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 {
 	struct igc_rx_queue * const rxq = rx_queue;
-	volatile union igc_adv_rx_desc * const rx_ring = rxq->rx_ring;
+	volatile union e1000_adv_rx_desc * const rx_ring = rxq->rx_ring;
 	struct igc_rx_entry * const sw_ring = rxq->sw_ring;
 	uint16_t rx_id = rxq->rx_tail;
 	uint16_t nb_rx = 0;
 	uint16_t nb_hold = 0;
 
 	while (nb_rx < nb_pkts) {
-		volatile union igc_adv_rx_desc *rxdp;
+		volatile union e1000_adv_rx_desc *rxdp;
 		struct igc_rx_entry *rxe;
 		struct rte_mbuf *rxm;
 		struct rte_mbuf *nmb;
-		union igc_adv_rx_desc rxd;
+		union e1000_adv_rx_desc rxd;
 		uint32_t staterr;
 		uint16_t data_len;
 
@@ -262,14 +262,14 @@ igc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		 */
 		rxdp = &rx_ring[rx_id];
 		staterr = rte_cpu_to_le_32(rxdp->wb.upper.status_error);
-		if (!(staterr & IGC_RXD_STAT_DD))
+		if (!(staterr & E1000_RXD_STAT_DD))
 			break;
 		rxd = *rxdp;
 
 		/*
 		 * End of packet.
 		 *
-		 * If the IGC_RXD_STAT_EOP flag is not set, the RX packet is
+		 * If the E1000_RXD_STAT_EOP flag is not set, the RX packet is
 		 * likely to be invalid and to be dropped by the various
 		 * validation checks performed by the network stack.
 		 *
@@ -391,7 +391,7 @@ igc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			"port_id=%u queue_id=%u rx_tail=%u nb_hold=%u nb_rx=%u",
 			rxq->port_id, rxq->queue_id, rx_id, nb_hold, nb_rx);
 		rx_id = (rx_id == 0) ? (rxq->nb_rx_desc - 1) : (rx_id - 1);
-		IGC_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
+		E1000_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
 		nb_hold = 0;
 	}
 	rxq->nb_rx_hold = nb_hold;
@@ -403,7 +403,7 @@ igc_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 			uint16_t nb_pkts)
 {
 	struct igc_rx_queue * const rxq = rx_queue;
-	volatile union igc_adv_rx_desc * const rx_ring = rxq->rx_ring;
+	volatile union e1000_adv_rx_desc * const rx_ring = rxq->rx_ring;
 	struct igc_rx_entry * const sw_ring = rxq->sw_ring;
 	struct rte_mbuf *first_seg = rxq->pkt_first_seg;
 	struct rte_mbuf *last_seg = rxq->pkt_last_seg;
@@ -413,11 +413,11 @@ igc_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	uint16_t nb_hold = 0;
 
 	while (nb_rx < nb_pkts) {
-		volatile union igc_adv_rx_desc *rxdp;
+		volatile union e1000_adv_rx_desc *rxdp;
 		struct igc_rx_entry *rxe;
 		struct rte_mbuf *rxm;
 		struct rte_mbuf *nmb;
-		union igc_adv_rx_desc rxd;
+		union e1000_adv_rx_desc rxd;
 		uint32_t staterr;
 		uint16_t data_len;
 
@@ -432,7 +432,7 @@ igc_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		 */
 		rxdp = &rx_ring[rx_id];
 		staterr = rte_cpu_to_le_32(rxdp->wb.upper.status_error);
-		if (!(staterr & IGC_RXD_STAT_DD))
+		if (!(staterr & E1000_RXD_STAT_DD))
 			break;
 		rxd = *rxdp;
 
@@ -559,7 +559,7 @@ igc_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		 * update the pointer to the last mbuf of the current scattered
 		 * packet and continue to parse the RX ring.
 		 */
-		if (!(staterr & IGC_RXD_STAT_EOP)) {
+		if (!(staterr & E1000_RXD_STAT_EOP)) {
 			last_seg = rxm;
 			goto next_desc;
 		}
@@ -631,7 +631,7 @@ igc_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 			"port_id=%u queue_id=%u rx_tail=%u nb_hold=%u nb_rx=%u",
 			rxq->port_id, rxq->queue_id, rx_id, nb_hold, nb_rx);
 		rx_id = (rx_id == 0) ? (rxq->nb_rx_desc - 1) : (rx_id - 1);
-		IGC_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
+		E1000_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
 		nb_hold = 0;
 	}
 	rxq->nb_rx_hold = nb_hold;
@@ -676,7 +676,7 @@ uint32_t eth_igc_rx_queue_count(void *rx_queue)
 	 */
 #define IGC_RXQ_SCAN_INTERVAL 4
 
-	volatile union igc_adv_rx_desc *rxdp;
+	volatile union e1000_adv_rx_desc *rxdp;
 	struct igc_rx_queue *rxq;
 	uint16_t desc = 0;
 
@@ -685,7 +685,7 @@ uint32_t eth_igc_rx_queue_count(void *rx_queue)
 
 	while (desc < rxq->nb_rx_desc - rxq->rx_tail) {
 		if (unlikely(!(rxdp->wb.upper.status_error &
-				IGC_RXD_STAT_DD)))
+				E1000_RXD_STAT_DD)))
 			return desc;
 		desc += IGC_RXQ_SCAN_INTERVAL;
 		rxdp += IGC_RXQ_SCAN_INTERVAL;
@@ -693,7 +693,7 @@ uint32_t eth_igc_rx_queue_count(void *rx_queue)
 	rxdp = &rxq->rx_ring[rxq->rx_tail + desc - rxq->nb_rx_desc];
 
 	while (desc < rxq->nb_rx_desc &&
-		(rxdp->wb.upper.status_error & IGC_RXD_STAT_DD)) {
+		(rxdp->wb.upper.status_error & E1000_RXD_STAT_DD)) {
 		desc += IGC_RXQ_SCAN_INTERVAL;
 		rxdp += IGC_RXQ_SCAN_INTERVAL;
 	}
@@ -718,7 +718,7 @@ int eth_igc_rx_descriptor_status(void *rx_queue, uint16_t offset)
 		desc -= rxq->nb_rx_desc;
 
 	status = &rxq->rx_ring[desc].wb.upper.status_error;
-	if (*status & rte_cpu_to_le_32(IGC_RXD_STAT_DD))
+	if (*status & rte_cpu_to_le_32(E1000_RXD_STAT_DD))
 		return RTE_ETH_RX_DESC_DONE;
 
 	return RTE_ETH_RX_DESC_AVAIL;
@@ -733,7 +733,7 @@ igc_alloc_rx_queue_mbufs(struct igc_rx_queue *rxq)
 
 	/* Initialize software ring entries. */
 	for (i = 0; i < rxq->nb_rx_desc; i++) {
-		volatile union igc_adv_rx_desc *rxd;
+		volatile union e1000_adv_rx_desc *rxd;
 		struct rte_mbuf *mbuf = rte_mbuf_raw_alloc(rxq->mb_pool);
 
 		if (mbuf == NULL) {
@@ -769,16 +769,16 @@ static uint8_t default_rss_key[40] = {
 void
 igc_rss_disable(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint32_t mrqc;
 
-	mrqc = IGC_READ_REG(hw, IGC_MRQC);
-	mrqc &= ~IGC_MRQC_ENABLE_MASK;
-	IGC_WRITE_REG(hw, IGC_MRQC, mrqc);
+	mrqc = E1000_READ_REG(hw, E1000_MRQC);
+	mrqc &= ~E1000_MRQC_ENABLE_MASK;
+	E1000_WRITE_REG(hw, E1000_MRQC, mrqc);
 }
 
 void
-igc_hw_rss_hash_set(struct igc_hw *hw, struct rte_eth_rss_conf *rss_conf)
+igc_hw_rss_hash_set(struct e1000_hw *hw, struct rte_eth_rss_conf *rss_conf)
 {
 	uint32_t *hash_key = (uint32_t *)rss_conf->rss_key;
 	uint32_t mrqc;
@@ -789,38 +789,38 @@ igc_hw_rss_hash_set(struct igc_hw *hw, struct rte_eth_rss_conf *rss_conf)
 
 		/* Fill in RSS hash key */
 		for (i = 0; i < IGC_HKEY_MAX_INDEX; i++)
-			IGC_WRITE_REG_LE_VALUE(hw, IGC_RSSRK(i), hash_key[i]);
+			E1000_WRITE_REG_LE_VALUE(hw, E1000_RSSRK(i), hash_key[i]);
 	}
 
 	/* Set configured hashing protocols in MRQC register */
 	rss_hf = rss_conf->rss_hf;
-	mrqc = IGC_MRQC_ENABLE_RSS_4Q; /* RSS enabled. */
+	mrqc = E1000_MRQC_ENABLE_RSS_4Q; /* RSS enabled. */
 	if (rss_hf & RTE_ETH_RSS_IPV4)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV4;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV4;
 	if (rss_hf & RTE_ETH_RSS_NONFRAG_IPV4_TCP)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV4_TCP;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV4_TCP;
 	if (rss_hf & RTE_ETH_RSS_IPV6)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6;
 	if (rss_hf & RTE_ETH_RSS_IPV6_EX)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6_EX;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6_EX;
 	if (rss_hf & RTE_ETH_RSS_NONFRAG_IPV6_TCP)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6_TCP;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6_TCP;
 	if (rss_hf & RTE_ETH_RSS_IPV6_TCP_EX)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6_TCP_EX;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6_TCP_EX;
 	if (rss_hf & RTE_ETH_RSS_NONFRAG_IPV4_UDP)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV4_UDP;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV4_UDP;
 	if (rss_hf & RTE_ETH_RSS_NONFRAG_IPV6_UDP)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6_UDP;
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6_UDP;
 	if (rss_hf & RTE_ETH_RSS_IPV6_UDP_EX)
-		mrqc |= IGC_MRQC_RSS_FIELD_IPV6_UDP_EX;
-	IGC_WRITE_REG(hw, IGC_MRQC, mrqc);
+		mrqc |= E1000_MRQC_RSS_FIELD_IPV6_UDP_EX;
+	E1000_WRITE_REG(hw, E1000_MRQC, mrqc);
 }
 
 static void
 igc_rss_configure(struct rte_eth_dev *dev)
 {
 	struct rte_eth_rss_conf rss_conf;
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint16_t i;
 
 	/* Fill in redirection table. */
@@ -833,8 +833,8 @@ igc_rss_configure(struct rte_eth_dev *dev)
 		reta_idx = i % sizeof(reta);
 		reta.bytes[reta_idx] = q_idx;
 		if (reta_idx == sizeof(reta) - 1)
-			IGC_WRITE_REG_LE_VALUE(hw,
-				IGC_RETA(i / sizeof(reta)), reta.dword);
+			E1000_WRITE_REG_LE_VALUE(hw,
+				E1000_RETA(i / sizeof(reta)), reta.dword);
 	}
 
 	/*
@@ -903,7 +903,7 @@ igc_add_rss_filter(struct rte_eth_dev *dev, struct igc_rss_filter *rss)
 		.rss_key_len = rss->conf.key_len,
 		.rss_hf = rss->conf.types,
 	};
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_rss_filter *rss_filter = IGC_DEV_PRIVATE_RSS_FILTER(dev);
 	uint32_t i, j;
 
@@ -950,8 +950,8 @@ igc_add_rss_filter(struct rte_eth_dev *dev, struct igc_rss_filter *rss)
 		reta_idx = i % sizeof(reta);
 		reta.bytes[reta_idx] = q_idx;
 		if (reta_idx == sizeof(reta) - 1)
-			IGC_WRITE_REG_LE_VALUE(hw,
-				IGC_RETA(i / sizeof(reta)), reta.dword);
+			E1000_WRITE_REG_LE_VALUE(hw,
+				E1000_RETA(i / sizeof(reta)), reta.dword);
 	}
 
 	if (rss_conf.rss_key == NULL)
@@ -1008,7 +1008,7 @@ int
 igc_rx_init(struct rte_eth_dev *dev)
 {
 	struct igc_rx_queue *rxq;
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint64_t offloads = dev->data->dev_conf.rxmode.offloads;
 	uint32_t max_rx_pktlen;
 	uint32_t rctl;
@@ -1024,21 +1024,21 @@ igc_rx_init(struct rte_eth_dev *dev)
 	 * Make sure receives are disabled while setting
 	 * up the descriptor ring.
 	 */
-	rctl = IGC_READ_REG(hw, IGC_RCTL);
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl & ~IGC_RCTL_EN);
+	rctl = E1000_READ_REG(hw, E1000_RCTL);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl & ~E1000_RCTL_EN);
 
 	/* Configure support of jumbo frames, if any. */
 	if (dev->data->mtu > RTE_ETHER_MTU)
-		rctl |= IGC_RCTL_LPE;
+		rctl |= E1000_RCTL_LPE;
 	else
-		rctl &= ~IGC_RCTL_LPE;
+		rctl &= ~E1000_RCTL_LPE;
 
 	max_rx_pktlen = dev->data->mtu + IGC_ETH_OVERHEAD;
 	/*
 	 * Set maximum packet length by default, and might be updated
 	 * together with enabling/disabling dual VLAN.
 	 */
-	IGC_WRITE_REG(hw, IGC_RLPML, max_rx_pktlen);
+	E1000_WRITE_REG(hw, E1000_RLPML, max_rx_pktlen);
 
 	/* Configure and enable each RX queue. */
 	rctl_bsize = 0;
@@ -1066,16 +1066,16 @@ igc_rx_init(struct rte_eth_dev *dev)
 				RTE_ETHER_CRC_LEN : 0;
 
 		bus_addr = rxq->rx_ring_phys_addr;
-		IGC_WRITE_REG(hw, IGC_RDLEN(rxq->reg_idx),
+		E1000_WRITE_REG(hw, E1000_RDLEN(rxq->reg_idx),
 				rxq->nb_rx_desc *
-				sizeof(union igc_adv_rx_desc));
-		IGC_WRITE_REG(hw, IGC_RDBAH(rxq->reg_idx),
+				sizeof(union e1000_adv_rx_desc));
+		E1000_WRITE_REG(hw, E1000_RDBAH(rxq->reg_idx),
 				(uint32_t)(bus_addr >> 32));
-		IGC_WRITE_REG(hw, IGC_RDBAL(rxq->reg_idx),
+		E1000_WRITE_REG(hw, E1000_RDBAL(rxq->reg_idx),
 				(uint32_t)bus_addr);
 
 		/* set descriptor configuration */
-		srrctl = IGC_SRRCTL_DESCTYPE_ADV_ONEBUF;
+		srrctl = E1000_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
 		srrctl |= (uint32_t)(RTE_PKTMBUF_HEADROOM / 64) <<
 				IGC_SRRCTL_BSIZEHEADER_SHIFT;
@@ -1093,11 +1093,11 @@ igc_rx_init(struct rte_eth_dev *dev)
 			 * determines the RX packet buffer size.
 			 */
 
-			srrctl |= ((buf_size >> IGC_SRRCTL_BSIZEPKT_SHIFT) &
-				   IGC_SRRCTL_BSIZEPKT_MASK);
+			srrctl |= ((buf_size >> E1000_SRRCTL_BSIZEPKT_SHIFT) &
+				   E1000_SRRCTL_BSIZEPKT_MASK);
 			buf_size = (uint16_t)((srrctl &
-					IGC_SRRCTL_BSIZEPKT_MASK) <<
-					IGC_SRRCTL_BSIZEPKT_SHIFT);
+					E1000_SRRCTL_BSIZEPKT_MASK) <<
+					E1000_SRRCTL_BSIZEPKT_SHIFT);
 
 			/* It adds dual VLAN length for supporting dual VLAN */
 			if (max_rx_pktlen > buf_size)
@@ -1113,19 +1113,19 @@ igc_rx_init(struct rte_eth_dev *dev)
 
 		/* Set if packets are dropped when no descriptors available */
 		if (rxq->drop_en)
-			srrctl |= IGC_SRRCTL_DROP_EN;
+			srrctl |= E1000_SRRCTL_DROP_EN;
 
-		IGC_WRITE_REG(hw, IGC_SRRCTL(rxq->reg_idx), srrctl);
+		E1000_WRITE_REG(hw, E1000_SRRCTL(rxq->reg_idx), srrctl);
 
 		/* Enable this RX queue. */
-		rxdctl = IGC_RXDCTL_QUEUE_ENABLE;
+		rxdctl = E1000_RXDCTL_QUEUE_ENABLE;
 		rxdctl |= ((uint32_t)rxq->pthresh << IGC_RXDCTL_PTHRESH_SHIFT) &
 				IGC_RXDCTL_PTHRESH_MSK;
 		rxdctl |= ((uint32_t)rxq->hthresh << IGC_RXDCTL_HTHRESH_SHIFT) &
 				IGC_RXDCTL_HTHRESH_MSK;
 		rxdctl |= ((uint32_t)rxq->wthresh << IGC_RXDCTL_WTHRESH_SHIFT) &
 				IGC_RXDCTL_WTHRESH_MSK;
-		IGC_WRITE_REG(hw, IGC_RXDCTL(rxq->reg_idx), rxdctl);
+		E1000_WRITE_REG(hw, E1000_RXDCTL(rxq->reg_idx), rxdctl);
 	}
 
 	if (offloads & RTE_ETH_RX_OFFLOAD_SCATTER)
@@ -1141,19 +1141,19 @@ igc_rx_init(struct rte_eth_dev *dev)
 	 * register, since the code above configures the SRRCTL register of
 	 * the RX queue in such a case.
 	 * All configurable sizes are:
-	 * 16384: rctl |= (IGC_RCTL_SZ_16384 | IGC_RCTL_BSEX);
-	 *  8192: rctl |= (IGC_RCTL_SZ_8192  | IGC_RCTL_BSEX);
-	 *  4096: rctl |= (IGC_RCTL_SZ_4096  | IGC_RCTL_BSEX);
-	 *  2048: rctl |= IGC_RCTL_SZ_2048;
-	 *  1024: rctl |= IGC_RCTL_SZ_1024;
-	 *   512: rctl |= IGC_RCTL_SZ_512;
-	 *   256: rctl |= IGC_RCTL_SZ_256;
+	 * 16384: rctl |= (E1000_RCTL_SZ_16384 | E1000_RCTL_BSEX);
+	 *  8192: rctl |= (E1000_RCTL_SZ_8192  | E1000_RCTL_BSEX);
+	 *  4096: rctl |= (E1000_RCTL_SZ_4096  | E1000_RCTL_BSEX);
+	 *  2048: rctl |= E1000_RCTL_SZ_2048;
+	 *  1024: rctl |= E1000_RCTL_SZ_1024;
+	 *   512: rctl |= E1000_RCTL_SZ_512;
+	 *   256: rctl |= E1000_RCTL_SZ_256;
 	 */
 	if (rctl_bsize > 0) {
 		if (rctl_bsize >= 512) /* 512 <= buf_size < 1024 - use 512 */
-			rctl |= IGC_RCTL_SZ_512;
+			rctl |= E1000_RCTL_SZ_512;
 		else /* 256 <= buf_size < 512 - use 256 */
-			rctl |= IGC_RCTL_SZ_256;
+			rctl |= E1000_RCTL_SZ_256;
 	}
 
 	/*
@@ -1162,61 +1162,61 @@ igc_rx_init(struct rte_eth_dev *dev)
 	igc_dev_mq_rx_configure(dev);
 
 	/* Update the rctl since igc_dev_mq_rx_configure may change its value */
-	rctl |= IGC_READ_REG(hw, IGC_RCTL);
+	rctl |= E1000_READ_REG(hw, E1000_RCTL);
 
 	/*
 	 * Setup the Checksum Register.
 	 * Receive Full-Packet Checksum Offload is mutually exclusive with RSS.
 	 */
-	rxcsum = IGC_READ_REG(hw, IGC_RXCSUM);
-	rxcsum |= IGC_RXCSUM_PCSD;
+	rxcsum = E1000_READ_REG(hw, E1000_RXCSUM);
+	rxcsum |= E1000_RXCSUM_PCSD;
 
 	/* Enable both L3/L4 rx checksum offload */
 	if (offloads & RTE_ETH_RX_OFFLOAD_IPV4_CKSUM)
-		rxcsum |= IGC_RXCSUM_IPOFL;
+		rxcsum |= E1000_RXCSUM_IPOFL;
 	else
-		rxcsum &= ~IGC_RXCSUM_IPOFL;
+		rxcsum &= ~E1000_RXCSUM_IPOFL;
 
 	if (offloads &
 		(RTE_ETH_RX_OFFLOAD_TCP_CKSUM | RTE_ETH_RX_OFFLOAD_UDP_CKSUM)) {
-		rxcsum |= IGC_RXCSUM_TUOFL;
+		rxcsum |= E1000_RXCSUM_TUOFL;
 		offloads |= RTE_ETH_RX_OFFLOAD_SCTP_CKSUM;
 	} else {
-		rxcsum &= ~IGC_RXCSUM_TUOFL;
+		rxcsum &= ~E1000_RXCSUM_TUOFL;
 	}
 
 	if (offloads & RTE_ETH_RX_OFFLOAD_SCTP_CKSUM)
-		rxcsum |= IGC_RXCSUM_CRCOFL;
+		rxcsum |= E1000_RXCSUM_CRCOFL;
 	else
-		rxcsum &= ~IGC_RXCSUM_CRCOFL;
+		rxcsum &= ~E1000_RXCSUM_CRCOFL;
 
-	IGC_WRITE_REG(hw, IGC_RXCSUM, rxcsum);
+	E1000_WRITE_REG(hw, E1000_RXCSUM, rxcsum);
 
 	/* Setup the Receive Control Register. */
 	if (offloads & RTE_ETH_RX_OFFLOAD_KEEP_CRC)
-		rctl &= ~IGC_RCTL_SECRC; /* Do not Strip Ethernet CRC. */
+		rctl &= ~E1000_RCTL_SECRC; /* Do not Strip Ethernet CRC. */
 	else
-		rctl |= IGC_RCTL_SECRC; /* Strip Ethernet CRC. */
+		rctl |= E1000_RCTL_SECRC; /* Strip Ethernet CRC. */
 
-	rctl &= ~IGC_RCTL_MO_MSK;
-	rctl &= ~IGC_RCTL_LBM_MSK;
-	rctl |= IGC_RCTL_EN | IGC_RCTL_BAM | IGC_RCTL_LBM_NO |
-			IGC_RCTL_DPF |
-			(hw->mac.mc_filter_type << IGC_RCTL_MO_SHIFT);
+	rctl &= ~E1000_RCTL_MO_MSK;
+	rctl &= ~E1000_RCTL_LBM_MSK;
+	rctl |= E1000_RCTL_EN | E1000_RCTL_BAM | E1000_RCTL_LBM_NO |
+			E1000_RCTL_DPF |
+			(hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT);
 
 	if (dev->data->dev_conf.lpbk_mode == 1)
-		rctl |= IGC_RCTL_LBM_MAC;
+		rctl |= E1000_RCTL_LBM_MAC;
 
-	rctl &= ~(IGC_RCTL_HSEL_MSK | IGC_RCTL_CFIEN | IGC_RCTL_CFI |
-			IGC_RCTL_PSP | IGC_RCTL_PMCF);
+	rctl &= ~(E1000_RCTL_HSEL_MSK | E1000_RCTL_CFIEN | E1000_RCTL_CFI |
+			E1000_RCTL_PSP | E1000_RCTL_PMCF);
 
 	/* Make sure VLAN Filters are off. */
-	rctl &= ~IGC_RCTL_VFE;
+	rctl &= ~E1000_RCTL_VFE;
 	/* Don't store bad packets. */
-	rctl &= ~IGC_RCTL_SBP;
+	rctl &= ~E1000_RCTL_SBP;
 
 	/* Enable Receives. */
-	IGC_WRITE_REG(hw, IGC_RCTL, rctl);
+	E1000_WRITE_REG(hw, E1000_RCTL, rctl);
 
 	/*
 	 * Setup the HW Rx Head and Tail Descriptor Pointers.
@@ -1226,21 +1226,21 @@ igc_rx_init(struct rte_eth_dev *dev)
 		uint32_t dvmolr;
 
 		rxq = dev->data->rx_queues[i];
-		IGC_WRITE_REG(hw, IGC_RDH(rxq->reg_idx), 0);
-		IGC_WRITE_REG(hw, IGC_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);
+		E1000_WRITE_REG(hw, E1000_RDH(rxq->reg_idx), 0);
+		E1000_WRITE_REG(hw, E1000_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);
 
-		dvmolr = IGC_READ_REG(hw, IGC_DVMOLR(rxq->reg_idx));
+		dvmolr = E1000_READ_REG(hw, E1000_DVMOLR(rxq->reg_idx));
 		if (rxq->offloads & RTE_ETH_RX_OFFLOAD_VLAN_STRIP)
-			dvmolr |= IGC_DVMOLR_STRVLAN;
+			dvmolr |= E1000_DVMOLR_STRVLAN;
 		else
-			dvmolr &= ~IGC_DVMOLR_STRVLAN;
+			dvmolr &= ~E1000_DVMOLR_STRVLAN;
 
 		if (offloads & RTE_ETH_RX_OFFLOAD_KEEP_CRC)
-			dvmolr &= ~IGC_DVMOLR_STRCRC;
+			dvmolr &= ~E1000_DVMOLR_STRCRC;
 		else
-			dvmolr |= IGC_DVMOLR_STRCRC;
+			dvmolr |= E1000_DVMOLR_STRCRC;
 
-		IGC_WRITE_REG(hw, IGC_DVMOLR(rxq->reg_idx), dvmolr);
+		E1000_WRITE_REG(hw, E1000_DVMOLR(rxq->reg_idx), dvmolr);
 		dev->data->rx_queue_state[i] = RTE_ETH_QUEUE_STATE_STARTED;
 	}
 
@@ -1250,7 +1250,7 @@ igc_rx_init(struct rte_eth_dev *dev)
 static void
 igc_reset_rx_queue(struct igc_rx_queue *rxq)
 {
-	static const union igc_adv_rx_desc zeroed_desc = { {0} };
+	static const union e1000_adv_rx_desc zeroed_desc = { {0} };
 	unsigned int i;
 
 	/* Zero out HW ring memory */
@@ -1270,7 +1270,7 @@ eth_igc_rx_queue_setup(struct rte_eth_dev *dev,
 			 const struct rte_eth_rxconf *rx_conf,
 			 struct rte_mempool *mp)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	const struct rte_memzone *rz;
 	struct igc_rx_queue *rxq;
 	unsigned int size;
@@ -1317,17 +1317,17 @@ eth_igc_rx_queue_setup(struct rte_eth_dev *dev,
 	 *  handle the maximum ring size is allocated in order to allow for
 	 *  resizing in later calls to the queue setup function.
 	 */
-	size = sizeof(union igc_adv_rx_desc) * IGC_MAX_RXD;
+	size = sizeof(union e1000_adv_rx_desc) * IGC_MAX_RXD;
 	rz = rte_eth_dma_zone_reserve(dev, "rx_ring", queue_idx, size,
 				      IGC_ALIGN, socket_id);
 	if (rz == NULL) {
 		igc_rx_queue_release(rxq);
 		return -ENOMEM;
 	}
-	rxq->rdt_reg_addr = IGC_PCI_REG_ADDR(hw, IGC_RDT(rxq->reg_idx));
-	rxq->rdh_reg_addr = IGC_PCI_REG_ADDR(hw, IGC_RDH(rxq->reg_idx));
+	rxq->rdt_reg_addr = E1000_PCI_REG_ADDR(hw, E1000_RDT(rxq->reg_idx));
+	rxq->rdh_reg_addr = E1000_PCI_REG_ADDR(hw, E1000_RDH(rxq->reg_idx));
 	rxq->rx_ring_phys_addr = rz->iova;
-	rxq->rx_ring = (union igc_adv_rx_desc *)rz->addr;
+	rxq->rx_ring = (union e1000_adv_rx_desc *)rz->addr;
 
 	/* Allocate software ring. */
 	rxq->sw_ring = rte_zmalloc("rxq->sw_ring",
@@ -1457,7 +1457,7 @@ static uint32_t igc_tx_launchtime(uint64_t txtime, uint16_t port_id)
  */
 static inline void
 igc_set_xmit_ctx(struct igc_tx_queue *txq,
-		volatile struct igc_adv_tx_context_desc *ctx_txd,
+		volatile struct e1000_adv_tx_context_desc *ctx_txd,
 		uint64_t ol_flags, union igc_tx_offload tx_offload,
 		uint64_t txtime)
 {
@@ -1475,7 +1475,7 @@ igc_set_xmit_ctx(struct igc_tx_queue *txq,
 	type_tucmd_mlhl = 0;
 
 	/* Specify which HW CTX to upload. */
-	mss_l4len_idx = (ctx_curr << IGC_ADVTXD_IDX_SHIFT);
+	mss_l4len_idx = (ctx_curr << E1000_ADVTXD_IDX_SHIFT);
 
 	if (ol_flags & RTE_MBUF_F_TX_VLAN)
 		tx_offload_mask.vlan_tci = 0xffff;
@@ -1484,51 +1484,51 @@ igc_set_xmit_ctx(struct igc_tx_queue *txq,
 	if (ol_flags & IGC_TX_OFFLOAD_SEG) {
 		/* implies IP cksum in IPv4 */
 		if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
-			type_tucmd_mlhl = IGC_ADVTXD_TUCMD_IPV4 |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+			type_tucmd_mlhl = E1000_ADVTXD_TUCMD_IPV4 |
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 		else
-			type_tucmd_mlhl = IGC_ADVTXD_TUCMD_IPV6 |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+			type_tucmd_mlhl = E1000_ADVTXD_TUCMD_IPV6 |
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 
 		if (ol_flags & RTE_MBUF_F_TX_TCP_SEG)
-			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_TCP;
+			type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP;
 		else
-			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_UDP;
+			type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_UDP;
 
 		tx_offload_mask.data |= TX_TSO_CMP_MASK;
 		mss_l4len_idx |= (uint32_t)tx_offload.tso_segsz <<
-				IGC_ADVTXD_MSS_SHIFT;
+				E1000_ADVTXD_MSS_SHIFT;
 		mss_l4len_idx |= (uint32_t)tx_offload.l4_len <<
-				IGC_ADVTXD_L4LEN_SHIFT;
+				E1000_ADVTXD_L4LEN_SHIFT;
 	} else { /* no TSO, check if hardware checksum is needed */
 		if (ol_flags & (RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK))
 			tx_offload_mask.data |= TX_MACIP_LEN_CMP_MASK;
 
 		if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
-			type_tucmd_mlhl = IGC_ADVTXD_TUCMD_IPV4;
+			type_tucmd_mlhl = E1000_ADVTXD_TUCMD_IPV4;
 
 		switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
 		case RTE_MBUF_F_TX_TCP_CKSUM:
-			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_TCP |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+			type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP |
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 			mss_l4len_idx |= (uint32_t)sizeof(struct rte_tcp_hdr)
-				<< IGC_ADVTXD_L4LEN_SHIFT;
+				<< E1000_ADVTXD_L4LEN_SHIFT;
 			break;
 		case RTE_MBUF_F_TX_UDP_CKSUM:
-			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_UDP |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+			type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_UDP |
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 			mss_l4len_idx |= (uint32_t)sizeof(struct rte_udp_hdr)
-				<< IGC_ADVTXD_L4LEN_SHIFT;
+				<< E1000_ADVTXD_L4LEN_SHIFT;
 			break;
 		case RTE_MBUF_F_TX_SCTP_CKSUM:
-			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_SCTP |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+			type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_SCTP |
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 			mss_l4len_idx |= (uint32_t)sizeof(struct rte_sctp_hdr)
-				<< IGC_ADVTXD_L4LEN_SHIFT;
+				<< E1000_ADVTXD_L4LEN_SHIFT;
 			break;
 		default:
 			type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_RSV |
-				IGC_ADVTXD_DTYP_CTXT | IGC_ADVTXD_DCMD_DEXT;
+				E1000_ADVTXD_DTYP_CTXT | E1000_ADVTXD_DCMD_DEXT;
 			break;
 		}
 	}
@@ -1556,8 +1556,8 @@ static inline uint32_t
 tx_desc_vlan_flags_to_cmdtype(uint64_t ol_flags)
 {
 	uint32_t cmdtype;
-	static uint32_t vlan_cmd[2] = {0, IGC_ADVTXD_DCMD_VLE};
-	static uint32_t tso_cmd[2] = {0, IGC_ADVTXD_DCMD_TSE};
+	static uint32_t vlan_cmd[2] = {0, E1000_ADVTXD_DCMD_VLE};
+	static uint32_t tso_cmd[2] = {0, E1000_ADVTXD_DCMD_TSE};
 	cmdtype = vlan_cmd[(ol_flags & RTE_MBUF_F_TX_VLAN) != 0];
 	cmdtype |= tso_cmd[(ol_flags & IGC_TX_OFFLOAD_SEG) != 0];
 	return cmdtype;
@@ -1582,8 +1582,8 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	struct igc_tx_queue * const txq = tx_queue;
 	struct igc_tx_entry * const sw_ring = txq->sw_ring;
 	struct igc_tx_entry *txe, *txn;
-	volatile union igc_adv_tx_desc * const txr = txq->tx_ring;
-	volatile union igc_adv_tx_desc *txd;
+	volatile union e1000_adv_tx_desc * const txr = txq->tx_ring;
+	volatile union e1000_adv_tx_desc *txd;
 	struct rte_mbuf *tx_pkt;
 	struct rte_mbuf *m_seg;
 	uint64_t buf_dma_addr;
@@ -1691,7 +1691,7 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		/*
 		 * Check that this descriptor is free.
 		 */
-		if (!(txr[tx_end].wb.status & IGC_TXD_STAT_DD)) {
+		if (!(txr[tx_end].wb.status & E1000_TXD_STAT_DD)) {
 			if (nb_tx == 0)
 				return 0;
 			goto end_of_tx;
@@ -1701,43 +1701,43 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		 * Set common flags of all TX Data Descriptors.
 		 *
 		 * The following bits must be set in all Data Descriptors:
-		 *   - IGC_ADVTXD_DTYP_DATA
-		 *   - IGC_ADVTXD_DCMD_DEXT
+		 *   - E1000_ADVTXD_DTYP_DATA
+		 *   - E1000_ADVTXD_DCMD_DEXT
 		 *
 		 * The following bits must be set in the first Data Descriptor
 		 * and are ignored in the other ones:
-		 *   - IGC_ADVTXD_DCMD_IFCS
-		 *   - IGC_ADVTXD_MAC_1588
-		 *   - IGC_ADVTXD_DCMD_VLE
+		 *   - E1000_ADVTXD_DCMD_IFCS
+		 *   - E1000_ADVTXD_MAC_1588
+		 *   - E1000_ADVTXD_DCMD_VLE
 		 *
 		 * The following bits must only be set in the last Data
 		 * Descriptor:
-		 *   - IGC_TXD_CMD_EOP
+		 *   - E1000_TXD_CMD_EOP
 		 *
 		 * The following bits can be set in any Data Descriptor, but
 		 * are only set in the last Data Descriptor:
-		 *   - IGC_TXD_CMD_RS
+		 *   - E1000_TXD_CMD_RS
 		 */
 		cmd_type_len = txq->txd_type |
-			IGC_ADVTXD_DCMD_IFCS | IGC_ADVTXD_DCMD_DEXT;
+			E1000_ADVTXD_DCMD_IFCS | E1000_ADVTXD_DCMD_DEXT;
 		if (tx_ol_req & IGC_TX_OFFLOAD_SEG)
 			pkt_len -= (tx_pkt->l2_len + tx_pkt->l3_len +
 					tx_pkt->l4_len);
-		olinfo_status = (pkt_len << IGC_ADVTXD_PAYLEN_SHIFT);
+		olinfo_status = (pkt_len << E1000_ADVTXD_PAYLEN_SHIFT);
 
 		/*
 		 * Timer 0 should be used to for packet timestamping,
 		 * sample the packet timestamp to reg 0
 		 */
 		if (ol_flags & RTE_MBUF_F_TX_IEEE1588_TMST)
-			cmd_type_len |= IGC_ADVTXD_MAC_TSTAMP;
+			cmd_type_len |= E1000_ADVTXD_MAC_TSTAMP;
 
 		if (tx_ol_req) {
 			/* Setup TX Advanced context descriptor if required */
 			if (new_ctx) {
-				volatile struct igc_adv_tx_context_desc *
+				volatile struct e1000_adv_tx_context_desc *
 					ctx_txd = (volatile struct
-					igc_adv_tx_context_desc *)&txr[tx_id];
+					e1000_adv_tx_context_desc *)&txr[tx_id];
 
 				txn = &sw_ring[txe->next_id];
 				RTE_MBUF_PREFETCH_TO_FREE(txn->mbuf);
@@ -1769,7 +1769,7 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			olinfo_status |=
 				tx_desc_cksum_flags_to_olinfo(tx_ol_req);
 			olinfo_status |= (uint32_t)txq->ctx_curr <<
-					IGC_ADVTXD_IDX_SHIFT;
+					E1000_ADVTXD_IDX_SHIFT;
 		}
 
 		m_seg = tx_pkt;
@@ -1803,7 +1803,7 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		 * and Report Status (RS).
 		 */
 		txd->read.cmd_type_len |=
-			rte_cpu_to_le_32(IGC_TXD_CMD_EOP | IGC_TXD_CMD_RS);
+			rte_cpu_to_le_32(E1000_TXD_CMD_EOP | E1000_TXD_CMD_RS);
 	}
 end_of_tx:
 	rte_wmb();
@@ -1811,7 +1811,7 @@ igc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	/*
 	 * Set the Transmit Descriptor Tail (TDT).
 	 */
-	IGC_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, tx_id);
+	E1000_PCI_REG_WRITE_RELAXED(txq->tdt_reg_addr, tx_id);
 	PMD_TX_LOG(DEBUG, "port_id=%u queue_id=%u tx_tail=%u nb_tx=%u",
 		txq->port_id, txq->queue_id, tx_id, nb_tx);
 	txq->tx_tail = tx_id;
@@ -1833,7 +1833,7 @@ int eth_igc_tx_descriptor_status(void *tx_queue, uint16_t offset)
 		desc -= txq->nb_tx_desc;
 
 	status = &txq->tx_ring[desc].wb.status;
-	if (*status & rte_cpu_to_le_32(IGC_TXD_STAT_DD))
+	if (*status & rte_cpu_to_le_32(E1000_TXD_STAT_DD))
 		return RTE_ETH_TX_DESC_DONE;
 
 	return RTE_ETH_TX_DESC_FULL;
@@ -1887,16 +1887,16 @@ igc_reset_tx_queue(struct igc_tx_queue *txq)
 	/* Initialize ring entries */
 	prev = (uint16_t)(txq->nb_tx_desc - 1);
 	for (i = 0; i < txq->nb_tx_desc; i++) {
-		volatile union igc_adv_tx_desc *txd = &txq->tx_ring[i];
+		volatile union e1000_adv_tx_desc *txd = &txq->tx_ring[i];
 
-		txd->wb.status = IGC_TXD_STAT_DD;
+		txd->wb.status = E1000_TXD_STAT_DD;
 		txe[i].mbuf = NULL;
 		txe[i].last_id = i;
 		txe[prev].next_id = i;
 		prev = i;
 	}
 
-	txq->txd_type = IGC_ADVTXD_DTYP_DATA;
+	txq->txd_type = E1000_ADVTXD_DTYP_DATA;
 	igc_reset_tx_queue_stat(txq);
 }
 
@@ -1935,7 +1935,7 @@ int eth_igc_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 {
 	const struct rte_memzone *tz;
 	struct igc_tx_queue *txq;
-	struct igc_hw *hw;
+	struct e1000_hw *hw;
 	uint32_t size;
 
 	if (nb_desc % IGC_TX_DESCRIPTOR_MULTIPLE != 0 ||
@@ -1980,7 +1980,7 @@ int eth_igc_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	size = sizeof(union igc_adv_tx_desc) * IGC_MAX_TXD;
+	size = sizeof(union e1000_adv_tx_desc) * IGC_MAX_TXD;
 	tz = rte_eth_dma_zone_reserve(dev, "tx_ring", queue_idx, size,
 				      IGC_ALIGN, socket_id);
 	if (tz == NULL) {
@@ -1997,10 +1997,10 @@ int eth_igc_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 	txq->reg_idx = queue_idx;
 	txq->port_id = dev->data->port_id;
 
-	txq->tdt_reg_addr = IGC_PCI_REG_ADDR(hw, IGC_TDT(txq->reg_idx));
+	txq->tdt_reg_addr = E1000_PCI_REG_ADDR(hw, E1000_TDT(txq->reg_idx));
 	txq->tx_ring_phys_addr = tz->iova;
 
-	txq->tx_ring = (union igc_adv_tx_desc *)tz->addr;
+	txq->tx_ring = (union e1000_adv_tx_desc *)tz->addr;
 	/* Allocate software ring */
 	txq->sw_ring = rte_zmalloc("txq->sw_ring",
 				   sizeof(struct igc_tx_entry) * nb_desc,
@@ -2026,7 +2026,7 @@ eth_igc_tx_done_cleanup(void *txqueue, uint32_t free_cnt)
 {
 	struct igc_tx_queue *txq = txqueue;
 	struct igc_tx_entry *sw_ring;
-	volatile union igc_adv_tx_desc *txr;
+	volatile union e1000_adv_tx_desc *txr;
 	uint16_t tx_first; /* First segment analyzed. */
 	uint16_t tx_id;    /* Current segment being processed. */
 	uint16_t tx_last;  /* Last segment in the current packet. */
@@ -2067,7 +2067,7 @@ eth_igc_tx_done_cleanup(void *txqueue, uint32_t free_cnt)
 
 		if (sw_ring[tx_last].mbuf) {
 			if (!(txr[tx_last].wb.status &
-					rte_cpu_to_le_32(IGC_TXD_STAT_DD)))
+					rte_cpu_to_le_32(E1000_TXD_STAT_DD)))
 				break;
 
 			/* Get the start of the next packet. */
@@ -2139,7 +2139,7 @@ eth_igc_tx_done_cleanup(void *txqueue, uint32_t free_cnt)
 void
 igc_tx_init(struct rte_eth_dev *dev)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	uint64_t offloads = dev->data->dev_conf.txmode.offloads;
 	uint32_t tctl;
 	uint32_t txdctl;
@@ -2151,17 +2151,17 @@ igc_tx_init(struct rte_eth_dev *dev)
 		struct igc_tx_queue *txq = dev->data->tx_queues[i];
 		uint64_t bus_addr = txq->tx_ring_phys_addr;
 
-		IGC_WRITE_REG(hw, IGC_TDLEN(txq->reg_idx),
+		E1000_WRITE_REG(hw, E1000_TDLEN(txq->reg_idx),
 				txq->nb_tx_desc *
-				sizeof(union igc_adv_tx_desc));
-		IGC_WRITE_REG(hw, IGC_TDBAH(txq->reg_idx),
+				sizeof(union e1000_adv_tx_desc));
+		E1000_WRITE_REG(hw, E1000_TDBAH(txq->reg_idx),
 				(uint32_t)(bus_addr >> 32));
-		IGC_WRITE_REG(hw, IGC_TDBAL(txq->reg_idx),
+		E1000_WRITE_REG(hw, E1000_TDBAL(txq->reg_idx),
 				(uint32_t)bus_addr);
 
 		/* Setup the HW Tx Head and Tail descriptor pointers. */
-		IGC_WRITE_REG(hw, IGC_TDT(txq->reg_idx), 0);
-		IGC_WRITE_REG(hw, IGC_TDH(txq->reg_idx), 0);
+		E1000_WRITE_REG(hw, E1000_TDT(txq->reg_idx), 0);
+		E1000_WRITE_REG(hw, E1000_TDH(txq->reg_idx), 0);
 
 		/* Setup Transmit threshold registers. */
 		txdctl = ((uint32_t)txq->pthresh << IGC_TXDCTL_PTHRESH_SHIFT) &
@@ -2170,8 +2170,8 @@ igc_tx_init(struct rte_eth_dev *dev)
 				IGC_TXDCTL_HTHRESH_MSK;
 		txdctl |= ((uint32_t)txq->wthresh << IGC_TXDCTL_WTHRESH_SHIFT) &
 				IGC_TXDCTL_WTHRESH_MSK;
-		txdctl |= IGC_TXDCTL_QUEUE_ENABLE;
-		IGC_WRITE_REG(hw, IGC_TXDCTL(txq->reg_idx), txdctl);
+		txdctl |= E1000_TXDCTL_QUEUE_ENABLE;
+		E1000_WRITE_REG(hw, E1000_TXDCTL(txq->reg_idx), txdctl);
 		dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STARTED;
 	}
 
@@ -2185,16 +2185,16 @@ igc_tx_init(struct rte_eth_dev *dev)
 		}
 	}
 
-	igc_config_collision_dist(hw);
+	e1000_config_collision_dist(hw);
 
 	/* Program the Transmit Control Register. */
-	tctl = IGC_READ_REG(hw, IGC_TCTL);
-	tctl &= ~IGC_TCTL_CT;
-	tctl |= (IGC_TCTL_PSP | IGC_TCTL_RTLC | IGC_TCTL_EN |
-		 ((uint32_t)IGC_COLLISION_THRESHOLD << IGC_CT_SHIFT));
+	tctl = E1000_READ_REG(hw, E1000_TCTL);
+	tctl &= ~E1000_TCTL_CT;
+	tctl |= (E1000_TCTL_PSP | E1000_TCTL_RTLC | E1000_TCTL_EN |
+		 ((uint32_t)E1000_COLLISION_THRESHOLD << E1000_CT_SHIFT));
 
 	/* This write will effectively turn on the transmit unit. */
-	IGC_WRITE_REG(hw, IGC_TCTL, tctl);
+	E1000_WRITE_REG(hw, E1000_TCTL, tctl);
 }
 
 void
@@ -2237,7 +2237,7 @@ void
 eth_igc_vlan_strip_queue_set(struct rte_eth_dev *dev,
 			uint16_t rx_queue_id, int on)
 {
-	struct igc_hw *hw = IGC_DEV_PRIVATE_HW(dev);
+	struct e1000_hw *hw = IGC_DEV_PRIVATE_HW(dev);
 	struct igc_rx_queue *rxq = dev->data->rx_queues[rx_queue_id];
 	uint32_t reg_val;
 
@@ -2247,14 +2247,14 @@ eth_igc_vlan_strip_queue_set(struct rte_eth_dev *dev,
 		return;
 	}
 
-	reg_val = IGC_READ_REG(hw, IGC_DVMOLR(rx_queue_id));
+	reg_val = E1000_READ_REG(hw, E1000_DVMOLR(rx_queue_id));
 	if (on) {
-		reg_val |= IGC_DVMOLR_STRVLAN;
+		reg_val |= E1000_DVMOLR_STRVLAN;
 		rxq->offloads |= RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
 	} else {
-		reg_val &= ~(IGC_DVMOLR_STRVLAN | IGC_DVMOLR_HIDVLAN);
+		reg_val &= ~(E1000_DVMOLR_STRVLAN | E1000_DVMOLR_HIDVLAN);
 		rxq->offloads &= ~RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
 	}
 
-	IGC_WRITE_REG(hw, IGC_DVMOLR(rx_queue_id), reg_val);
+	E1000_WRITE_REG(hw, E1000_DVMOLR(rx_queue_id), reg_val);
 }
diff --git a/drivers/net/intel/igc/igc_txrx.h b/drivers/net/intel/e1000/igc_txrx.h
similarity index 97%
rename from drivers/net/intel/igc/igc_txrx.h
rename to drivers/net/intel/e1000/igc_txrx.h
index ad7d3b4ca5..1e63ddb5aa 100644
--- a/drivers/net/intel/igc/igc_txrx.h
+++ b/drivers/net/intel/e1000/igc_txrx.h
@@ -23,7 +23,7 @@ struct igc_rx_entry {
  */
 struct igc_rx_queue {
 	struct rte_mempool  *mb_pool;   /**< mbuf pool to populate RX ring. */
-	volatile union igc_adv_rx_desc *rx_ring;
+	volatile union e1000_adv_rx_desc *rx_ring;
 	/**< RX ring virtual address. */
 	uint64_t            rx_ring_phys_addr; /**< RX ring DMA address. */
 	volatile uint32_t   *rdt_reg_addr; /**< RDT register address. */
@@ -107,7 +107,7 @@ struct igc_tx_entry {
  * Structure associated with each TX queue.
  */
 struct igc_tx_queue {
-	volatile union igc_adv_tx_desc *tx_ring; /**< TX ring address */
+	volatile union e1000_adv_tx_desc *tx_ring; /**< TX ring address */
 	uint64_t               tx_ring_phys_addr; /**< TX ring DMA address. */
 	struct igc_tx_entry    *sw_ring; /**< virtual address of SW ring. */
 	volatile uint32_t      *tdt_reg_addr; /**< Address of TDT register. */
@@ -156,7 +156,7 @@ int igc_rx_init(struct rte_eth_dev *dev);
 void igc_tx_init(struct rte_eth_dev *dev);
 void igc_rss_disable(struct rte_eth_dev *dev);
 void
-igc_hw_rss_hash_set(struct igc_hw *hw, struct rte_eth_rss_conf *rss_conf);
+igc_hw_rss_hash_set(struct e1000_hw *hw, struct rte_eth_rss_conf *rss_conf);
 int igc_del_rss_filter(struct rte_eth_dev *dev);
 void igc_rss_conf_set(struct igc_rss_filter *out,
 		const struct rte_flow_action_rss *rss);
diff --git a/drivers/net/intel/e1000/meson.build b/drivers/net/intel/e1000/meson.build
index 296ec25f2c..cd42c0042a 100644
--- a/drivers/net/intel/e1000/meson.build
+++ b/drivers/net/intel/e1000/meson.build
@@ -14,4 +14,15 @@ sources = files(
         'igb_rxtx.c',
 )
 
+# do not build IGC on Windows
+if not is_windows
+        sources += files(
+                'igc_ethdev.c',
+                'igc_logs.c',
+                'igc_filter.c',
+                'igc_flow.c',
+                'igc_txrx.c',
+        )
+endif
+
 includes += include_directories('base')
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index bbcd8b229a..cdc3e7e664 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -31,7 +31,6 @@ drivers = [
         'intel/iavf',
         'intel/ice',
         'intel/idpf',
-        'intel/igc',
         'intel/ipn3ke',
         'intel/ixgbe',
         'ionic',
-- 
2.43.5


^ permalink raw reply	[relevance 1%]

* [PATCH v4] net: add thread-safe crc api
  2025-02-06 20:38  4%   ` [PATCH v3] " Arkadiusz Kusztal
@ 2025-02-07  6:37  4%     ` Arkadiusz Kusztal
  2025-02-07 18:24  4%       ` [PATCH v5] " Arkadiusz Kusztal
  0 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2025-02-07  6:37 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, kai.ji, brian.dooley, Arkadiusz Kusztal

The current net CRC API is not thread-safe. This patch solves
this by adding new, thread-safe API functions.
This API is also safe to use across multiple processes, with one
limitation on max-simd-bitwidth: it is checked only by the process
that created the CRC context. All other processes (that did not
create the context) will use the highest SIMD extension that was
built into the binary, but no higher than the one requested by the
CRC context.

Since the change of the API at this point is an ABI break,
these API symbols are versioned with the _26 suffix.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
v2:
- added multi-process safety
v3:
- made the crc context opaque
- versioned old APIs
v4:
- exported rte_net_crc_free symbol

 app/test/test_crc.c                    | 169 ++++++++++---------------
 doc/guides/rel_notes/release_25_03.rst |   3 +
 drivers/crypto/qat/qat_sym.h           |   6 +-
 drivers/crypto/qat/qat_sym_session.c   |   8 ++
 drivers/crypto/qat/qat_sym_session.h   |   2 +
 lib/net/meson.build                    |   2 +
 lib/net/net_crc.h                      |  18 ++-
 lib/net/rte_net_crc.c                  | 130 ++++++++++++++++++-
 lib/net/rte_net_crc.h                  |  39 ++++--
 lib/net/version.map                    |   6 +
 10 files changed, 266 insertions(+), 117 deletions(-)

diff --git a/app/test/test_crc.c b/app/test/test_crc.c
index b85fca35fe..d7a11e8025 100644
--- a/app/test/test_crc.c
+++ b/app/test/test_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include "test.h"
@@ -44,131 +44,100 @@ static const uint32_t crc32_vec_res = 0xb491aab4;
 static const uint32_t crc32_vec1_res = 0xac54d294;
 static const uint32_t crc32_vec2_res = 0xefaae02f;
 static const uint32_t crc16_vec_res = 0x6bec;
-static const uint16_t crc16_vec1_res = 0x8cdd;
-static const uint16_t crc16_vec2_res = 0xec5b;
+static const uint32_t crc16_vec1_res = 0x8cdd;
+static const uint32_t crc16_vec2_res = 0xec5b;
 
 static int
-crc_calc(const uint8_t *vec,
-	uint32_t vec_len,
-	enum rte_net_crc_type type)
+crc_all_algs(const char *desc, enum rte_net_crc_type type,
+	const uint8_t *data, int data_len, uint32_t res)
 {
-	/* compute CRC */
-	uint32_t ret = rte_net_crc_calc(vec, vec_len, type);
+	struct rte_net_crc *ctx;
+	uint32_t crc;
+	int ret = TEST_SUCCESS;
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SCALAR, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SCALAR\n", desc);
+		debug_hexdump(stdout, "SCALAR", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SSE42, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SSE42\n", desc);
+		debug_hexdump(stdout, "SSE", &crc, 4);
+		ret = TEST_FAILED;
+	}
 
-	/* dump data on console */
-	debug_hexdump(stdout, NULL, vec, vec_len);
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_AVX512, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s AVX512\n", desc);
+		debug_hexdump(stdout, "AVX512", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_NEON, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s NEON\n", desc);
+		debug_hexdump(stdout, "NEON", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
 
-	return  ret;
+	return ret;
 }
 
 static int
-test_crc_calc(void)
-{
+crc_autotest(void)
+{	uint8_t *test_data;
 	uint32_t i;
-	enum rte_net_crc_type type;
-	uint8_t *test_data;
-	uint32_t result;
-	int error;
+	int ret = TEST_SUCCESS;
 
 	/* 32-bit ethernet CRC: Test 1 */
-	type = RTE_NET_CRC32_ETH;
-
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc32_vec_res)
-		return -1;
+	ret = crc_all_algs("32-bit ethernet CRC: Test 1", RTE_NET_CRC32_ETH, crc_vec,
+		sizeof(crc_vec), crc32_vec_res);
 
 	/* 32-bit ethernet CRC: Test 2 */
 	test_data = rte_zmalloc(NULL, CRC32_VEC_LEN1, 0);
 	if (test_data == NULL)
 		return -7;
-
 	for (i = 0; i < CRC32_VEC_LEN1; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN1, type);
-	if (result != crc32_vec1_res) {
-		error = -2;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 2", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN1, crc32_vec1_res);
 
 	/* 32-bit ethernet CRC: Test 3 */
+	memset(test_data, 0, CRC32_VEC_LEN1);
 	for (i = 0; i < CRC32_VEC_LEN2; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN2, type);
-	if (result != crc32_vec2_res) {
-		error = -3;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 3", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN2, crc32_vec2_res);
 
 	/* 16-bit CCITT CRC:  Test 4 */
-	type = RTE_NET_CRC16_CCITT;
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc16_vec_res) {
-		error = -4;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 5 */
-	result = crc_calc(crc16_vec1, CRC16_VEC_LEN1, type);
-	if (result != crc16_vec1_res) {
-		error = -5;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 6 */
-	result = crc_calc(crc16_vec2, CRC16_VEC_LEN2, type);
-	if (result != crc16_vec2_res) {
-		error = -6;
-		goto fail;
-	}
-
-	rte_free(test_data);
-	return 0;
-
-fail:
-	rte_free(test_data);
-	return error;
-}
-
-static int
-test_crc(void)
-{
-	int ret;
-	/* set CRC scalar mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SCALAR);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (scalar): failed (%d)\n", ret);
-		return ret;
-	}
-	/* set CRC sse4.2 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SSE42);
+	crc_all_algs("16-bit CCITT CRC:  Test 4", RTE_NET_CRC16_CCITT, crc_vec,
+		sizeof(crc_vec), crc16_vec_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (x86_64_SSE4.2): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC avx512 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_AVX512);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (x86_64 AVX512): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC neon mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_NEON);
+	/* 16-bit CCITT CRC:  Test 5 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 5", RTE_NET_CRC16_CCITT, crc16_vec1,
+		CRC16_VEC_LEN1, crc16_vec1_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (arm64 neon pmull): failed (%d)\n", ret);
-		return ret;
-	}
+	/* 16-bit CCITT CRC:  Test 6 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 6", RTE_NET_CRC16_CCITT, crc16_vec2,
+		CRC16_VEC_LEN2, crc16_vec2_res);
 
-	return 0;
+	return ret;
 }
 
-REGISTER_FAST_TEST(crc_autotest, true, true, test_crc);
+REGISTER_FAST_TEST(crc_autotest, true, true, crc_autotest);
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 269ab6f68a..cdce459a87 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -132,6 +132,9 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* net: A thread/process-safe API was introduced. The old and new APIs share the
+  same function names, but the old ones are versioned. The replaced functions are
+  ``rte_net_crc_calc`` and ``rte_net_crc_set_alg``; ``rte_net_crc_free`` is new.
 
 ABI Changes
 -----------
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f42336d7ed..849e047615 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -267,8 +267,7 @@ qat_crc_verify(struct qat_sym_session *ctx, struct rte_crypto_op *op)
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				crc_data_ofs);
 
-		crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 
 		if (crc != *(uint32_t *)(crc_data + crc_data_len))
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
@@ -291,8 +290,7 @@ qat_crc_generate(struct qat_sym_session *ctx,
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				sym_op->auth.data.offset);
 		crc = (uint32_t *)(crc_data + crc_data_len);
-		*crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		*crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 	}
 }
 
diff --git a/drivers/crypto/qat/qat_sym_session.c b/drivers/crypto/qat/qat_sym_session.c
index 50d687fd37..7200022adf 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -3174,6 +3174,14 @@ qat_sec_session_set_docsis_parameters(struct rte_cryptodev *dev,
 		ret = qat_sym_session_configure_crc(dev, xform, session);
 		if (ret < 0)
 			return ret;
+	} else {
+		/* Initialize crc algorithm */
+		session->crc = rte_net_crc_set_alg(RTE_NET_CRC_AVX512,
+			RTE_NET_CRC32_ETH);
+		if (session->crc == NULL) {
+			QAT_LOG(ERR, "Cannot initialize CRC context");
+			return -1;
+		}
 	}
 	qat_sym_session_finalize(session);
 
diff --git a/drivers/crypto/qat/qat_sym_session.h b/drivers/crypto/qat/qat_sym_session.h
index 2ca6c8ddf5..2ef2066646 100644
--- a/drivers/crypto/qat/qat_sym_session.h
+++ b/drivers/crypto/qat/qat_sym_session.h
@@ -7,6 +7,7 @@
 #include <rte_crypto.h>
 #include <cryptodev_pmd.h>
 #include <rte_security.h>
+#include <rte_net_crc.h>
 
 #include "qat_common.h"
 #include "icp_qat_hw.h"
@@ -149,6 +150,7 @@ struct qat_sym_session {
 	uint8_t is_zuc256;
 	uint8_t is_wireless;
 	uint32_t slice_types;
+	struct rte_net_crc *crc;
 	enum qat_sym_proto_flag qat_proto_flag;
 	qat_sym_build_request_t build_request[2];
 #ifndef RTE_QAT_OPENSSL
diff --git a/lib/net/meson.build b/lib/net/meson.build
index 8afcc4ed37..b26b377e8e 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2020 Intel Corporation
 
+use_function_versioning=true
+
 headers = files(
         'rte_cksum.h',
         'rte_ip.h',
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 7a74d5406c..563ea809a9 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -1,10 +1,26 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020-2025 Intel Corporation
  */
 
 #ifndef _NET_CRC_H_
 #define _NET_CRC_H_
 
+#include "rte_net_crc.h"
+
+void
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
+
+struct rte_net_crc *
+rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v25(const void *data,
+	uint32_t data_len, enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 346c285c15..74e3a5bcce 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include <stddef.h>
@@ -10,6 +10,8 @@
 #include <rte_net_crc.h>
 #include <rte_log.h>
 #include <rte_vect.h>
+#include <rte_function_versioning.h>
+#include <rte_malloc.h>
 
 #include "net_crc.h"
 
@@ -38,11 +40,21 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
 typedef uint32_t
 (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
 
+struct rte_net_crc {
+	enum rte_net_crc_alg alg;
+	enum rte_net_crc_type type;
+};
+
 static rte_net_crc_handler handlers_default[] = {
 	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
 	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
 };
 
+static struct
+{
+	rte_net_crc_handler f[RTE_NET_CRC_REQS];
+} handlers_dpdk26[RTE_NET_CRC_AVX512 + 1];
+
 static const rte_net_crc_handler *handlers = handlers_default;
 
 static const rte_net_crc_handler handlers_scalar[] = {
@@ -286,10 +298,56 @@ rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
 	return handlers[RTE_NET_CRC32_ETH](data, data_len);
 }
 
+static void
+handlers_init(enum rte_net_crc_alg alg)
+{
+	handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler;
+	handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_avx512_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_avx512_handler;
+			break;
+		}
+#endif
+	/* fall-through */
+	case RTE_NET_CRC_SSE42:
+#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+		if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_sse42_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_sse42_handler;
+		}
+#endif
+		break;
+	case RTE_NET_CRC_NEON:
+#ifdef CC_ARM64_NEON_PMULL_SUPPORT
+		if (NEON_PMULL_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_neon_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_neon_handler;
+			break;
+		}
+#endif
+	/* fall-through */
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+}
+
 /* Public API */
 
 void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg)
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -316,9 +374,59 @@ rte_net_crc_set_alg(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
+VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+
+struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type)
+{
+	uint16_t max_simd_bitwidth;
+	struct rte_net_crc *crc;
+
+	crc = rte_zmalloc(NULL, sizeof(struct rte_net_crc), 0);
+	if (crc == NULL)
+		return NULL;
+	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
+	crc->type = type;
+	crc->alg = RTE_NET_CRC_SCALAR;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_512) {
+			crc->alg = RTE_NET_CRC_AVX512;
+			return crc;
+		}
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_SSE42;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_NEON:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_NEON;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+	return crc;
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
+MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
+	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
+	rte_net_crc_set_alg_v26);
+
+void rte_net_crc_free(struct rte_net_crc *crc)
+{
+	rte_free(crc);
+}
 
 uint32_t
-rte_net_crc_calc(const void *data,
+rte_net_crc_calc_v25(const void *data,
 	uint32_t data_len,
 	enum rte_net_crc_type type)
 {
@@ -330,6 +438,18 @@ rte_net_crc_calc(const void *data,
 
 	return ret;
 }
+VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len)
+{
+	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
+MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len),
+	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
@@ -338,4 +458,8 @@ RTE_INIT(rte_net_crc_init)
 	sse42_pclmulqdq_init();
 	avx512_vpclmulqdq_init();
 	neon_pmull_init();
+	handlers_init(RTE_NET_CRC_SCALAR);
+	handlers_init(RTE_NET_CRC_NEON);
+	handlers_init(RTE_NET_CRC_SSE42);
+	handlers_init(RTE_NET_CRC_AVX512);
 }
diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h
index 72d3e10ff6..ffac8c2f1f 100644
--- a/lib/net/rte_net_crc.h
+++ b/lib/net/rte_net_crc.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #ifndef _RTE_NET_CRC_H_
@@ -26,8 +26,11 @@ enum rte_net_crc_alg {
 	RTE_NET_CRC_AVX512,
 };
 
+/** CRC context (algorithm, type) */
+struct rte_net_crc;
+
 /**
- * This API set the CRC computation algorithm (i.e. scalar version,
+ * This API set the CRC context (i.e. scalar version,
  * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
  * structure.
  *
@@ -37,27 +40,45 @@ enum rte_net_crc_alg {
  *   - RTE_NET_CRC_SSE42 (Use 64-bit SSE4.2 intrinsic)
  *   - RTE_NET_CRC_NEON (Use ARM Neon intrinsic)
  *   - RTE_NET_CRC_AVX512 (Use 512-bit AVX intrinsic)
+ * @param type
+ *   CRC type (enum rte_net_crc_type)
+ *
+ * @return
+ *   Pointer to the CRC context
  */
-void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg);
+struct rte_net_crc *
+rte_net_crc_set_alg(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
 
 /**
  * CRC compute API
  *
+ * Note:
+ * The command line argument --force-max-simd-bitwidth will be ignored
+ * by processes that have not created this CRC context.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
  * @param data
  *   Pointer to the packet data for CRC computation
  * @param data_len
  *   Data length for CRC computation
- * @param type
- *   CRC type (enum rte_net_crc_type)
  *
  * @return
  *   CRC value
  */
 uint32_t
-rte_net_crc_calc(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type);
+rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
+/**
+ * Frees the memory space pointed to by the CRC context pointer.
+ * If the pointer is NULL, the function does nothing.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
+ */
+void
+rte_net_crc_free(struct rte_net_crc *crc);
 
 #ifdef __cplusplus
 }
diff --git a/lib/net/version.map b/lib/net/version.map
index bec4ce23ea..d03f3f6ad0 100644
--- a/lib/net/version.map
+++ b/lib/net/version.map
@@ -5,6 +5,7 @@ DPDK_25 {
 	rte_ether_format_addr;
 	rte_ether_unformat_addr;
 	rte_net_crc_calc;
+	rte_net_crc_free;
 	rte_net_crc_set_alg;
 	rte_net_get_ptype;
 	rte_net_make_rarp_packet;
@@ -12,3 +13,8 @@ DPDK_25 {
 
 	local: *;
 };
+
+DPDK_26 {
+	rte_net_crc_calc;
+	rte_net_crc_set_alg;
+} DPDK_25;
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* [PATCH v8] graph: mcore: optimize graph search
  2025-02-06  2:53 11%           ` [PATCH v7 1/1] " Huichao Cai
  2025-02-06 20:10  0%             ` Patrick Robb
@ 2025-02-07  1:39 11%             ` Huichao Cai
  2025-02-22  6:59  0%               ` [EXTERNAL] " Kiran Kumar Kokkilagadda
  1 sibling, 1 reply; 200+ results
From: Huichao Cai @ 2025-02-07  1:39 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue()
searches for the graph with a slow list walk on every call. Modify
the search logic to record the result of the first search and use
this record for subsequent searches, improving search speed.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 devtools/libabigail.abignore               |  5 +++++
 doc/guides/rel_notes/release_25_03.rst     |  1 +
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..8876aaee2e 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+        name = rte_node
+        has_size_change = no
+        has_data_member_inserted_between =
+{offset_after(original_process), offset_of(xstat_off)}
\ No newline at end of file
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 269ab6f68a..16a888fd19 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -150,6 +150,7 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* graph: Added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index d3ec88519d..aef0f65673 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 
-- 
2.33.0
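The hunk above replaces a per-call SLIST walk with a graph pointer cached in the node, refreshed only when the cached entry misses. The same pattern can be sketched standalone (simplified stand-in types and hypothetical names, not the DPDK code itself):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for rte_graph / rte_node (illustrative only). */
struct graph {
	unsigned int lcore_id;
	struct graph *next; /* singly linked run-queue, like the SLIST in the patch */
};

struct node {
	unsigned int lcore_id;
	struct graph *cached_graph; /* plays the role of node->dispatch.graph */
};

static unsigned long walks; /* counts full list traversals, to show the caching effect */

static struct graph *
find_graph(struct node *n, struct graph *rq_head)
{
	struct graph *g = n->cached_graph;

	/* Fast path: reuse the cached graph if it still matches the lcore. */
	if (g == NULL || g->lcore_id != n->lcore_id) {
		walks++;
		for (g = rq_head; g != NULL; g = g->next)
			if (g->lcore_id == n->lcore_id)
				break;
		n->cached_graph = g;
	}
	return g;
}
```

Repeated calls for the same node hit the cache and skip the traversal entirely, which is the point of the optimization.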


^ permalink raw reply	[relevance 11%]

* RE: [PATCH v2 1/3] net: add thread-safe crc api
  @ 2025-02-06 20:54  0%         ` Kusztal, ArkadiuszX
  0 siblings, 0 replies; 200+ results
From: Kusztal, ArkadiuszX @ 2025-02-06 20:54 UTC (permalink / raw)
  To: Ferruh Yigit, Marchand, David; +Cc: dev, Ji, Kai, Dooley, Brian



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, October 9, 2024 3:03 AM
> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand, David
> <david.marchand@redhat.com>
> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
> <brian.dooley@intel.com>
> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
> 
> On 10/8/2024 9:51 PM, Kusztal, ArkadiuszX wrote:
> > Hi Ferruh,
> > Thanks for the review, comments inline,
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >> Sent: Tuesday, October 8, 2024 5:43 AM
> >> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand,
> >> David <david.marchand@redhat.com>
> >> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
> >> <brian.dooley@intel.com>
> >> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
> >>
> >> On 10/1/2024 7:11 PM, Arkadiusz Kusztal wrote:
> >>> The current net CRC API is not thread-safe, this patch solves this
> >>> by adding another, thread-safe API functions.
> >>> This API is also safe to use across multiple processes, yet with
> >>> limitations on max-simd-bitwidth, which will be checked only by the
> >>> process that created the CRC context; all other processes will use
> >>> the same CRC function when used with the same CRC context.
> >>> It is an undefined behavior when process binaries are compiled with
> >>> different SIMD capabilities when the same CRC context is used.
> >>>
> >>> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> >>
> >> <...>
> >>
> >>> +static struct
> >>> +{
> >>> +	uint32_t (*f[RTE_NET_CRC_REQS])
> >>> +		(const uint8_t *data, uint32_t data_len);
> >>>
> >>
> >> It increases readability to typedef function pointers.
> >
> > Agree, though this typedef would be used here only, that’s why I left it out.
> > But I can add it then.
> >
> >>
> >>
> >>> +} handlers[RTE_NET_CRC_AVX512 + 1];
> >>>
> >>> -/**
> >>> - * Reflect the bits about the middle
> >>> - *
> >>> - * @param val
> >>> - *   value to be reflected
> >>> - *
> >>> - * @return
> >>> - *   reflected value
> >>> - */
> >>> -static uint32_t
> >>> +static inline uint32_t
> >>>
> >>
> >> Does changing to 'inline' required, as function is static compiler
> >> can do the same.
> >
> > True, though it may be more readable sometimes.
> > Of course there is no way that in O3 these functions would not be inlined by
> the compiler, regardless if the inline hint is present or not.
> >
> >>
> >>>  reflect_32bits(uint32_t val)
> >>>  {
> >>>  	uint32_t i, res = 0;
> >>> @@ -99,26 +43,7 @@ reflect_32bits(uint32_t val)
> >>>  	return res;
> >>>  }
> >>>
> >>> -static void
> >>> -crc32_eth_init_lut(uint32_t poly,
> >>> -	uint32_t *lut)
> >>> -{
> >>> -	uint32_t i, j;
> >>> -
> >>> -	for (i = 0; i < CRC_LUT_SIZE; i++) {
> >>> -		uint32_t crc = reflect_32bits(i);
> >>> -
> >>> -		for (j = 0; j < 8; j++) {
> >>> -			if (crc & 0x80000000L)
> >>> -				crc = (crc << 1) ^ poly;
> >>> -			else
> >>> -				crc <<= 1;
> >>> -		}
> >>> -		lut[i] = reflect_32bits(crc);
> >>> -	}
> >>> -}
> >>> -
> >>> -static __rte_always_inline uint32_t
> >>> +static inline uint32_t
> >>>
> >>
> >> Why not forcing inline anymore?
> >> Are these inline changes related to the thread-safety?
> >
> > O3 will inline it anyway, and with always_inline it will be inline even in debug
> mode. I just see no reason forcing it upon the compiler.
> >
> >>
> >>>  crc32_eth_calc_lut(const uint8_t *data,
> >>>  	uint32_t data_len,
> >>>  	uint32_t crc,
> >>> @@ -130,20 +55,9 @@ crc32_eth_calc_lut(const uint8_t *data,
> >>>  	return crc;
> >>>  }
> >>>
> >>> -static void
> >>> -rte_net_crc_scalar_init(void)
> >>> -{
> >>> -	/* 32-bit crc init */
> >>> -	crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
> >>> -
> >>> -	/* 16-bit CRC init */
> >>> -	crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
> >>> -}
> >>> -
> >>>  static inline uint32_t
> >>> -rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len)
> >>> +crc16_ccitt(const uint8_t *data, uint32_t data_len)
> >>>  {
> >>> -	/* return 16-bit CRC value */
> >>>
> >>
> >> Why not keep comments? Are they wrong?
> >
> > Functions names are very self-explanatory, that’s why I dropped comments. I
> can add comments if needed.
> >
> 
> I am for restricting changes to the target of the patch which is making CRC
> calculation thread safe, unless code changes invalidates the comments, lets
> keep them. Same goes with inline related modifications.
> 
> >>
> >> <...>
> >>
> >>> +static void
> >>> +crc_scalar_init(void)
> >>> +{
> >>> +	crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
> >>> +	crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
> >>> +
> >>> +	handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC16_CCITT] =
> >> crc16_ccitt;
> >>> +	handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC32_ETH] = crc32_eth;
> >>>
> >>
> >> +1 to remove global handlers pointer and add context,
> >>
> >> But current handlers array content is static, it can be set when
> >> defined, instead of functions.
> >
> > Can do it for scalars, but for SIMD there is this runtime check like this:
> > 	if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) { So compiled binary on
> AVX512
> > machine could filter it out on the machine which does not support it.
> > There is no NULL check in crc function so it would not change much when
> called -> Invalid address vs Invalid instruction.
> > There are not many checks there, as this is CRC after all, it should be as small as
> possible, yet probably NULL check could be advisable in crc function then.
> >
> 
> There is already AVX512_VPCLMULQDQ_CPU_SUPPORTED etc checks in
> 'rte_net_crc_set()'.
> So does it work to update 'handlers' statically, without condition, but have
> conditions when use them.

The problem here is that when some of the SIMD extensions are not built into the binary, the corresponding field in the handlers array would be empty. But since the context may be shared across processes, another process could then dereference a NULL pointer. In the new patch, I am setting the highest possible compatible SIMD implementation in the handlers array.
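The fallback described here — pick the highest SIMD implementation actually built into the binary, bounded by what the context requested, with scalar always available — can be sketched as follows (hypothetical helper and ordering, not the patch code):

```c
#include <assert.h>

/* Illustrative algorithm IDs, ordered by capability for this sketch. */
enum alg { ALG_SCALAR, ALG_SSE42, ALG_NEON, ALG_AVX512 };

/* Pick the best implementation no higher than the requested one,
 * skipping entries that were not compiled in (available[] == 0).
 * Scalar is always built, so the loop can never fall off the end. */
static enum alg
pick_alg(enum alg requested, const int available[ALG_AVX512 + 1])
{
	int a;

	for (a = requested; a > ALG_SCALAR; a--)
		if (available[a])
			return (enum alg)a;
	return ALG_SCALAR;
}
```

With this shape, a process that lacks the AVX512 code path silently degrades to the best implementation it does have, instead of hitting a NULL function pointer.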
> 
> >
> >>
> >> <...>
> >>
> >>> -static uint32_t
> >>> -rte_crc32_eth_default_handler(const uint8_t *data, uint32_t
> >>> data_len)
> >>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
> >>> +	enum rte_net_crc_type type)
> >>>  {
> >>> -	handlers = NULL;
> >>> -	if (max_simd_bitwidth == 0)
> >>> -		max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>> -
> >>> -	handlers = avx512_vpclmulqdq_get_handlers();
> >>> -	if (handlers != NULL)
> >>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> -	handlers = sse42_pclmulqdq_get_handlers();
> >>> -	if (handlers != NULL)
> >>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> -	handlers = neon_pmull_get_handlers();
> >>> -	if (handlers != NULL)
> >>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> -	handlers = handlers_scalar;
> >>> -	return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> -}
> >>> +	uint16_t max_simd_bitwidth;
> >>>
> >>> -/* Public API */
> >>> -
> >>> -void
> >>> -rte_net_crc_set_alg(enum rte_net_crc_alg alg) -{
> >>> -	handlers = NULL;
> >>> -	if (max_simd_bitwidth == 0)
> >>> -		max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>> +	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>>
> >>>  	switch (alg) {
> >>>  	case RTE_NET_CRC_AVX512:
> >>> -		handlers = avx512_vpclmulqdq_get_handlers();
> >>> -		if (handlers != NULL)
> >>> -			break;
> >>> +#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
> >>> +		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED &&
> >>> +				max_simd_bitwidth >= RTE_VECT_SIMD_512) {
> >>> +			return (struct rte_net_crc){ RTE_NET_CRC_AVX512,
> >> type };
> >>> +		}
> >>> +#endif
> >>>  		/* fall-through */
> >>>  	case RTE_NET_CRC_SSE42:
> >>> -		handlers = sse42_pclmulqdq_get_handlers();
> >>> -		break; /* for x86, always break here */
> >>> +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
> >>> +		if (SSE42_PCLMULQDQ_CPU_SUPPORTED &&
> >>> +				max_simd_bitwidth >= RTE_VECT_SIMD_128) {
> >>> +			return (struct rte_net_crc){ RTE_NET_CRC_SSE42, type
> >> };
> >>> +		}
> >>> +#endif
> >>> +		break;
> >>>  	case RTE_NET_CRC_NEON:
> >>> -		handlers = neon_pmull_get_handlers();
> >>> -		/* fall-through */
> >>> -	case RTE_NET_CRC_SCALAR:
> >>> -		/* fall-through */
> >>> +#ifdef CC_ARM64_NEON_PMULL_SUPPORT
> >>> +		if (NEON_PMULL_CPU_SUPPORTED &&
> >>> +				max_simd_bitwidth >= RTE_VECT_SIMD_128) {
> >>> +			return (struct rte_net_crc){ RTE_NET_CRC_NEON, type
> >> };
> >>> +		}
> >>> +#endif
> >>>
> >>
> >> Is it more readable as following, up to you:
> >
> > Agree, I will change it.
> >
> >>
> >> ```
> >> rte_net_crc_set(alg, type) {
> >>   enum rte_net_crc_alg new_alg = RTE_NET_CRC_SCALAR;
> >>   switch (alg) {
> >>   case AVX512:
> >>     new_alg = ..
> >>   case NEON:
> >>     new_alg = ..
> >>   }
> >>   return struct rte_net_crc){ new_alg, type }; ```
> >>
> >>
> >>
> >>
> >>> +		break;
> >>>  	default:
> >>>  		break;
> >>>  	}
> >>> -
> >>> -	if (handlers == NULL)
> >>> -		handlers = handlers_scalar;
> >>> +	return (struct rte_net_crc){ RTE_NET_CRC_SCALAR, type };
> >>>  }
> >>>
> >>> -uint32_t
> >>> -rte_net_crc_calc(const void *data,
> >>> -	uint32_t data_len,
> >>> -	enum rte_net_crc_type type)
> >>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> >>> +	const void *data, const uint32_t data_len)
> >>>  {
> >>> -	uint32_t ret;
> >>> -	rte_net_crc_handler f_handle;
> >>> -
> >>> -	f_handle = handlers[type];
> >>> -	ret = f_handle(data, data_len);
> >>> -
> >>> -	return ret;
> >>> +	return handlers[ctx->alg].f[ctx->type](data, data_len);
> >>>
> >>
> >> 'rte_net_crc()' gets input from user and "struct rte_net_crc" is not
> >> opaque, so user can provide invalid input, ctx->alg & ctx->type.
> >> To protect against it input values should be checked before using.
> >>
> >> Or I think user not need to know the details of the "struct
> >> rte_net_crc", so it can be an opaque variable for user.

Yes, I have changed it to opaque. It is a better approach.
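The opaque-context pattern agreed on here — the caller holds only a pointer, while allocation and layout stay inside the library — can be sketched with a minimal CRC-32 (Ethernet, reflected, polynomial 0xEDB88320). All names here are hypothetical, not the final DPDK API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* In a real split, only this forward declaration would appear in the
 * public header; the definition below would live in the .c file. */
struct crc_ctx {
	uint32_t lut[256]; /* per-context lookup table */
};

static struct crc_ctx *
crc_ctx_create(void)
{
	struct crc_ctx *ctx = malloc(sizeof(*ctx));
	uint32_t i, j, crc;

	if (ctx == NULL)
		return NULL;
	/* Build the reflected CRC-32 table for poly 0xEDB88320. */
	for (i = 0; i < 256; i++) {
		crc = i;
		for (j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
		ctx->lut[i] = crc;
	}
	return ctx;
}

static uint32_t
crc_calc(const struct crc_ctx *ctx, const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--)
		crc = (crc >> 8) ^ ctx->lut[(crc ^ *p++) & 0xFF];
	return crc ^ 0xFFFFFFFFu;
}

static void
crc_ctx_free(struct crc_ctx *ctx)
{
	free(ctx);
}
```

The cost of opacity is exactly the allocation/free pair discussed above, but callers can no longer pass a context with invalid fields, so the hot-path calculation needs no input validation.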

> >
> > I would love it to be opaque, but then I would have to return a pointer, which
> then would involve some allocations/deallocations and I wanted to keep it as
> simple as possible.
> > So probably the checks would be a way to go.
> >
> 
> True, +1 for simplicity.
> 
> >>
> >> <...>
> >>
> >>> -/**
> >>> - * CRC compute API
> >>> - *
> >>> - * @param data
> >>> - *   Pointer to the packet data for CRC computation
> >>> - * @param data_len
> >>> - *   Data length for CRC computation
> >>> - * @param type
> >>> - *   CRC type (enum rte_net_crc_type)
> >>> - *
> >>> - * @return
> >>> - *   CRC value
> >>> - */
> >>> -uint32_t
> >>> -rte_net_crc_calc(const void *data,
> >>> -	uint32_t data_len,
> >>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
> >>>  	enum rte_net_crc_type type);
> >>>
> >>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> >>> +	const void *data, const uint32_t data_len);
> >>> +
> >>>
> >>
> >> As these are APIs, can you please add doxygen comments to them?
> > +1
> > I think this change could be deferred to 25.03.
> > Adding this API without removing the old one should be possible without any
> unwanted consequences?
> >
> 
> This is not a new functionality but replacement of existing one, so it will be
> confusing to have two set of APIs for same functionality with similar names:
> rte_net_crc_calc()    and   rte_net_crc()
> rte_net_crc_set_alg() and   rte_net_crc_set()
> 
> Also there are some internal functions used by these APIs and supporting both
> new and old may force to have two version of these internal functions and it will
> create complexity/noise.
> 
> As this release is ABI break release, it is easier to update APIs (although
> deprecation notice is missing for this change).
> 
> 
> As an alternative option, do you think applying ABI versioning in 25.03 works for
> these APIs?
> If so, old version can be cleaned in v25.11.
> 
> 
> > I still have some second thoughts about this max-simd-width. DPDK does
> > not impose any restrictions on this parameter in the multi-process usage, there
> may be some room to alter some things there.
> >
> >>
> >>>  #ifdef __cplusplus
> >>>  }
> >>>  #endif
> >>> diff --git a/lib/net/version.map b/lib/net/version.map index
> >>> bec4ce23ea..47daf1464a 100644
> >>> --- a/lib/net/version.map
> >>> +++ b/lib/net/version.map
> >>> @@ -4,11 +4,25 @@ DPDK_25 {
> >>>  	rte_eth_random_addr;
> >>>  	rte_ether_format_addr;
> >>>  	rte_ether_unformat_addr;
> >>> -	rte_net_crc_calc;
> >>> -	rte_net_crc_set_alg;
> >>>  	rte_net_get_ptype;
> >>>  	rte_net_make_rarp_packet;
> >>>  	rte_net_skip_ip6_ext;
> >>> +	rte_net_crc;
> >>> +	rte_net_crc_set;
> >>>
> >>>  	local: *;
> >>>  };
> >>> +
> >>> +INTERNAL {
> >>> +	global:
> >>> +
> >>> +	rte_net_crc_sse42_init;
> >>> +	rte_crc16_ccitt_sse42_handler;
> >>> +	rte_crc32_eth_sse42_handler;
> >>> +	rte_net_crc_avx512_init;
> >>> +	rte_crc16_ccitt_avx512_handler;
> >>> +	rte_crc32_eth_avx512_handler;
> >>> +	rte_net_crc_neon_init;
> >>> +	rte_crc16_ccitt_neon_handler;
> >>> +	rte_crc32_eth_neon_handler;
> >>> +};
> >>>
> >>
> >> +1 to David's comment, these are used only within component, no need
> >> +to
> >> export.
> >>
> >


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v22 00/27] remove use of VLAs for Windows
  @ 2025-02-06 20:44  4%   ` David Marchand
  2025-02-07 14:23  3%     ` Konstantin Ananyev
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2025-02-06 20:44 UTC (permalink / raw)
  To: Andre Muezerie, konstantin.ananyev; +Cc: dev, thomas

On Thu, Feb 6, 2025 at 2:33 AM Andre Muezerie
<andremue@linux.microsoft.com> wrote:
>
> As per guidance technical board meeting 2024/04/17. This series
> removes the use of VLAs from code built for Windows for all 3
> toolchains. If there are additional opportunities to convert VLAs
> to regular C arrays please provide the details for incorporation
> into the series.
>
> MSVC does not support VLAs, replace VLAs with standard C arrays
> or alloca(). alloca() is available for all toolchain/platform
> combinations officially supported by DPDK.

- I have one concern wrt patch 7.
This changes the API/ABI of the RCU library.
ABI can't be broken in the 25.03 release.

Since MSVC builds do not include RCU yet, I skipped this change and
adjusted this library meson.build.

Konstantin, do you think patch 7 could be rewritten to make use of
alloca() and avoid an API change?
https://patchwork.dpdk.org/project/dpdk/patch/1738805610-17507-8-git-send-email-andremue@linux.microsoft.com/


- There is also some VLA in examples/l2fwd-cat, so I had to adjust
this example meson.build accordingly.

Series applied, thanks André.


-- 
David Marchand
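The conversion described in the cover letter — VLAs replaced with fixed C arrays or alloca() — looks roughly like this (illustrative only, not one of the actual patches; `<alloca.h>` is the glibc header, other toolchains spell it differently):

```c
#include <alloca.h>
#include <assert.h>
#include <stdint.h>

/* Before (a VLA, rejected by MSVC):
 *	uint32_t ids[n];
 * After: allocate on the stack with alloca(), which all toolchain/platform
 * combinations supported by DPDK provide. As with a VLA, n must stay
 * small and bounded, since the allocation lives on the stack. */
static uint32_t
sum_ids(unsigned int n)
{
	uint32_t *ids = alloca(n * sizeof(*ids));
	uint32_t sum = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		ids[i] = i;
	for (i = 0; i < n; i++)
		sum += ids[i];
	return sum;
}
```

Unlike a heap allocation, alloca() memory is released automatically on function return, so no error path or free() is needed — which is why it is a near drop-in replacement for a VLA.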


^ permalink raw reply	[relevance 4%]

* RE: [PATCH v2 1/3] net: add thread-safe crc api
  2024-12-02 22:36  3%   ` Stephen Hemminger
@ 2025-02-06 20:43  0%     ` Kusztal, ArkadiuszX
  0 siblings, 0 replies; 200+ results
From: Kusztal, ArkadiuszX @ 2025-02-06 20:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, ferruh.yigit, Ji, Kai, Dooley, Brian



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, December 2, 2024 11:36 PM
> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>
> Cc: dev@dpdk.org; ferruh.yigit@amd.com; Ji, Kai <kai.ji@intel.com>; Dooley,
> Brian <brian.dooley@intel.com>
> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
> 
> On Tue,  1 Oct 2024 19:11:48 +0100
> Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com> wrote:
> 
> > The current net CRC API is not thread-safe, this patch solves this by
> > adding another, thread-safe API functions.
> 
> Couldn't the old API be made threadsafe with TLS?
> 
> > This API is also safe to use across multiple processes, yet with
> > limitations on max-simd-bitwidth, which will be checked only by the
> > process that created the CRC context; all other processes will use the
> > same CRC function when used with the same CRC context.
> > It is an undefined behavior when process binaries are compiled with
> > different SIMD capabilities when the same CRC context is used.
> >
> > Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> 
> The API/ABI can't change for 25.03, do you want to support both?

I sent a new patch with versioned symbols. If this is not accepted, it can potentially be moved to the .11 release.

> Or wait until 25.11?

^ permalink raw reply	[relevance 0%]

* [PATCH v3] net: add thread-safe crc api
      2024-12-02 22:36  3%   ` Stephen Hemminger
@ 2025-02-06 20:38  4%   ` Arkadiusz Kusztal
  2025-02-07  6:37  4%     ` [PATCH v4] " Arkadiusz Kusztal
  2 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2025-02-06 20:38 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, kai.ji, brian.dooley, Arkadiusz Kusztal

The current net CRC API is not thread-safe; this patch
solves this by adding new, thread-safe API functions.
This API is also safe to use across multiple processes,
yet with limitations on max-simd-bitwidth, which will be checked only by
the process that created the CRC context; all other processes
(that did not create the context) will use the highest possible
SIMD extension that was built with the binary, but no higher than the one
requested by the CRC context.

Since the change of the API at this point is an ABI break,
these API symbols are versioned with the _26 suffix.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
v2:
- added multi-process safety
v3:
- made the crc context opaque
- versioned old APIs

 app/test/test_crc.c                    | 169 ++++++++++---------------
 doc/guides/rel_notes/release_25_03.rst |   3 +
 drivers/crypto/qat/qat_sym.h           |   6 +-
 drivers/crypto/qat/qat_sym_session.c   |   8 ++
 drivers/crypto/qat/qat_sym_session.h   |   2 +
 lib/net/meson.build                    |   2 +
 lib/net/net_crc.h                      |  18 ++-
 lib/net/rte_net_crc.c                  | 130 ++++++++++++++++++-
 lib/net/rte_net_crc.h                  |  39 ++++--
 lib/net/version.map                    |   5 +
 10 files changed, 265 insertions(+), 117 deletions(-)

diff --git a/app/test/test_crc.c b/app/test/test_crc.c
index b85fca35fe..d7a11e8025 100644
--- a/app/test/test_crc.c
+++ b/app/test/test_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include "test.h"
@@ -44,131 +44,100 @@ static const uint32_t crc32_vec_res = 0xb491aab4;
 static const uint32_t crc32_vec1_res = 0xac54d294;
 static const uint32_t crc32_vec2_res = 0xefaae02f;
 static const uint32_t crc16_vec_res = 0x6bec;
-static const uint16_t crc16_vec1_res = 0x8cdd;
-static const uint16_t crc16_vec2_res = 0xec5b;
+static const uint32_t crc16_vec1_res = 0x8cdd;
+static const uint32_t crc16_vec2_res = 0xec5b;
 
 static int
-crc_calc(const uint8_t *vec,
-	uint32_t vec_len,
-	enum rte_net_crc_type type)
+crc_all_algs(const char *desc, enum rte_net_crc_type type,
+	const uint8_t *data, int data_len, uint32_t res)
 {
-	/* compute CRC */
-	uint32_t ret = rte_net_crc_calc(vec, vec_len, type);
+	struct rte_net_crc *ctx;
+	uint32_t crc;
+	int ret = TEST_SUCCESS;
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SCALAR, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SCALAR\n", desc);
+		debug_hexdump(stdout, "SCALAR", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_SSE42, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s SSE42\n", desc);
+		debug_hexdump(stdout, "SSE", &crc, 4);
+		ret = TEST_FAILED;
+	}
 
-	/* dump data on console */
-	debug_hexdump(stdout, NULL, vec, vec_len);
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_AVX512, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s AVX512\n", desc);
+		debug_hexdump(stdout, "AVX512", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
+
+	ctx = rte_net_crc_set_alg(RTE_NET_CRC_NEON, type);
+	TEST_ASSERT_NOT_NULL(ctx, "cannot allocate the CRC context");
+	crc = rte_net_crc_calc(ctx, data, data_len);
+	if (crc != res) {
+		RTE_LOG(ERR, USER1, "TEST FAILED: %s NEON\n", desc);
+		debug_hexdump(stdout, "NEON", &crc, 4);
+		ret = TEST_FAILED;
+	}
+	rte_net_crc_free(ctx);
 
-	return  ret;
+	return ret;
 }
 
 static int
-test_crc_calc(void)
-{
+crc_autotest(void)
+{	uint8_t *test_data;
 	uint32_t i;
-	enum rte_net_crc_type type;
-	uint8_t *test_data;
-	uint32_t result;
-	int error;
+	int ret = TEST_SUCCESS;
 
 	/* 32-bit ethernet CRC: Test 1 */
-	type = RTE_NET_CRC32_ETH;
-
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc32_vec_res)
-		return -1;
+	ret = crc_all_algs("32-bit ethernet CRC: Test 1", RTE_NET_CRC32_ETH, crc_vec,
+		sizeof(crc_vec), crc32_vec_res);
 
 	/* 32-bit ethernet CRC: Test 2 */
 	test_data = rte_zmalloc(NULL, CRC32_VEC_LEN1, 0);
 	if (test_data == NULL)
 		return -7;
-
 	for (i = 0; i < CRC32_VEC_LEN1; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN1, type);
-	if (result != crc32_vec1_res) {
-		error = -2;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 2", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN1, crc32_vec1_res);
 
 	/* 32-bit ethernet CRC: Test 3 */
+	memset(test_data, 0, CRC32_VEC_LEN1);
 	for (i = 0; i < CRC32_VEC_LEN2; i += 12)
 		rte_memcpy(&test_data[i], crc32_vec1, 12);
-
-	result = crc_calc(test_data, CRC32_VEC_LEN2, type);
-	if (result != crc32_vec2_res) {
-		error = -3;
-		goto fail;
-	}
+	ret |= crc_all_algs("32-bit ethernet CRC: Test 3", RTE_NET_CRC32_ETH, test_data,
+		CRC32_VEC_LEN2, crc32_vec2_res);
 
 	/* 16-bit CCITT CRC:  Test 4 */
-	type = RTE_NET_CRC16_CCITT;
-	result = crc_calc(crc_vec, CRC_VEC_LEN, type);
-	if (result != crc16_vec_res) {
-		error = -4;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 5 */
-	result = crc_calc(crc16_vec1, CRC16_VEC_LEN1, type);
-	if (result != crc16_vec1_res) {
-		error = -5;
-		goto fail;
-	}
-	/* 16-bit CCITT CRC:  Test 6 */
-	result = crc_calc(crc16_vec2, CRC16_VEC_LEN2, type);
-	if (result != crc16_vec2_res) {
-		error = -6;
-		goto fail;
-	}
-
-	rte_free(test_data);
-	return 0;
-
-fail:
-	rte_free(test_data);
-	return error;
-}
-
-static int
-test_crc(void)
-{
-	int ret;
-	/* set CRC scalar mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SCALAR);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (scalar): failed (%d)\n", ret);
-		return ret;
-	}
-	/* set CRC sse4.2 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_SSE42);
+	crc_all_algs("16-bit CCITT CRC:  Test 4", RTE_NET_CRC16_CCITT, crc_vec,
+		sizeof(crc_vec), crc16_vec_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test_crc (x86_64_SSE4.2): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC avx512 mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_AVX512);
-
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (x86_64 AVX512): failed (%d)\n", ret);
-		return ret;
-	}
-
-	/* set CRC neon mode */
-	rte_net_crc_set_alg(RTE_NET_CRC_NEON);
+	/* 16-bit CCITT CRC:  Test 5 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 5", RTE_NET_CRC16_CCITT, crc16_vec1,
+		CRC16_VEC_LEN1, crc16_vec1_res);
 
-	ret = test_crc_calc();
-	if (ret < 0) {
-		printf("test crc (arm64 neon pmull): failed (%d)\n", ret);
-		return ret;
-	}
+	/* 16-bit CCITT CRC:  Test 6 */
+	ret |= crc_all_algs("16-bit CCITT CRC:  Test 6", RTE_NET_CRC16_CCITT, crc16_vec2,
+		CRC16_VEC_LEN2, crc16_vec2_res);
 
-	return 0;
+	return ret;
 }
 
-REGISTER_FAST_TEST(crc_autotest, true, true, test_crc);
+REGISTER_FAST_TEST(crc_autotest, true, true, crc_autotest);
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 269ab6f68a..cdce459a87 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -132,6 +132,9 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* net: A thread/process-safe API was introduced. Old and new APIs share the same
+  function names, but the old one is versioned. Replaced functions are:
+  ``rte_net_crc_calc`` and ``rte_net_crc_set_alg``. The new one is ``rte_net_crc_free``.
 
 ABI Changes
 -----------
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f42336d7ed..849e047615 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -267,8 +267,7 @@ qat_crc_verify(struct qat_sym_session *ctx, struct rte_crypto_op *op)
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				crc_data_ofs);
 
-		crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 
 		if (crc != *(uint32_t *)(crc_data + crc_data_len))
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
@@ -291,8 +290,7 @@ qat_crc_generate(struct qat_sym_session *ctx,
 		crc_data = rte_pktmbuf_mtod_offset(sym_op->m_src, uint8_t *,
 				sym_op->auth.data.offset);
 		crc = (uint32_t *)(crc_data + crc_data_len);
-		*crc = rte_net_crc_calc(crc_data, crc_data_len,
-				RTE_NET_CRC32_ETH);
+		*crc = rte_net_crc_calc(ctx->crc, crc_data, crc_data_len);
 	}
 }
 
diff --git a/drivers/crypto/qat/qat_sym_session.c b/drivers/crypto/qat/qat_sym_session.c
index 50d687fd37..7200022adf 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -3174,6 +3174,14 @@ qat_sec_session_set_docsis_parameters(struct rte_cryptodev *dev,
 		ret = qat_sym_session_configure_crc(dev, xform, session);
 		if (ret < 0)
 			return ret;
+	} else {
+		/* Initialize crc algorithm */
+		session->crc = rte_net_crc_set_alg(RTE_NET_CRC_AVX512,
+			RTE_NET_CRC32_ETH);
+		if (session->crc == NULL) {
+			QAT_LOG(ERR, "Cannot initialize CRC context");
+			return -1;
+		}
 	}
 	qat_sym_session_finalize(session);
 
diff --git a/drivers/crypto/qat/qat_sym_session.h b/drivers/crypto/qat/qat_sym_session.h
index 2ca6c8ddf5..2ef2066646 100644
--- a/drivers/crypto/qat/qat_sym_session.h
+++ b/drivers/crypto/qat/qat_sym_session.h
@@ -7,6 +7,7 @@
 #include <rte_crypto.h>
 #include <cryptodev_pmd.h>
 #include <rte_security.h>
+#include <rte_net_crc.h>
 
 #include "qat_common.h"
 #include "icp_qat_hw.h"
@@ -149,6 +150,7 @@ struct qat_sym_session {
 	uint8_t is_zuc256;
 	uint8_t is_wireless;
 	uint32_t slice_types;
+	struct rte_net_crc *crc;
 	enum qat_sym_proto_flag qat_proto_flag;
 	qat_sym_build_request_t build_request[2];
 #ifndef RTE_QAT_OPENSSL
diff --git a/lib/net/meson.build b/lib/net/meson.build
index 8afcc4ed37..b26b377e8e 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017-2020 Intel Corporation
 
+use_function_versioning=true
+
 headers = files(
         'rte_cksum.h',
         'rte_ip.h',
diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h
index 7a74d5406c..563ea809a9 100644
--- a/lib/net/net_crc.h
+++ b/lib/net/net_crc.h
@@ -1,10 +1,26 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020-2025 Intel Corporation
  */
 
 #ifndef _NET_CRC_H_
 #define _NET_CRC_H_
 
+#include "rte_net_crc.h"
+
+void
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg);
+
+struct rte_net_crc *
+rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v25(const void *data,
+	uint32_t data_len, enum rte_net_crc_type type);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
 /*
  * Different implementations of CRC
  */
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 346c285c15..74e3a5bcce 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #include <stddef.h>
@@ -10,6 +10,8 @@
 #include <rte_net_crc.h>
 #include <rte_log.h>
 #include <rte_vect.h>
+#include <rte_function_versioning.h>
+#include <rte_malloc.h>
 
 #include "net_crc.h"
 
@@ -38,11 +40,21 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t data_len);
 typedef uint32_t
 (*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
 
+struct rte_net_crc {
+	enum rte_net_crc_alg alg;
+	enum rte_net_crc_type type;
+};
+
 static rte_net_crc_handler handlers_default[] = {
 	[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
 	[RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler,
 };
 
+static struct
+{
+	rte_net_crc_handler f[RTE_NET_CRC_REQS];
+} handlers_dpdk26[RTE_NET_CRC_AVX512 + 1];
+
 static const rte_net_crc_handler *handlers = handlers_default;
 
 static const rte_net_crc_handler handlers_scalar[] = {
@@ -286,10 +298,56 @@ rte_crc32_eth_default_handler(const uint8_t *data, uint32_t data_len)
 	return handlers[RTE_NET_CRC32_ETH](data, data_len);
 }
 
+static void
+handlers_init(enum rte_net_crc_alg alg)
+{
+	handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler;
+	handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_avx512_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_avx512_handler;
+			break;
+		}
+#endif
+	/* fall-through */
+	case RTE_NET_CRC_SSE42:
+#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+		if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_sse42_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_sse42_handler;
+		}
+#endif
+		break;
+	case RTE_NET_CRC_NEON:
+#ifdef CC_ARM64_NEON_PMULL_SUPPORT
+		if (NEON_PMULL_CPU_SUPPORTED) {
+			handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
+				rte_crc16_ccitt_neon_handler;
+			handlers_dpdk26[alg].f[RTE_NET_CRC32_ETH] =
+				rte_crc32_eth_neon_handler;
+			break;
+		}
+#endif
+	/* fall-through */
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+}
+
 /* Public API */
 
 void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg)
+rte_net_crc_set_alg_v25(enum rte_net_crc_alg alg)
 {
 	handlers = NULL;
 	if (max_simd_bitwidth == 0)
@@ -316,9 +374,59 @@ rte_net_crc_set_alg(enum rte_net_crc_alg alg)
 	if (handlers == NULL)
 		handlers = handlers_scalar;
 }
+VERSION_SYMBOL(rte_net_crc_set_alg, _v25, 25);
+
+struct rte_net_crc *rte_net_crc_set_alg_v26(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type)
+{
+	uint16_t max_simd_bitwidth;
+	struct rte_net_crc *crc;
+
+	crc = rte_zmalloc(NULL, sizeof(struct rte_net_crc), 0);
+	if (crc == NULL)
+		return NULL;
+	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
+	crc->type = type;
+	crc->alg = RTE_NET_CRC_SCALAR;
+
+	switch (alg) {
+	case RTE_NET_CRC_AVX512:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_512) {
+			crc->alg = RTE_NET_CRC_AVX512;
+			return crc;
+		}
+		/* fall-through */
+	case RTE_NET_CRC_SSE42:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_SSE42;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_NEON:
+		if (max_simd_bitwidth >= RTE_VECT_SIMD_128) {
+			crc->alg = RTE_NET_CRC_NEON;
+			return crc;
+		}
+		break;
+	case RTE_NET_CRC_SCALAR:
+		/* fall-through */
+	default:
+		break;
+	}
+	return crc;
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_set_alg, _v26, 26);
+MAP_STATIC_SYMBOL(struct rte_net_crc *rte_net_crc_set_alg(
+	enum rte_net_crc_alg alg, enum rte_net_crc_type type),
+	rte_net_crc_set_alg_v26);
+
+void rte_net_crc_free(struct rte_net_crc *crc)
+{
+	rte_free(crc);
+}
 
 uint32_t
-rte_net_crc_calc(const void *data,
+rte_net_crc_calc_v25(const void *data,
 	uint32_t data_len,
 	enum rte_net_crc_type type)
 {
@@ -330,6 +438,18 @@ rte_net_crc_calc(const void *data,
 
 	return ret;
 }
+VERSION_SYMBOL(rte_net_crc_calc, _v25, 25);
+
+uint32_t
+rte_net_crc_calc_v26(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len)
+{
+	return handlers_dpdk26[ctx->alg].f[ctx->type](data, data_len);
+}
+BIND_DEFAULT_SYMBOL(rte_net_crc_calc, _v26, 26);
+MAP_STATIC_SYMBOL(uint32_t rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len),
+	rte_net_crc_calc_v26);
 
 /* Call initialisation helpers for all crc algorithm handlers */
 RTE_INIT(rte_net_crc_init)
@@ -338,4 +458,8 @@ RTE_INIT(rte_net_crc_init)
 	sse42_pclmulqdq_init();
 	avx512_vpclmulqdq_init();
 	neon_pmull_init();
+	handlers_init(RTE_NET_CRC_SCALAR);
+	handlers_init(RTE_NET_CRC_NEON);
+	handlers_init(RTE_NET_CRC_SSE42);
+	handlers_init(RTE_NET_CRC_AVX512);
 }
diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h
index 72d3e10ff6..ffac8c2f1f 100644
--- a/lib/net/rte_net_crc.h
+++ b/lib/net/rte_net_crc.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2017-2020 Intel Corporation
+ * Copyright(c) 2017-2025 Intel Corporation
  */
 
 #ifndef _RTE_NET_CRC_H_
@@ -26,8 +26,11 @@ enum rte_net_crc_alg {
 	RTE_NET_CRC_AVX512,
 };
 
+/** CRC context (algorithm, type) */
+struct rte_net_crc;
+
 /**
- * This API set the CRC computation algorithm (i.e. scalar version,
+ * This API sets the CRC context (i.e. scalar version,
  * x86 64-bit sse4.2 intrinsic version, etc.) and internal data
  * structure.
  *
@@ -37,27 +40,45 @@ enum rte_net_crc_alg {
  *   - RTE_NET_CRC_SSE42 (Use 64-bit SSE4.2 intrinsic)
  *   - RTE_NET_CRC_NEON (Use ARM Neon intrinsic)
  *   - RTE_NET_CRC_AVX512 (Use 512-bit AVX intrinsic)
+ * @param type
+ *   CRC type (enum rte_net_crc_type)
+ *
+ * @return
+ *   Pointer to the CRC context
  */
-void
-rte_net_crc_set_alg(enum rte_net_crc_alg alg);
+struct rte_net_crc *
+rte_net_crc_set_alg(enum rte_net_crc_alg alg,
+	enum rte_net_crc_type type);
 
 /**
  * CRC compute API
  *
+ * Note:
+ * The command line argument --force-max-simd-bitwidth will be ignored
+ * by processes that have not created this CRC context.
+ *
+ * @param ctx
+ *   Pointer to the CRC context
  * @param data
  *   Pointer to the packet data for CRC computation
  * @param data_len
  *   Data length for CRC computation
- * @param type
- *   CRC type (enum rte_net_crc_type)
  *
  * @return
  *   CRC value
  */
 uint32_t
-rte_net_crc_calc(const void *data,
-	uint32_t data_len,
-	enum rte_net_crc_type type);
+rte_net_crc_calc(const struct rte_net_crc *ctx,
+	const void *data, const uint32_t data_len);
+/**
+ * Frees the memory space pointed to by the CRC context pointer.
+ * If the pointer is NULL, the function does nothing.
+ *
+ * @param crc
+ *   Pointer to the CRC context
+ */
+void
+rte_net_crc_free(struct rte_net_crc *crc);
 
 #ifdef __cplusplus
 }
diff --git a/lib/net/version.map b/lib/net/version.map
index bec4ce23ea..7b7f9227fa 100644
--- a/lib/net/version.map
+++ b/lib/net/version.map
@@ -12,3 +12,8 @@ DPDK_25 {
 
 	local: *;
 };
+
+DPDK_26 {
+	rte_net_crc_calc;
+	rte_net_crc_set_alg;
+} DPDK_25;
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* Re: [PATCH v7 1/1] graph: mcore: optimize graph search
  2025-02-06  2:53 11%           ` [PATCH v7 1/1] " Huichao Cai
@ 2025-02-06 20:10  0%             ` Patrick Robb
  2025-02-07  1:39 11%             ` [PATCH v8] " Huichao Cai
  1 sibling, 0 replies; 200+ results
From: Patrick Robb @ 2025-02-06 20:10 UTC (permalink / raw)
  To: Huichao Cai; +Cc: dev

[-- Attachment #1: Type: text/plain, Size: 3698 bytes --]

Recheck-request: iol-intel-Performance

Triggering a retest due to testbed instability yesterday.

On Wed, Feb 5, 2025 at 9:53 PM Huichao Cai <chcchc88@163.com> wrote:

> In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
> a slow loop is used to search for the graph. Modify the search
> logic to record the result of the first search, and use this record
> for subsequent searches to improve search speed.
>
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
>  devtools/libabigail.abignore               |  5 +++++
>  doc/guides/rel_notes/release_25_03.rst     |  1 +
>  lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
>  lib/graph/rte_graph_worker_common.h        |  1 +
>  4 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 21b8cd6113..8876aaee2e 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -33,3 +33,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +[suppress_type]
> +        name = rte_node
> +        has_size_change = no
> +        has_data_member_inserted_between =
> +{offset_after(original_process), offset_of(xstat_off)}
> \ No newline at end of file
> diff --git a/doc/guides/rel_notes/release_25_03.rst
> b/doc/guides/rel_notes/release_25_03.rst
> index 269ab6f68a..16a888fd19 100644
> --- a/doc/guides/rel_notes/release_25_03.rst
> +++ b/doc/guides/rel_notes/release_25_03.rst
> @@ -150,6 +150,7 @@ ABI Changes
>
>  * No ABI change that would break compatibility with 24.11.
>
> +* graph: Added ``graph`` field to the ``dispatch`` structure in the
> ``rte_node`` structure.
>
>  Known Issues
>  ------------
> diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c
> b/lib/graph/rte_graph_model_mcore_dispatch.c
> index a590fc9497..a81d338227 100644
> --- a/lib/graph/rte_graph_model_mcore_dispatch.c
> +++ b/lib/graph/rte_graph_model_mcore_dispatch.c
> @@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct
> rte_node *node,
>                                               struct rte_graph_rq_head *rq)
>  {
>         const unsigned int lcore_id = node->dispatch.lcore_id;
> -       struct rte_graph *graph;
> +       struct rte_graph *graph = node->dispatch.graph;
>
> -       SLIST_FOREACH(graph, rq, next)
> -               if (graph->dispatch.lcore_id == lcore_id)
> -                       break;
> +       if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
> +               SLIST_FOREACH(graph, rq, next)
> +                       if (graph->dispatch.lcore_id == lcore_id)
> +                               break;
> +               node->dispatch.graph = graph;
> +       }
>
>         return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false;
>  }
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index d3ec88519d..aef0f65673 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>                         unsigned int lcore_id;  /**< Node running lcore. */
>                         uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>                         uint64_t total_sched_fail; /**< Number of
> scheduled failure. */
> +                       struct rte_graph *graph;  /**< Graph corresponding
> to lcore_id. */
>                 } dispatch;
>         };
>
> --
> 2.33.0
>
>

[-- Attachment #2: Type: text/html, Size: 4459 bytes --]

^ permalink raw reply	[relevance 0%]

* Re: [v2 3/4] crypto/virtio: add vhost backend to virtio_user
  2025-01-07 18:44  1% ` [v2 3/4] crypto/virtio: add vhost backend to virtio_user Gowrishankar Muthukrishnan
@ 2025-02-06 13:14  0%   ` Maxime Coquelin
  0 siblings, 0 replies; 200+ results
From: Maxime Coquelin @ 2025-02-06 13:14 UTC (permalink / raw)
  To: Gowrishankar Muthukrishnan, dev, Akhil Goyal, Chenbo Xia,
	Fan Zhang, Jay Zhou
  Cc: jerinj, anoobj, David Marchand



On 1/7/25 7:44 PM, Gowrishankar Muthukrishnan wrote:
> Add vhost backend to virtio_user crypto.
> 
> Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
> ---
>   drivers/crypto/virtio/meson.build             |   7 +
>   drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
>   drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
>   drivers/crypto/virtio/virtio_pci.h            |   7 +
>   drivers/crypto/virtio/virtio_ring.h           |   6 -
>   .../crypto/virtio/virtio_user/vhost_vdpa.c    | 312 +++++++
>   .../virtio/virtio_user/virtio_user_dev.c      | 776 ++++++++++++++++++
>   .../virtio/virtio_user/virtio_user_dev.h      |  88 ++
>   drivers/crypto/virtio/virtio_user_cryptodev.c | 587 +++++++++++++
>   9 files changed, 1815 insertions(+), 28 deletions(-)
>   create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
>   create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
>   create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
>   create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c
> 

I don't understand the purpose of the common base, as most of the code 
ends up being duplicated anyway.

Thanks,
Maxime

> diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
> index 8181c8296f..e5bce54cca 100644
> --- a/drivers/crypto/virtio/meson.build
> +++ b/drivers/crypto/virtio/meson.build
> @@ -16,3 +16,10 @@ sources = files(
>           'virtio_rxtx.c',
>           'virtqueue.c',
>   )
> +
> +if is_linux
> +    sources += files('virtio_user_cryptodev.c',
> +        'virtio_user/vhost_vdpa.c',
> +        'virtio_user/virtio_user_dev.c')
> +    deps += ['bus_vdev', 'common_virtio']
> +endif
> diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
> index d3db4f898e..c9f20cb338 100644
> --- a/drivers/crypto/virtio/virtio_cryptodev.c
> +++ b/drivers/crypto/virtio/virtio_cryptodev.c
> @@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
>   	return 0;
>   }
>   
> -/*
> - * This function is based on probe() function
> - * It returns 0 on success.
> - */
> -static int
> -crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
> -		struct rte_cryptodev_pmd_init_params *init_params)
> +int
> +crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
> +		struct rte_pci_device *pci_dev)
>   {
> -	struct rte_cryptodev *cryptodev;
>   	struct virtio_crypto_hw *hw;
>   
> -	PMD_INIT_FUNC_TRACE();
> -
> -	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
> -					init_params);
> -	if (cryptodev == NULL)
> -		return -ENODEV;
> -
>   	cryptodev->driver_id = cryptodev_virtio_driver_id;
>   	cryptodev->dev_ops = &virtio_crypto_dev_ops;
>   
> @@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
>   	hw->dev_id = cryptodev->data->dev_id;
>   	hw->virtio_dev_capabilities = virtio_capabilities;
>   
> -	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
> -		cryptodev->data->dev_id, pci_dev->id.vendor_id,
> -		pci_dev->id.device_id);
> +	if (pci_dev) {
> +		/* pci device init */
> +		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
> +			cryptodev->data->dev_id, pci_dev->id.vendor_id,
> +			pci_dev->id.device_id);
>   
> -	/* pci device init */
> -	if (vtpci_cryptodev_init(pci_dev, hw))
> +		if (vtpci_cryptodev_init(pci_dev, hw))
> +			return -1;
> +	}
> +
> +	if (virtio_crypto_init_device(cryptodev, features) < 0)
>   		return -1;
>   
> -	if (virtio_crypto_init_device(cryptodev,
> -			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
> +	return 0;
> +}
> +
> +/*
> + * This function is based on probe() function
> + * It returns 0 on success.
> + */
> +static int
> +crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
> +		struct rte_cryptodev_pmd_init_params *init_params)
> +{
> +	struct rte_cryptodev *cryptodev;
> +
> +	PMD_INIT_FUNC_TRACE();
> +
> +	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
> +					init_params);
> +	if (cryptodev == NULL)
> +		return -ENODEV;
> +
> +	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
> +			pci_dev) < 0)
>   		return -1;
>   
>   	rte_cryptodev_pmd_probing_finish(cryptodev);
> diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
> index b4bdd9800b..95a1e09dca 100644
> --- a/drivers/crypto/virtio/virtio_cryptodev.h
> +++ b/drivers/crypto/virtio/virtio_cryptodev.h
> @@ -74,4 +74,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
>   		struct rte_crypto_op **tx_pkts,
>   		uint16_t nb_pkts);
>   
> +int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
> +		struct rte_pci_device *pci_dev);
> +
>   #endif /* _VIRTIO_CRYPTODEV_H_ */
> diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
> index 79945cb88e..c75777e005 100644
> --- a/drivers/crypto/virtio/virtio_pci.h
> +++ b/drivers/crypto/virtio/virtio_pci.h
> @@ -20,6 +20,9 @@ struct virtqueue;
>   #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
>   #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
>   
> +/* VirtIO device IDs. */
> +#define VIRTIO_ID_CRYPTO  20
> +
>   /* VirtIO ABI version, this must match exactly. */
>   #define VIRTIO_PCI_ABI_VERSION 0
>   
> @@ -56,8 +59,12 @@ struct virtqueue;
>   #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
>   #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
>   #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
> +#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
>   #define VIRTIO_CONFIG_STATUS_FAILED    0x80
>   
> +/* The alignment to use between consumer and producer parts of vring. */
> +#define VIRTIO_VRING_ALIGN 4096
> +
>   /*
>    * Each virtqueue indirect descriptor list must be physically contiguous.
>    * To allow us to malloc(9) each list individually, limit the number
> diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
> index c74d1172b7..4b418f6e60 100644
> --- a/drivers/crypto/virtio/virtio_ring.h
> +++ b/drivers/crypto/virtio/virtio_ring.h
> @@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
>   				sizeof(struct vring_packed_desc_event)), align);
>   }
>   
> -static inline void
> -vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
> -{
> -	vring_init_split(vr, p, 0, align, num);
> -}
> -
>   /*
>    * The following is used with VIRTIO_RING_F_EVENT_IDX.
>    * Assuming a given event_idx value from the other size, if we have
> diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
> new file mode 100644
> index 0000000000..41696c4095
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
> @@ -0,0 +1,312 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2025 Marvell
> + */
> +
> +#include <sys/ioctl.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <sys/mman.h>
> +#include <fcntl.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +
> +#include <rte_memory.h>
> +
> +#include "virtio_user/vhost.h"
> +#include "virtio_user/vhost_logs.h"
> +
> +#include "virtio_user_dev.h"
> +#include "../virtio_pci.h"
> +
> +struct vhost_vdpa_data {
> +	int vhostfd;
> +	uint64_t protocol_features;
> +};
> +
> +#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
> +	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
> +	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
> +
> +/* vhost kernel & vdpa ioctls */
> +#define VHOST_VIRTIO 0xAF
> +#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
> +#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
> +#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
> +#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
> +#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
> +#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
> +#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
> +#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
> +#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
> +#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
> +#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
> +#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
> +#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
> +#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
> +#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
> +#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
> +#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
> +#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
> +#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
> +#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
> +#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
> +#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
> +
> +/* no alignment requirement */
> +struct vhost_iotlb_msg {
> +	uint64_t iova;
> +	uint64_t size;
> +	uint64_t uaddr;
> +#define VHOST_ACCESS_RO      0x1
> +#define VHOST_ACCESS_WO      0x2
> +#define VHOST_ACCESS_RW      0x3
> +	uint8_t perm;
> +#define VHOST_IOTLB_MISS           1
> +#define VHOST_IOTLB_UPDATE         2
> +#define VHOST_IOTLB_INVALIDATE     3
> +#define VHOST_IOTLB_ACCESS_FAIL    4
> +#define VHOST_IOTLB_BATCH_BEGIN    5
> +#define VHOST_IOTLB_BATCH_END      6
> +	uint8_t type;
> +};
> +
> +#define VHOST_IOTLB_MSG_V2 0x2
> +
> +struct vhost_vdpa_config {
> +	uint32_t off;
> +	uint32_t len;
> +	uint8_t buf[];
> +};
> +
> +struct vhost_msg {
> +	uint32_t type;
> +	uint32_t reserved;
> +	union {
> +		struct vhost_iotlb_msg iotlb;
> +		uint8_t padding[64];
> +	};
> +};
> +
> +
> +static int
> +vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
> +{
> +	int ret;
> +
> +	ret = ioctl(fd, request, arg);
> +	if (ret) {
> +		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
> +				request, strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
> +{
> +	struct vhost_vdpa_data *data = dev->backend_data;
> +
> +	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
> +}
> +
> +static int
> +vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
> +{
> +	struct vhost_vdpa_data *data = dev->backend_data;
> +
> +	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
> +}
> +
> +static int
> +vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
> +{
> +	struct vhost_vdpa_data *data = dev->backend_data;
> +	int ret;
> +
> +	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
> +	if (ret) {
> +		PMD_DRV_LOG(ERR, "Failed to get features");
> +		return -1;
> +	}
> +
> +	/* Negotiated vDPA backend features */
> +	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
> +	if (ret < 0) {
> +		PMD_DRV_LOG(ERR, "Failed to get backend features");
> +		return -1;
> +	}
> +
> +	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
> +
> +	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
> +	if (ret < 0) {
> +		PMD_DRV_LOG(ERR, "Failed to set backend features");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
> +{
> +	struct vhost_vdpa_data *data = dev->backend_data;
> +
> +	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
> +}
> +
> +/**
> + * Set up environment to talk with a vhost vdpa backend.
> + *
> + * @return
> + *   - (-1) if fail to set up;
> + *   - (>=0) if successful.
> + */
> +static int
> +vhost_vdpa_setup(struct virtio_user_dev *dev)
> +{
> +	struct vhost_vdpa_data *data;
> +	uint32_t did = (uint32_t)-1;
> +
> +	data = malloc(sizeof(*data));
> +	if (!data) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
> +		return -1;
> +	}
> +
> +	data->vhostfd = open(dev->path, O_RDWR);
> +	if (data->vhostfd < 0) {
> +		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
> +				dev->path, strerror(errno));
> +		free(data);
> +		return -1;
> +	}
> +
> +	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
> +			did != VIRTIO_ID_CRYPTO) {
> +		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
> +		close(data->vhostfd);
> +		free(data);
> +		return -1;
> +	}
> +
> +	dev->backend_data = data;
> +
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
> +{
> +	struct vhost_vring_state state = {
> +		.index = dev->max_queue_pairs,
> +		.num   = enable,
> +	};
> +
> +	return vhost_vdpa_set_vring_enable(dev, &state);
> +}
> +
> +static int
> +vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
> +				uint16_t pair_idx,
> +				int enable)
> +{
> +	struct vhost_vring_state state = {
> +		.index = pair_idx,
> +		.num   = enable,
> +	};
> +
> +	if (dev->qp_enabled[pair_idx] == enable)
> +		return 0;
> +
> +	if (vhost_vdpa_set_vring_enable(dev, &state))
> +		return -1;
> +
> +	dev->qp_enabled[pair_idx] = enable;
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
> +{
> +	/* TODO: It is W/A until a cleaner approach to find cpt status */
> +	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
> +{
> +	int nr_vrings = dev->max_queue_pairs;
> +
> +	return nr_vrings;
> +}
> +
> +static int
> +vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
> +{
> +	int i, nr_vrings;
> +
> +	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
> +
> +	for (i = 0; i < nr_vrings; i++) {
> +		if (dev->notify_area[i])
> +			munmap(dev->notify_area[i], getpagesize());
> +	}
> +	free(dev->notify_area);
> +	dev->notify_area = NULL;
> +
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
> +{
> +	struct vhost_vdpa_data *data = dev->backend_data;
> +	int nr_vrings, i, page_size = getpagesize();
> +	uint16_t **notify_area;
> +
> +	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
> +
> +	/* CQ is another vring */
> +	nr_vrings++;
> +
> +	notify_area = malloc(nr_vrings * sizeof(*notify_area));
> +	if (!notify_area) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
> +		return -1;
> +	}
> +
> +	for (i = 0; i < nr_vrings; i++) {
> +		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
> +					data->vhostfd, i * page_size);
> +		if (notify_area[i] == MAP_FAILED) {
> +			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
> +					dev->path, i);
> +			i--;
> +			goto map_err;
> +		}
> +	}
> +	dev->notify_area = notify_area;
> +
> +	return 0;
> +
> +map_err:
> +	for (; i >= 0; i--)
> +		munmap(notify_area[i], page_size);
> +	free(notify_area);
> +
> +	return -1;
> +}
> +
> +struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
> +	.setup = vhost_vdpa_setup,
> +	.get_features = vhost_vdpa_get_features,
> +	.cvq_enable = vhost_vdpa_cvq_enable,
> +	.enable_qp = vhost_vdpa_enable_queue_pair,
> +	.update_link_state = vhost_vdpa_update_link_state,
> +	.map_notification_area = vhost_vdpa_map_notification_area,
> +	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
> +};
> diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
> new file mode 100644
> index 0000000000..ac53ca78d4
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
> @@ -0,0 +1,776 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2025 Marvell.
> + */
> +
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <fcntl.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <sys/mman.h>
> +#include <unistd.h>
> +#include <sys/eventfd.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <pthread.h>
> +
> +#include <rte_alarm.h>
> +#include <rte_string_fns.h>
> +#include <rte_eal_memconfig.h>
> +#include <rte_malloc.h>
> +#include <rte_io.h>
> +
> +#include "virtio_user/vhost.h"
> +#include "virtio_user/vhost_logs.h"
> +#include "virtio_logs.h"
> +
> +#include "cryptodev_pmd.h"
> +#include "virtio_crypto.h"
> +#include "virtio_cvq.h"
> +#include "virtio_user_dev.h"
> +#include "virtqueue.h"
> +
> +#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
> +
> +const char * const crypto_virtio_user_backend_strings[] = {
> +	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
> +	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
> +};
> +
> +static int
> +virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
> +{
> +	if (dev->kickfds[queue_sel] >= 0) {
> +		close(dev->kickfds[queue_sel]);
> +		dev->kickfds[queue_sel] = -1;
> +	}
> +
> +	if (dev->callfds[queue_sel] >= 0) {
> +		close(dev->callfds[queue_sel]);
> +		dev->callfds[queue_sel] = -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
> +{
> +	/* May use invalid flag, but some backend uses kickfd and
> +	 * callfd as criteria to judge if dev is alive. so finally we
> +	 * use real event_fd.
> +	 */
> +	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
> +	if (dev->callfds[queue_sel] < 0) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
> +				dev->path, queue_sel, strerror(errno));
> +		return -1;
> +	}
> +	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
> +	if (dev->kickfds[queue_sel] < 0) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
> +				dev->path, queue_sel, strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
> +{
> +	struct vhost_vring_state state;
> +	int ret;
> +
> +	state.index = queue_sel;
> +	ret = dev->ops->get_vring_base(dev, &state);
> +	if (ret < 0) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
> +{
> +	/* Of all per virtqueue MSGs, make sure VHOST_SET_VRING_CALL come
> +	 * firstly because vhost depends on this msg to allocate virtqueue
> +	 * pair.
> +	 */
> +	struct vhost_vring_file file;
> +	int ret;
> +
> +	file.index = queue_sel;
> +	file.fd = dev->callfds[queue_sel];
> +	ret = dev->ops->set_vring_call(dev, &file);
> +	if (ret < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
> +{
> +	int ret;
> +	struct vhost_vring_file file;
> +	struct vhost_vring_state state;
> +	struct vring *vring = &dev->vrings.split[queue_sel];
> +	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
> +	uint64_t desc_addr, avail_addr, used_addr;
> +	struct vhost_vring_addr addr = {
> +		.index = queue_sel,
> +		.log_guest_addr = 0,
> +		.flags = 0, /* disable log */
> +	};
> +
> +	if (queue_sel == dev->max_queue_pairs) {
> +		if (!dev->scvq) {
> +			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
> +					dev->path);
> +			goto err;
> +		}
> +
> +		/* Use shadow control queue information */
> +		vring = &dev->scvq->vq_split.ring;
> +		pq_vring = &dev->scvq->vq_packed.ring;
> +	}
> +
> +	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
> +		desc_addr = pq_vring->desc_iova;
> +		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
> +		used_addr =  RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
> +						VIRTIO_VRING_ALIGN);
> +
> +		addr.desc_user_addr = desc_addr;
> +		addr.avail_user_addr = avail_addr;
> +		addr.used_user_addr = used_addr;
> +	} else {
> +		desc_addr = vring->desc_iova;
> +		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
> +		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
> +					VIRTIO_VRING_ALIGN);
> +
> +		addr.desc_user_addr = desc_addr;
> +		addr.avail_user_addr = avail_addr;
> +		addr.used_user_addr = used_addr;
> +	}
> +
> +	state.index = queue_sel;
> +	state.num = vring->num;
> +	ret = dev->ops->set_vring_num(dev, &state);
> +	if (ret < 0)
> +		goto err;
> +
> +	state.index = queue_sel;
> +	state.num = 0; /* no reservation */
> +	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
> +		state.num |= (1 << 15);
> +	ret = dev->ops->set_vring_base(dev, &state);
> +	if (ret < 0)
> +		goto err;
> +
> +	ret = dev->ops->set_vring_addr(dev, &addr);
> +	if (ret < 0)
> +		goto err;
> +
> +	/* Of all per virtqueue MSGs, make sure VHOST_USER_SET_VRING_KICK comes
> +	 * lastly because vhost depends on this msg to judge if
> +	 * virtio is ready.
> +	 */
> +	file.index = queue_sel;
> +	file.fd = dev->kickfds[queue_sel];
> +	ret = dev->ops->set_vring_kick(dev, &file);
> +	if (ret < 0)
> +		goto err;
> +
> +	return 0;
> +err:
> +	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
> +
> +	return -1;
> +}
> +
> +static int
> +virtio_user_foreach_queue(struct virtio_user_dev *dev,
> +			int (*fn)(struct virtio_user_dev *, uint32_t))
> +{
> +	uint32_t i, nr_vq;
> +
> +	nr_vq = dev->max_queue_pairs;
> +
> +	for (i = 0; i < nr_vq; i++)
> +		if (fn(dev, i) < 0)
> +			return -1;
> +
> +	return 0;
> +}
> +
> +int
> +crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
> +{
> +	uint64_t features;
> +	int ret = -1;
> +
> +	pthread_mutex_lock(&dev->mutex);
> +
> +	/* Step 0: tell vhost to create queues */
> +	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
> +		goto error;
> +
> +	features = dev->features;
> +
> +	ret = dev->ops->set_features(dev, features);
> +	if (ret < 0)
> +		goto error;
> +	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
> +error:
> +	pthread_mutex_unlock(&dev->mutex);
> +
> +	return ret;
> +}
> +
> +int
> +crypto_virtio_user_start_device(struct virtio_user_dev *dev)
> +{
> +	int ret;
> +
> +	/*
> +	 * XXX workaround!
> +	 *
> +	 * We need to make sure that the locks will be
> +	 * taken in the correct order to avoid deadlocks.
> +	 *
> +	 * Before releasing this lock, this thread should
> +	 * not trigger any memory hotplug events.
> +	 *
> +	 * This is a temporary workaround, and should be
> +	 * replaced when we get proper supports from the
> +	 * memory subsystem in the future.
> +	 */
> +	rte_mcfg_mem_read_lock();
> +	pthread_mutex_lock(&dev->mutex);
> +
> +	/* Step 2: share memory regions */
> +	ret = dev->ops->set_memory_table(dev);
> +	if (ret < 0)
> +		goto error;
> +
> +	/* Step 3: kick queues */
> +	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
> +	if (ret < 0)
> +		goto error;
> +
> +	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
> +	if (ret < 0)
> +		goto error;
> +
> +	/* Step 4: enable queues */
> +	for (int i = 0; i < dev->max_queue_pairs; i++) {
> +		ret = dev->ops->enable_qp(dev, i, 1);
> +		if (ret < 0)
> +			goto error;
> +	}
> +
> +	dev->started = true;
> +
> +	pthread_mutex_unlock(&dev->mutex);
> +	rte_mcfg_mem_read_unlock();
> +
> +	return 0;
> +error:
> +	pthread_mutex_unlock(&dev->mutex);
> +	rte_mcfg_mem_read_unlock();
> +
> +	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
> +
> +	/* TODO: free resource here or caller to check */
> +	return -1;
> +}
> +
> +int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
> +{
> +	uint32_t i;
> +	int ret;
> +
> +	pthread_mutex_lock(&dev->mutex);
> +	if (!dev->started)
> +		goto out;
> +
> +	for (i = 0; i < dev->max_queue_pairs; ++i) {
> +		ret = dev->ops->enable_qp(dev, i, 0);
> +		if (ret < 0)
> +			goto err;
> +	}
> +
> +	if (dev->scvq) {
> +		ret = dev->ops->cvq_enable(dev, 0);
> +		if (ret < 0)
> +			goto err;
> +	}
> +
> +	/* Stop the backend. */
> +	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
> +		goto err;
> +
> +	dev->started = false;
> +
> +out:
> +	pthread_mutex_unlock(&dev->mutex);
> +
> +	return 0;
> +err:
> +	pthread_mutex_unlock(&dev->mutex);
> +
> +	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
> +
> +	return -1;
> +}
> +
> +static int
> +virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
> +{
> +	int ret;
> +
> +	if (!dev->ops->get_config) {
> +		dev->max_queue_pairs = user_max_qp;
> +		return 0;
> +	}
> +
> +	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
> +			offsetof(struct virtio_crypto_config, max_dataqueues),
> +			sizeof(uint16_t));
> +	if (ret) {
> +		/*
> +		 * We need to know the max queue pairs from the device so that
> +		 * the control queue gets the right index.
> +		 */
> +		dev->max_queue_pairs = 1;
> +		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
> +
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
> +{
> +	struct virtio_crypto_config config;
> +	int ret;
> +
> +	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
> +	dev->cipher_algo = 0;
> +	dev->auth_algo = 0;
> +	dev->akcipher_algo = 0;
> +
> +	if (!dev->ops->get_config)
> +		return 0;
> +
> +	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
> +	if (ret) {
> +		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
> +		return ret;
> +	}
> +
> +	dev->crypto_services = config.crypto_services;
> +	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
> +						config.cipher_algo_l;
> +	dev->hash_algo = config.hash_algo;
> +	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
> +						config.mac_algo_l;
> +	dev->aead_algo = config.aead_algo;
> +	dev->akcipher_algo = config.akcipher_algo;
> +	return 0;
> +}
> +
> +static int
> +virtio_user_dev_init_notify(struct virtio_user_dev *dev)
> +{
> +	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
> +		goto err;
> +
> +	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
> +		if (dev->ops->map_notification_area &&
> +				dev->ops->map_notification_area(dev))
> +			goto err;
> +
> +	return 0;
> +err:
> +	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
> +
> +	return -1;
> +}
> +
> +static void
> +virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
> +{
> +	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
> +
> +	if (dev->ops->unmap_notification_area && dev->notify_area)
> +		dev->ops->unmap_notification_area(dev);
> +}
> +
> +static void
> +virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
> +			const void *addr,
> +			size_t len __rte_unused,
> +			void *arg)
> +{
> +	struct virtio_user_dev *dev = arg;
> +	struct rte_memseg_list *msl;
> +	uint16_t i;
> +	int ret = 0;
> +
> +	/* ignore externally allocated memory */
> +	msl = rte_mem_virt2memseg_list(addr);
> +	if (msl->external)
> +		return;
> +
> +	pthread_mutex_lock(&dev->mutex);
> +
> +	if (dev->started == false)
> +		goto exit;
> +
> +	/* Step 1: pause the active queues */
> +	for (i = 0; i < dev->queue_pairs; i++) {
> +		ret = dev->ops->enable_qp(dev, i, 0);
> +		if (ret < 0)
> +			goto exit;
> +	}
> +
> +	/* Step 2: update memory regions */
> +	ret = dev->ops->set_memory_table(dev);
> +	if (ret < 0)
> +		goto exit;
> +
> +	/* Step 3: resume the active queues */
> +	for (i = 0; i < dev->queue_pairs; i++) {
> +		ret = dev->ops->enable_qp(dev, i, 1);
> +		if (ret < 0)
> +			goto exit;
> +	}
> +
> +exit:
> +	pthread_mutex_unlock(&dev->mutex);
> +
> +	if (ret < 0)
> +		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
> +}
> +
> +static int
> +virtio_user_dev_setup(struct virtio_user_dev *dev)
> +{
> +	if (dev->is_server) {
> +		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
> +			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
> +			return -1;
> +		}
> +	}
> +
> +	switch (dev->backend_type) {
> +	case VIRTIO_USER_BACKEND_VHOST_VDPA:
> +		dev->ops = &virtio_ops_vdpa;
> +		dev->ops->setup = virtio_crypto_ops_vdpa.setup;
> +		dev->ops->get_features = virtio_crypto_ops_vdpa.get_features;
> +		dev->ops->cvq_enable = virtio_crypto_ops_vdpa.cvq_enable;
> +		dev->ops->enable_qp = virtio_crypto_ops_vdpa.enable_qp;
> +		dev->ops->update_link_state = virtio_crypto_ops_vdpa.update_link_state;
> +		dev->ops->map_notification_area = virtio_crypto_ops_vdpa.map_notification_area;
> +		dev->ops->unmap_notification_area = virtio_crypto_ops_vdpa.unmap_notification_area;
> +		break;
> +	default:
> +		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
> +		return -1;
> +	}
> +
> +	if (dev->ops->setup(dev) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +virtio_user_alloc_vrings(struct virtio_user_dev *dev)
> +{
> +	int i, size, nr_vrings;
> +	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
> +
> +	nr_vrings = dev->max_queue_pairs + 1;
> +
> +	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
> +	if (!dev->callfds) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
> +		return -1;
> +	}
> +
> +	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
> +	if (!dev->kickfds) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
> +		goto free_callfds;
> +	}
> +
> +	for (i = 0; i < nr_vrings; i++) {
> +		dev->callfds[i] = -1;
> +		dev->kickfds[i] = -1;
> +	}
> +
> +	if (packed_ring)
> +		size = sizeof(*dev->vrings.packed);
> +	else
> +		size = sizeof(*dev->vrings.split);
> +	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
> +	if (!dev->vrings.ptr) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
> +		goto free_kickfds;
> +	}
> +
> +	if (packed_ring) {
> +		dev->packed_queues = rte_zmalloc("virtio_user_dev",
> +				nr_vrings * sizeof(*dev->packed_queues), 0);
> +		if (!dev->packed_queues) {
> +			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
> +					dev->path);
> +			goto free_vrings;
> +		}
> +	}
> +
> +	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
> +			nr_vrings * sizeof(*dev->qp_enabled), 0);
> +	if (!dev->qp_enabled) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
> +		goto free_packed_queues;
> +	}
> +
> +	return 0;
> +
> +free_packed_queues:
> +	rte_free(dev->packed_queues);
> +	dev->packed_queues = NULL;
> +free_vrings:
> +	rte_free(dev->vrings.ptr);
> +	dev->vrings.ptr = NULL;
> +free_kickfds:
> +	rte_free(dev->kickfds);
> +	dev->kickfds = NULL;
> +free_callfds:
> +	rte_free(dev->callfds);
> +	dev->callfds = NULL;
> +
> +	return -1;
> +}
> +
> +static void
> +virtio_user_free_vrings(struct virtio_user_dev *dev)
> +{
> +	rte_free(dev->qp_enabled);
> +	dev->qp_enabled = NULL;
> +	rte_free(dev->packed_queues);
> +	dev->packed_queues = NULL;
> +	rte_free(dev->vrings.ptr);
> +	dev->vrings.ptr = NULL;
> +	rte_free(dev->kickfds);
> +	dev->kickfds = NULL;
> +	rte_free(dev->callfds);
> +	dev->callfds = NULL;
> +}
> +
> +#define VIRTIO_USER_SUPPORTED_FEATURES   \
> +	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
> +	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
> +	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
> +	 1ULL << VIRTIO_F_VERSION_1               | \
> +	 1ULL << VIRTIO_F_IN_ORDER                | \
> +	 1ULL << VIRTIO_F_RING_PACKED             | \
> +	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
> +	 1ULL << VIRTIO_F_ORDER_PLATFORM)
> +
> +int
> +crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
> +			int queue_size, int server)
> +{
> +	uint64_t backend_features;
> +
> +	pthread_mutex_init(&dev->mutex, NULL);
> +	strlcpy(dev->path, path, PATH_MAX);
> +
> +	dev->started = 0;
> +	dev->queue_pairs = 1; /* mq disabled by default */
> +	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
> +	dev->queue_size = queue_size;
> +	dev->is_server = server;
> +	dev->frontend_features = 0;
> +	dev->unsupported_features = 0;
> +	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
> +	dev->hw.modern = 1;
> +
> +	if (virtio_user_dev_setup(dev) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) backend set up fails", dev->path);
> +		return -1;
> +	}
> +
> +	if (dev->ops->set_owner(dev) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
> +		goto destroy;
> +	}
> +
> +	if (dev->ops->get_backend_features(&backend_features) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
> +		goto destroy;
> +	}
> +
> +	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
> +
> +	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
> +		goto destroy;
> +	}
> +
> +	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
> +		goto destroy;
> +	}
> +
> +	if (virtio_user_dev_init_cipher_services(dev)) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
> +		goto destroy;
> +	}
> +
> +	dev->frontend_features &= ~dev->unsupported_features;
> +	dev->device_features &= ~dev->unsupported_features;
> +
> +	if (virtio_user_alloc_vrings(dev) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
> +		goto destroy;
> +	}
> +
> +	if (virtio_user_dev_init_notify(dev) < 0) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
> +		goto free_vrings;
> +	}
> +
> +	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
> +				virtio_user_mem_event_cb, dev)) {
> +		if (rte_errno != ENOTSUP) {
> +			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
> +					dev->path);
> +			goto notify_uninit;
> +		}
> +	}
> +
> +	return 0;
> +
> +notify_uninit:
> +	virtio_user_dev_uninit_notify(dev);
> +free_vrings:
> +	virtio_user_free_vrings(dev);
> +destroy:
> +	dev->ops->destroy(dev);
> +
> +	return -1;
> +}
> +
> +void
> +crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
> +{
> +	crypto_virtio_user_stop_device(dev);
> +
> +	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
> +
> +	virtio_user_dev_uninit_notify(dev);
> +
> +	virtio_user_free_vrings(dev);
> +
> +	if (dev->is_server)
> +		unlink(dev->path);
> +
> +	dev->ops->destroy(dev);
> +}
> +
> +#define CVQ_MAX_DATA_DESCS 32
> +
> +static inline void *
> +virtio_user_iova2virt(struct virtio_user_dev *dev __rte_unused, rte_iova_t iova)
> +{
> +	if (rte_eal_iova_mode() == RTE_IOVA_VA)
> +		return (void *)(uintptr_t)iova;
> +	else
> +		return rte_mem_iova2virt(iova);
> +}
> +
> +static inline int
> +desc_is_avail(struct vring_packed_desc *desc, bool wrap_counter)
> +{
> +	uint16_t flags = rte_atomic_load_explicit(&desc->flags, rte_memory_order_acquire);
> +
> +	return wrap_counter == !!(flags & VRING_PACKED_DESC_F_AVAIL) &&
> +		wrap_counter != !!(flags & VRING_PACKED_DESC_F_USED);
> +}
> +
> +int
> +crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
> +{
> +	int ret;
> +
> +	pthread_mutex_lock(&dev->mutex);
> +	dev->status = status;
> +	ret = dev->ops->set_status(dev, status);
> +	if (ret && ret != -ENOTSUP)
> +		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
> +
> +	pthread_mutex_unlock(&dev->mutex);
> +	return ret;
> +}
> +
> +int
> +crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
> +{
> +	int ret;
> +	uint8_t status;
> +
> +	pthread_mutex_lock(&dev->mutex);
> +
> +	ret = dev->ops->get_status(dev, &status);
> +	if (!ret) {
> +		dev->status = status;
> +		PMD_INIT_LOG(DEBUG, "Updated Device Status(0x%08x):"
> +			"\t-RESET: %u "
> +			"\t-ACKNOWLEDGE: %u "
> +			"\t-DRIVER: %u "
> +			"\t-DRIVER_OK: %u "
> +			"\t-FEATURES_OK: %u "
> +			"\t-DEVICE_NEED_RESET: %u "
> +			"\t-FAILED: %u",
> +			dev->status,
> +			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
> +			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
> +	} else if (ret != -ENOTSUP) {
> +		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
> +	}
> +
> +	pthread_mutex_unlock(&dev->mutex);
> +	return ret;
> +}
> +
> +int
> +crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
> +{
> +	if (dev->ops->update_link_state)
> +		return dev->ops->update_link_state(dev);
> +
> +	return 0;
> +}
> diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
> new file mode 100644
> index 0000000000..ef648fd14b
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2025 Marvell.
> + */
> +
> +#ifndef _VIRTIO_USER_DEV_H
> +#define _VIRTIO_USER_DEV_H
> +
> +#include <limits.h>
> +#include <stdbool.h>
> +
> +#include "../virtio_pci.h"
> +#include "../virtio_ring.h"
> +
> +extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
> +
> +enum virtio_user_backend_type {
> +	VIRTIO_USER_BACKEND_UNKNOWN,
> +	VIRTIO_USER_BACKEND_VHOST_USER,
> +	VIRTIO_USER_BACKEND_VHOST_VDPA,
> +};
> +
> +struct virtio_user_queue {
> +	uint16_t used_idx;
> +	bool avail_wrap_counter;
> +	bool used_wrap_counter;
> +};
> +
> +struct virtio_user_dev {
> +	union {
> +		struct virtio_crypto_hw hw;
> +		uint8_t dummy[256];
> +	};
> +
> +	void		*backend_data;
> +	uint16_t	**notify_area;
> +	char		path[PATH_MAX];
> +	bool		hw_cvq;
> +	uint16_t	max_queue_pairs;
> +	uint64_t	device_features; /* supported features by device */
> +	bool		*qp_enabled;
> +
> +	enum virtio_user_backend_type backend_type;
> +	bool		is_server;  /* server or client mode */
> +
> +	int		*callfds;
> +	int		*kickfds;
> +	uint16_t	queue_pairs;
> +	uint32_t	queue_size;
> +	uint64_t	features; /* the features negotiated with the driver,
> +				   * which will be synced with the device
> +				   */
> +	uint64_t	frontend_features; /* enabled frontend features */
> +	uint64_t	unsupported_features; /* unsupported features mask */
> +	uint8_t		status;
> +	uint32_t	crypto_status;
> +	uint32_t	crypto_services;
> +	uint64_t	cipher_algo;
> +	uint32_t	hash_algo;
> +	uint64_t	auth_algo;
> +	uint32_t	aead_algo;
> +	uint32_t	akcipher_algo;
> +
> +	union {
> +		void			*ptr;
> +		struct vring		*split;
> +		struct vring_packed	*packed;
> +	} vrings;
> +
> +	struct virtio_user_queue *packed_queues;
> +
> +	struct virtio_user_backend_ops *ops;
> +	pthread_mutex_t	mutex;
> +	bool		started;
> +
> +	struct virtqueue	*scvq;
> +};
> +
> +int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
> +int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
> +int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
> +int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
> +			int queue_size, int server);
> +void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
> +int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
> +int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
> +int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
> +extern const char * const crypto_virtio_user_backend_strings[];
> +#endif
> diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
> new file mode 100644
> index 0000000000..606639b872
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
> @@ -0,0 +1,587 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2025 Marvell
> + */
> +
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +#include <fcntl.h>
> +
> +#include <rte_malloc.h>
> +#include <rte_kvargs.h>
> +#include <bus_vdev_driver.h>
> +#include <rte_cryptodev.h>
> +#include <cryptodev_pmd.h>
> +#include <rte_alarm.h>
> +#include <rte_cycles.h>
> +#include <rte_io.h>
> +
> +#include "virtio_user/virtio_user_dev.h"
> +#include "virtio_user/vhost.h"
> +#include "virtio_user/vhost_logs.h"
> +#include "virtio_cryptodev.h"
> +#include "virtio_logs.h"
> +#include "virtio_pci.h"
> +#include "virtqueue.h"
> +
> +#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
> +
> +static void
> +virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
> +		     void *dst, int length __rte_unused)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	if (offset == offsetof(struct virtio_crypto_config, status)) {
> +		crypto_virtio_user_dev_update_link_state(dev);
> +		*(uint32_t *)dst = dev->crypto_status;
> +	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
> +		*(uint16_t *)dst = dev->max_queue_pairs;
> +	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
> +		*(uint32_t *)dst = dev->crypto_services;
> +	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
> +		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
> +	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
> +		*(uint32_t *)dst = dev->cipher_algo >> 32;
> +	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
> +		*(uint32_t *)dst = dev->hash_algo;
> +	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
> +		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
> +	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
> +		*(uint32_t *)dst = dev->auth_algo >> 32;
> +	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
> +		*(uint32_t *)dst = dev->aead_algo;
> +	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
> +		*(uint32_t *)dst = dev->akcipher_algo;
> +}
> +
> +static void
> +virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
> +		      const void *src, int length)
> +{
> +	RTE_SET_USED(hw);
> +	RTE_SET_USED(src);
> +
> +	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
> +		    offset, length);
> +}
> +
> +static void
> +virtio_user_reset(struct virtio_crypto_hw *hw)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
> +		crypto_virtio_user_stop_device(dev);
> +}
> +
> +static void
> +virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +	uint8_t old_status = dev->status;
> +
> +	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
> +			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
> +		crypto_virtio_user_dev_set_features(dev);
> +		/* Feature negotiation should only be done at probe time,
> +		 * so we skip any further requests here.
> +		 */
> +		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
> +	}
> +
> +	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
> +		if (crypto_virtio_user_start_device(dev)) {
> +			crypto_virtio_user_dev_update_status(dev);
> +			return;
> +		}
> +	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
> +		virtio_user_reset(hw);
> +	}
> +
> +	crypto_virtio_user_dev_set_status(dev, status);
> +	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
> +		if (dev->ops->cvq_enable(dev, 1) < 0) {
> +			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
> +			crypto_virtio_user_dev_update_status(dev);
> +			return;
> +		}
> +	}
> +}
> +
> +static uint8_t
> +virtio_user_get_status(struct virtio_crypto_hw *hw)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	crypto_virtio_user_dev_update_status(dev);
> +
> +	return dev->status;
> +}
> +
> +#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
> +	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
> +	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
> +	 1ULL << VIRTIO_F_VERSION_1               | \
> +	 1ULL << VIRTIO_F_IN_ORDER                | \
> +	 1ULL << VIRTIO_F_RING_PACKED             | \
> +	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
> +	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
> +	 1ULL << VIRTIO_F_ORDER_PLATFORM)
> +
> +static uint64_t
> +virtio_user_get_features(struct virtio_crypto_hw *hw)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	/* unmask feature bits defined in vhost user protocol */
> +	return (dev->device_features | dev->frontend_features) &
> +		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
> +}
> +
> +static void
> +virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	dev->features = features & (dev->device_features | dev->frontend_features);
> +}
> +
> +static uint8_t
> +virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
> +{
> +	/* Queue interrupts and the config interrupt are separate in
> +	 * virtio-user; here we only report config changes.
> +	 */
> +	return VIRTIO_PCI_CAP_ISR_CFG;
> +}
> +
> +static uint16_t
> +virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
> +		    uint16_t vec __rte_unused)
> +{
> +	return 0;
> +}
> +
> +static uint16_t
> +virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
> +			  struct virtqueue *vq __rte_unused,
> +			  uint16_t vec)
> +{
> +	/* pretend we have done that */
> +	return vec;
> +}
> +
> +/* This function gets the queue size, aka the number of descs, of a specified
> + * queue. It differs from VHOST_USER_GET_QUEUE_NUM, which is used to get the
> + * max number of supported queues.
> + */
> +static uint16_t
> +virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	/* Currently, each queue has the same queue size */
> +	return dev->queue_size;
> +}
> +
> +static void
> +virtio_user_setup_queue_packed(struct virtqueue *vq,
> +			       struct virtio_user_dev *dev)
> +{
> +	uint16_t queue_idx = vq->vq_queue_index;
> +	struct vring_packed *vring;
> +	uint64_t desc_addr;
> +	uint64_t avail_addr;
> +	uint64_t used_addr;
> +	uint16_t i;
> +
> +	vring  = &dev->vrings.packed[queue_idx];
> +	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
> +	avail_addr = desc_addr + vq->vq_nentries *
> +		sizeof(struct vring_packed_desc);
> +	used_addr = RTE_ALIGN_CEIL(avail_addr +
> +			   sizeof(struct vring_packed_desc_event),
> +			   VIRTIO_VRING_ALIGN);
> +	vring->num = vq->vq_nentries;
> +	vring->desc_iova = vq->vq_ring_mem;
> +	vring->desc = (void *)(uintptr_t)desc_addr;
> +	vring->driver = (void *)(uintptr_t)avail_addr;
> +	vring->device = (void *)(uintptr_t)used_addr;
> +	dev->packed_queues[queue_idx].avail_wrap_counter = true;
> +	dev->packed_queues[queue_idx].used_wrap_counter = true;
> +	dev->packed_queues[queue_idx].used_idx = 0;
> +
> +	for (i = 0; i < vring->num; i++)
> +		vring->desc[i].flags = 0;
> +}
> +
> +static void
> +virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
> +{
> +	uint16_t queue_idx = vq->vq_queue_index;
> +	uint64_t desc_addr, avail_addr, used_addr;
> +
> +	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
> +	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
> +	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
> +							 ring[vq->vq_nentries]),
> +				   VIRTIO_VRING_ALIGN);
> +
> +	dev->vrings.split[queue_idx].num = vq->vq_nentries;
> +	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
> +	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
> +	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
> +	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
> +}
> +
> +static int
> +virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +
> +	if (vtpci_with_packed_queue(hw))
> +		virtio_user_setup_queue_packed(vq, dev);
> +	else
> +		virtio_user_setup_queue_split(vq, dev);
> +
> +	if (dev->notify_area)
> +		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
> +
> +	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
> +		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
> +
> +	return 0;
> +}
> +
> +static void
> +virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
> +{
> +	/* For legacy devices, writing 0 to the VIRTIO_PCI_QUEUE_PFN port makes
> +	 * QEMU stop the ioeventfds and reset the status of the device.
> +	 * For modern devices, setting queue desc, avail and used in the PCI
> +	 * bar to 0 triggers no further behavior in QEMU.
> +	 *
> +	 * Here we just care about what information to deliver to vhost-user
> +	 * or vhost-kernel, so we just close the ioeventfd for now.
> +	 */
> +
> +	RTE_SET_USED(hw);
> +	RTE_SET_USED(vq);
> +}
> +
> +static void
> +virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
> +{
> +	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
> +	uint64_t notify_data = 1;
> +
> +	if (!dev->notify_area) {
> +		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
> +			  sizeof(notify_data)) < 0)
> +			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
> +				    strerror(errno));
> +		return;
> +	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
> +		rte_write16(vq->vq_queue_index, vq->notify_addr);
> +		return;
> +	}
> +
> +	if (vtpci_with_packed_queue(hw)) {
> +		/* Bit[0:15]: vq queue index
> +		 * Bit[16:30]: avail index
> +		 * Bit[31]: avail wrap counter
> +		 */
> +		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
> +				VRING_PACKED_DESC_F_AVAIL)) << 31) |
> +				((uint32_t)vq->vq_avail_idx << 16) |
> +				vq->vq_queue_index;
> +	} else {
> +		/* Bit[0:15]: vq queue index
> +		 * Bit[16:31]: avail index
> +		 */
> +		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
> +				vq->vq_queue_index;
> +	}
> +	rte_write32(notify_data, vq->notify_addr);
> +}
> +
> +const struct virtio_pci_ops crypto_virtio_user_ops = {
> +	.read_dev_cfg	= virtio_user_read_dev_config,
> +	.write_dev_cfg	= virtio_user_write_dev_config,
> +	.reset		= virtio_user_reset,
> +	.get_status	= virtio_user_get_status,
> +	.set_status	= virtio_user_set_status,
> +	.get_features	= virtio_user_get_features,
> +	.set_features	= virtio_user_set_features,
> +	.get_isr	= virtio_user_get_isr,
> +	.set_config_irq	= virtio_user_set_config_irq,
> +	.set_queue_irq	= virtio_user_set_queue_irq,
> +	.get_queue_num	= virtio_user_get_queue_num,
> +	.setup_queue	= virtio_user_setup_queue,
> +	.del_queue	= virtio_user_del_queue,
> +	.notify_queue	= virtio_user_notify_queue,
> +};
> +
> +static const char * const valid_args[] = {
> +#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
> +	VIRTIO_USER_ARG_QUEUES_NUM,
> +#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
> +	VIRTIO_USER_ARG_QUEUE_SIZE,
> +#define VIRTIO_USER_ARG_PATH           "path"
> +	VIRTIO_USER_ARG_PATH,
> +#define VIRTIO_USER_ARG_SERVER_MODE    "server"
> +	VIRTIO_USER_ARG_SERVER_MODE,
> +	NULL
> +};
> +
> +#define VIRTIO_USER_DEF_Q_NUM	1
> +#define VIRTIO_USER_DEF_Q_SZ	256
> +#define VIRTIO_USER_DEF_SERVER_MODE	0
> +
> +static int
> +get_string_arg(const char *key __rte_unused,
> +		const char *value, void *extra_args)
> +{
> +	if (!value || !extra_args)
> +		return -EINVAL;
> +
> +	*(char **)extra_args = strdup(value);
> +
> +	if (!*(char **)extra_args)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static int
> +get_integer_arg(const char *key __rte_unused,
> +		const char *value, void *extra_args)
> +{
> +	uint64_t integer = 0;
> +	if (!value || !extra_args)
> +		return -EINVAL;
> +	errno = 0;
> +	integer = strtoull(value, NULL, 0);
> +	/* extra_args keeps its default value; it should be replaced
> +	 * only in case of successful parsing of the 'value' arg.
> +	 */
> +	if (errno == 0)
> +		*(uint64_t *)extra_args = integer;
> +	return -errno;
> +}
> +
> +static struct rte_cryptodev *
> +virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
> +{
> +	struct rte_cryptodev_pmd_init_params init_params = {
> +		.name = "",
> +		.private_data_size = sizeof(struct virtio_user_dev),
> +	};
> +	struct rte_cryptodev_data *data;
> +	struct rte_cryptodev *cryptodev;
> +	struct virtio_user_dev *dev;
> +	struct virtio_crypto_hw *hw;
> +
> +	init_params.socket_id = vdev->device.numa_node;
> +	init_params.private_data_size = sizeof(struct virtio_user_dev);
> +	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
> +	if (cryptodev == NULL) {
> +		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
> +		return NULL;
> +	}
> +
> +	data = cryptodev->data;
> +	dev = data->dev_private;
> +	hw = &dev->hw;
> +
> +	hw->dev_id = data->dev_id;
> +	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
> +
> +	return cryptodev;
> +}
> +
> +static void
> +virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
> +{
> +	rte_cryptodev_pmd_destroy(cryptodev);
> +}
> +
> +static int
> +virtio_user_pmd_probe(struct rte_vdev_device *vdev)
> +{
> +	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
> +	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
> +	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
> +	struct rte_cryptodev *cryptodev = NULL;
> +	struct rte_kvargs *kvlist = NULL;
> +	struct virtio_user_dev *dev;
> +	char *path = NULL;
> +	int ret = -1;
> +
> +	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
> +
> +	if (!kvlist) {
> +		PMD_INIT_LOG(ERR, "error when parsing param");
> +		goto end;
> +	}
> +
> +	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
> +		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
> +					&get_string_arg, &path) < 0) {
> +			PMD_INIT_LOG(ERR, "error to parse %s",
> +					VIRTIO_USER_ARG_PATH);
> +			goto end;
> +		}
> +	} else {
> +		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
> +				VIRTIO_USER_ARG_PATH);
> +		goto end;
> +	}
> +
> +	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
> +		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
> +					&get_integer_arg, &queues) < 0) {
> +			PMD_INIT_LOG(ERR, "error to parse %s",
> +					VIRTIO_USER_ARG_QUEUES_NUM);
> +			goto end;
> +		}
> +	}
> +
> +	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
> +		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
> +					&get_integer_arg, &queue_size) < 0) {
> +			PMD_INIT_LOG(ERR, "error to parse %s",
> +					VIRTIO_USER_ARG_QUEUE_SIZE);
> +			goto end;
> +		}
> +	}
> +
> +	cryptodev = virtio_user_cryptodev_alloc(vdev);
> +	if (!cryptodev) {
> +		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
> +		goto end;
> +	}
> +
> +	dev = cryptodev->data->dev_private;
> +	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
> +			server_mode) < 0) {
> +		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
> +		virtio_user_cryptodev_free(cryptodev);
> +		goto end;
> +	}
> +
> +	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
> +			NULL) < 0) {
> +		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
> +		crypto_virtio_user_dev_uninit(dev);
> +		virtio_user_cryptodev_free(cryptodev);
> +		goto end;
> +	}
> +
> +	rte_cryptodev_pmd_probing_finish(cryptodev);
> +
> +	ret = 0;
> +end:
> +	rte_kvargs_free(kvlist);
> +	free(path);
> +	return ret;
> +}
> +
> +static int
> +virtio_user_pmd_remove(struct rte_vdev_device *vdev)
> +{
> +	struct rte_cryptodev *cryptodev;
> +	const char *name;
> +	int devid;
> +
> +	if (!vdev)
> +		return -EINVAL;
> +
> +	name = rte_vdev_device_name(vdev);
> +	PMD_DRV_LOG(INFO, "Removing %s", name);
> +
> +	devid = rte_cryptodev_get_dev_id(name);
> +	if (devid < 0)
> +		return -EINVAL;
> +
> +	rte_cryptodev_stop(devid);
> +
> +	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
> +	if (cryptodev == NULL)
> +		return -ENODEV;
> +
> +	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
> +		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
> +		uint64_t iova, size_t len)
> +{
> +	struct rte_cryptodev *cryptodev;
> +	struct virtio_user_dev *dev;
> +	const char *name;
> +
> +	if (!vdev)
> +		return -EINVAL;
> +
> +	name = rte_vdev_device_name(vdev);
> +	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
> +	if (cryptodev == NULL)
> +		return -EINVAL;
> +
> +	dev = cryptodev->data->dev_private;
> +
> +	if (dev->ops->dma_map)
> +		return dev->ops->dma_map(dev, addr, iova, len);
> +
> +	return 0;
> +}
> +
> +static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
> +		uint64_t iova, size_t len)
> +{
> +	struct rte_cryptodev *cryptodev;
> +	struct virtio_user_dev *dev;
> +	const char *name;
> +
> +	if (!vdev)
> +		return -EINVAL;
> +
> +	name = rte_vdev_device_name(vdev);
> +	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
> +	if (cryptodev == NULL)
> +		return -EINVAL;
> +
> +	dev = cryptodev->data->dev_private;
> +
> +	if (dev->ops->dma_unmap)
> +		return dev->ops->dma_unmap(dev, addr, iova, len);
> +
> +	return 0;
> +}
> +
> +static struct rte_vdev_driver virtio_user_driver = {
> +	.probe = virtio_user_pmd_probe,
> +	.remove = virtio_user_pmd_remove,
> +	.dma_map = virtio_user_pmd_dma_map,
> +	.dma_unmap = virtio_user_pmd_dma_unmap,
> +};
> +
> +static struct cryptodev_driver virtio_crypto_drv;
> +
> +RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
> +RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
> +	virtio_user_driver.driver,
> +	cryptodev_virtio_driver_id);
> +RTE_PMD_REGISTER_ALIAS(crypto_virtio_user, crypto_virtio);
> +RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
> +	"path=<path> "
> +	"queues=<int> "
> +	"queue_size=<int>");
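
For reference, the parameter string registered above maps onto the standard EAL ``--vdev`` argument. A sketch of the resulting invocation follows; the application name and socket path are illustrative only, not taken from the patch:

```
# Any DPDK application linked with this PMD; the kvargs after the
# device name match the registered parameter string above.
<dpdk-app> --vdev "crypto_virtio_user,path=/tmp/vhost-crypto.sock,queues=1,queue_size=256"
```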


^ permalink raw reply	[relevance 0%]

* [PATCH v7 1/1] graph: mcore: optimize graph search
  2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
  2024-12-16 14:49  4%           ` David Marchand
  2025-01-20 14:36  4%           ` Huichao Cai
@ 2025-02-06  2:53 11%           ` Huichao Cai
  2025-02-06 20:10  0%             ` Patrick Robb
  2025-02-07  1:39 11%             ` [PATCH v8] " Huichao Cai
  2 siblings, 2 replies; 200+ results
From: Huichao Cai @ 2025-02-06  2:53 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue currently
searches for the destination graph with a slow linear scan on every
enqueue. Modify the search logic to record the result of the first
search in the node, and reuse that record for subsequent searches to
improve search speed.
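
The caching pattern used by this patch can be sketched in isolation as follows. The structures below are simplified, hypothetical stand-ins for the rte_graph types (a singly linked list of per-lcore graphs, and a node that memoizes its last successful lookup), not the actual DPDK definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a graph bound to an lcore. */
struct graph {
	unsigned int lcore_id;
	struct graph *next;
};

/* Simplified stand-in for a node, caching the graph found for its lcore. */
struct node {
	unsigned int lcore_id;
	struct graph *cached; /* result of the first successful search */
};

/* Return the graph matching node->lcore_id, scanning the list only
 * when the cached pointer is missing or stale. */
static struct graph *
find_graph(struct node *node, struct graph *head)
{
	struct graph *g = node->cached;

	if (g == NULL || g->lcore_id != node->lcore_id) {
		for (g = head; g != NULL; g = g->next)
			if (g->lcore_id == node->lcore_id)
				break;
		node->cached = g; /* memoize for the next call */
	}
	return g;
}
```

The staleness check (`g->lcore_id != node->lcore_id`) mirrors the patch: if the node is rescheduled to another lcore, the cache misses once and is refreshed by a fresh scan.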

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 devtools/libabigail.abignore               |  5 +++++
 doc/guides/rel_notes/release_25_03.rst     |  1 +
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..8876aaee2e 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+        name = rte_node
+        has_size_change = no
+        has_data_member_inserted_between =
+{offset_after(original_process), offset_of(xstat_off)}
\ No newline at end of file
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 269ab6f68a..16a888fd19 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -150,6 +150,7 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* graph: Added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index d3ec88519d..aef0f65673 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 
-- 
2.33.0


^ permalink raw reply	[relevance 11%]

* [PATCH v4 1/4] drivers: merge common and net idpf drivers
  @ 2025-02-05 11:55  2%   ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2025-02-05 11:55 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, Praveen Shetty, Thomas Monjalon, Jingjing Wu,
	Konstantin Ananyev

Rather than having some of the idpf code split out into the "common"
directory, where it is used by both the net/idpf and net/cpfl drivers,
we can merge all idpf code together under net/idpf and have the cpfl
driver depend on "net/idpf" rather than "common/idpf".
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Praveen Shetty <praveen.shetty@intel.com>
---
 devtools/libabigail.abignore                  |  1 +
 doc/guides/rel_notes/release_25_03.rst        |  6 ++++
 drivers/common/idpf/meson.build               | 34 -------------------
 drivers/common/meson.build                    |  1 -
 drivers/net/intel/cpfl/meson.build            |  2 +-
 .../{common => net/intel}/idpf/base/README    |  0
 .../intel}/idpf/base/idpf_alloc.h             |  0
 .../intel}/idpf/base/idpf_controlq.c          |  0
 .../intel}/idpf/base/idpf_controlq.h          |  0
 .../intel}/idpf/base/idpf_controlq_api.h      |  0
 .../intel}/idpf/base/idpf_controlq_setup.c    |  0
 .../intel}/idpf/base/idpf_devids.h            |  0
 .../intel}/idpf/base/idpf_lan_pf_regs.h       |  0
 .../intel}/idpf/base/idpf_lan_txrx.h          |  0
 .../intel}/idpf/base/idpf_lan_vf_regs.h       |  0
 .../intel}/idpf/base/idpf_osdep.h             |  0
 .../intel}/idpf/base/idpf_prototype.h         |  0
 .../intel}/idpf/base/idpf_type.h              |  0
 .../intel}/idpf/base/meson.build              |  0
 .../intel}/idpf/base/siov_regs.h              |  0
 .../intel}/idpf/base/virtchnl2.h              |  0
 .../intel}/idpf/base/virtchnl2_lan_desc.h     |  0
 .../intel}/idpf/idpf_common_device.c          |  0
 .../intel}/idpf/idpf_common_device.h          |  0
 .../intel}/idpf/idpf_common_logs.h            |  0
 .../intel}/idpf/idpf_common_rxtx.c            |  0
 .../intel}/idpf/idpf_common_rxtx.h            |  0
 .../intel}/idpf/idpf_common_rxtx_avx512.c     |  0
 .../intel}/idpf/idpf_common_virtchnl.c        |  0
 .../intel}/idpf/idpf_common_virtchnl.h        |  0
 drivers/net/intel/idpf/meson.build            | 20 +++++++++--
 .../{common => net/intel}/idpf/version.map    |  0
 drivers/net/meson.build                       |  2 +-
 33 files changed, 27 insertions(+), 39 deletions(-)
 delete mode 100644 drivers/common/idpf/meson.build
 rename drivers/{common => net/intel}/idpf/base/README (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_alloc.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_api.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_setup.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_devids.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_pf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_txrx.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_vf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_osdep.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_prototype.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_type.h (100%)
 rename drivers/{common => net/intel}/idpf/base/meson.build (100%)
 rename drivers/{common => net/intel}/idpf/base/siov_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2_lan_desc.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_logs.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx512.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.h (100%)
 rename drivers/{common => net/intel}/idpf/version.map (100%)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..1dee6a954f 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -25,6 +25,7 @@
 ;
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
+; SKIP_LIBRARY=librte_common_idpf
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Experimental APIs exceptions ;
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index a88b04d958..79b1116f6e 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -115,6 +115,12 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
+  This change should have no impact on end applications, but,
+  when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-Denable_drivers`` option,
+  there is no longer any need to also specify the ``common/idpf`` driver.
+  Note, however, that the ``net/intel/cpfl`` driver now depends on the ``net/intel/idpf`` driver.
+
 
 ABI Changes
 -----------
diff --git a/drivers/common/idpf/meson.build b/drivers/common/idpf/meson.build
deleted file mode 100644
index 46fd45c03b..0000000000
--- a/drivers/common/idpf/meson.build
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2022 Intel Corporation
-
-if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
-    subdir_done()
-endif
-
-includes += include_directories('../iavf')
-
-deps += ['mbuf']
-
-sources = files(
-        'idpf_common_device.c',
-        'idpf_common_rxtx.c',
-        'idpf_common_virtchnl.c',
-)
-
-if arch_subdir == 'x86'
-    if cc_has_avx512
-        cflags += ['-DCC_AVX512_SUPPORT']
-        avx512_args = cflags + cc_avx512_flags
-        if cc.has_argument('-march=skylake-avx512')
-            avx512_args += '-march=skylake-avx512'
-        endif
-        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
-                'idpf_common_rxtx_avx512.c',
-                dependencies: [static_rte_mbuf,],
-                include_directories: includes,
-                c_args: avx512_args)
-        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
-    endif
-endif
-
-subdir('base')
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 8734af36aa..e1e3149d8f 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -6,7 +6,6 @@ drivers = [
         'cpt',
         'dpaax',
         'iavf',
-        'idpf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/net/intel/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
index 87fcfe0bb1..1f0269d50b 100644
--- a/drivers/net/intel/cpfl/meson.build
+++ b/drivers/net/intel/cpfl/meson.build
@@ -11,7 +11,7 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-deps += ['hash', 'common_idpf']
+deps += ['hash', 'net_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
diff --git a/drivers/common/idpf/base/README b/drivers/net/intel/idpf/base/README
similarity index 100%
rename from drivers/common/idpf/base/README
rename to drivers/net/intel/idpf/base/README
diff --git a/drivers/common/idpf/base/idpf_alloc.h b/drivers/net/intel/idpf/base/idpf_alloc.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_alloc.h
rename to drivers/net/intel/idpf/base/idpf_alloc.h
diff --git a/drivers/common/idpf/base/idpf_controlq.c b/drivers/net/intel/idpf/base/idpf_controlq.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.c
rename to drivers/net/intel/idpf/base/idpf_controlq.c
diff --git a/drivers/common/idpf/base/idpf_controlq.h b/drivers/net/intel/idpf/base/idpf_controlq.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.h
rename to drivers/net/intel/idpf/base/idpf_controlq.h
diff --git a/drivers/common/idpf/base/idpf_controlq_api.h b/drivers/net/intel/idpf/base/idpf_controlq_api.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_api.h
rename to drivers/net/intel/idpf/base/idpf_controlq_api.h
diff --git a/drivers/common/idpf/base/idpf_controlq_setup.c b/drivers/net/intel/idpf/base/idpf_controlq_setup.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_setup.c
rename to drivers/net/intel/idpf/base/idpf_controlq_setup.c
diff --git a/drivers/common/idpf/base/idpf_devids.h b/drivers/net/intel/idpf/base/idpf_devids.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_devids.h
rename to drivers/net/intel/idpf/base/idpf_devids.h
diff --git a/drivers/common/idpf/base/idpf_lan_pf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_pf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
diff --git a/drivers/common/idpf/base/idpf_lan_txrx.h b/drivers/net/intel/idpf/base/idpf_lan_txrx.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_txrx.h
rename to drivers/net/intel/idpf/base/idpf_lan_txrx.h
diff --git a/drivers/common/idpf/base/idpf_lan_vf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_vf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/net/intel/idpf/base/idpf_osdep.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_osdep.h
rename to drivers/net/intel/idpf/base/idpf_osdep.h
diff --git a/drivers/common/idpf/base/idpf_prototype.h b/drivers/net/intel/idpf/base/idpf_prototype.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_prototype.h
rename to drivers/net/intel/idpf/base/idpf_prototype.h
diff --git a/drivers/common/idpf/base/idpf_type.h b/drivers/net/intel/idpf/base/idpf_type.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_type.h
rename to drivers/net/intel/idpf/base/idpf_type.h
diff --git a/drivers/common/idpf/base/meson.build b/drivers/net/intel/idpf/base/meson.build
similarity index 100%
rename from drivers/common/idpf/base/meson.build
rename to drivers/net/intel/idpf/base/meson.build
diff --git a/drivers/common/idpf/base/siov_regs.h b/drivers/net/intel/idpf/base/siov_regs.h
similarity index 100%
rename from drivers/common/idpf/base/siov_regs.h
rename to drivers/net/intel/idpf/base/siov_regs.h
diff --git a/drivers/common/idpf/base/virtchnl2.h b/drivers/net/intel/idpf/base/virtchnl2.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2.h
rename to drivers/net/intel/idpf/base/virtchnl2.h
diff --git a/drivers/common/idpf/base/virtchnl2_lan_desc.h b/drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2_lan_desc.h
rename to drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
diff --git a/drivers/common/idpf/idpf_common_device.c b/drivers/net/intel/idpf/idpf_common_device.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.c
rename to drivers/net/intel/idpf/idpf_common_device.c
diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/net/intel/idpf/idpf_common_device.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.h
rename to drivers/net/intel/idpf/idpf_common_device.h
diff --git a/drivers/common/idpf/idpf_common_logs.h b/drivers/net/intel/idpf/idpf_common_logs.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_logs.h
rename to drivers/net/intel/idpf/idpf_common_logs.h
diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/net/intel/idpf/idpf_common_rxtx.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.c
rename to drivers/net/intel/idpf/idpf_common_rxtx.c
diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/net/intel/idpf/idpf_common_rxtx.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.h
rename to drivers/net/intel/idpf/idpf_common_rxtx.h
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx512.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.c b/drivers/net/intel/idpf/idpf_common_virtchnl.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.c
rename to drivers/net/intel/idpf/idpf_common_virtchnl.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.h b/drivers/net/intel/idpf/idpf_common_virtchnl.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.h
rename to drivers/net/intel/idpf/idpf_common_virtchnl.h
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 34cbdc4da0..52405b5b35 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,13 +7,29 @@ if is_windows
     subdir_done()
 endif
 
-deps += ['common_idpf']
+includes += include_directories('../../../common/iavf')
 
 sources = files(
+        'idpf_common_device.c',
+        'idpf_common_rxtx.c',
+        'idpf_common_virtchnl.c',
+
         'idpf_ethdev.c',
         'idpf_rxtx.c',
 )
 
-if arch_subdir == 'x86'and cc_has_avx512
+if arch_subdir == 'x86' and cc_has_avx512
     cflags += ['-DCC_AVX512_SUPPORT']
+    avx512_args = cflags + cc_avx512_flags
+    if cc.has_argument('-march=skylake-avx512')
+        avx512_args += '-march=skylake-avx512'
+    endif
+    idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
+            'idpf_common_rxtx_avx512.c',
+            dependencies: static_rte_mbuf,
+            include_directories: includes,
+            c_args: avx512_args)
+    objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
 endif
+
+subdir('base')
diff --git a/drivers/common/idpf/version.map b/drivers/net/intel/idpf/version.map
similarity index 100%
rename from drivers/common/idpf/version.map
rename to drivers/net/intel/idpf/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 02a3f5a0b6..bcf6f9dc73 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,7 +24,6 @@ drivers = [
         'gve',
         'hinic',
         'hns3',
-        'intel/cpfl',
         'intel/e1000',
         'intel/fm10k',
         'intel/i40e',
@@ -34,6 +33,7 @@ drivers = [
         'intel/igc',
         'intel/ipn3ke',
         'intel/ixgbe',
+        'intel/cpfl',  # depends on idpf, so must come after it
         'ionic',
         'mana',
         'memif',
-- 
2.43.0


^ permalink raw reply	[relevance 2%]

* Re: [PATCH v1 00/42] Merge Intel IGC and E1000 drivers, and update E1000 base code
  2025-02-04 15:35  0%   ` Burakov, Anatoly
@ 2025-02-05 10:05  0%     ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev, Bruce Richardson

On Tue, Feb 4, 2025 at 4:36 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 03/02/2025 9:18, David Marchand wrote:
> > Hello Anatoly,
> >
> > On Fri, Jan 31, 2025 at 1:59 PM Anatoly Burakov
> > <anatoly.burakov@intel.com> wrote:
> >>
> >> Intel IGC and E1000 drivers are distinct, but they are actually generated
> >> from the same base code. This patchset will merge together all e1000-derived
> >> drivers into one common base, with three different ethdev driver
> >> frontends (EM, IGB, and IGC).
> >>
> >> After the merge is done, base code is also updated to latest snapshot.
> >>
> >> Adam Ludkiewicz (1):
> >>    net/e1000/base: add WoL definitions
> >>
> >> Aleksandr Loktionov (1):
> >>    net/e1000/base: fix mac addr hash bit_shift
> >>
> >> Amir Avivi (1):
> >>    net/e1000/base: fix iterator type
> >>
> >> Anatoly Burakov (13):
> >>    net/e1000/base: add initial support for i225
> >>    net/e1000/base: add link bringup support for i225
> >>    net/e1000/base: add LED blink support for i225
> >>    net/e1000/base: add NVM/EEPROM support for i225
> >>    net/e1000/base: add LTR support in i225
> >>    net/e1000/base: add eee support for i225
> >>    net/e1000/base: add misc definitions for i225
> >>    net/e1000: merge igc with e1000
> >>    net/e1000: add missing i225 devices
> >>    net/e1000: add missing hardware support
> >>    net/e1000/base: correct minor formatting issues
> >>    net/e1000/base: correct mPHY access logic
> >>    net/e1000/base: update readme
> >>
> >> Barbara Skobiej (2):
> >>    net/e1000/base: fix reset for 82580
> >>    net/e1000/base: fix data type in MAC hash
> >>
> >> Carolyn Wyborny (1):
> >>    net/e1000/base: skip MANC check for 82575
> >>
> >> Dima Ruinskiy (4):
> >>    net/e1000/base: make e1000_access_phy_wakeup_reg_bm non-static
> >>    net/e1000/base: make debug prints more informative
> >>    net/e1000/base: hardcode bus parameters for ICH8
> >>    net/e1000/base: fix unchecked return
> >>
> >> Evgeny Efimov (1):
> >>    net/e1000/base: add EEE common API function
> >>
> >> Jakub Buchocki (1):
> >>    net/e1000/base: fix uninitialized variable usage
> >>
> >> Marcin Jurczak (1):
> >>    net/e1000/base: remove non-inclusive language
> >>
> >> Nir Efrati (6):
> >>    net/e1000/base: workaround for packet loss
> >>    net/e1000/base: add definition for EXFWSM register
> >>    net/e1000/base: use longer ULP exit timeout on more HW
> >>    net/e1000/base: remove redundant access to RO register
> >>    net/e1000/base: introduce PHY ID retry mechanism
> >>    net/e1000/base: add PHY read/write retry mechanism
> >>
> >> Pawel Malinowski (1):
> >>    net/e1000/base: fix semaphore timeout value
> >>
> >> Piotr Kubaj (1):
> >>    net/e1000/base: rename NVM version variable
> >>
> >> Piotr Pietruszewski (1):
> >>    net/e1000/base: improve code flow in ICH8LAN
> >>
> >> Przemyslaw Ciesielski (1):
> >>    net/e1000/base: fix static analysis warnings
> >>
> >> Sasha Neftin (4):
> >>    net/e1000/base: add queue select definitions
> >>    net/e1000/base: add profile information field
> >>    net/e1000/base: add LPI counters
> >>    net/e1000/base: improve NVM checksum handling
> >>
> >> Vitaly Lifshits (2):
> >>    net/e1000: add support for more I219 devices
> >>    net/e1000/base: correct disable k1 logic
> >>
> >>   drivers/net/intel/e1000/base/README           |    8 +-
> >>   .../net/intel/e1000/base/e1000_80003es2lan.c  |   10 +-
> >>   drivers/net/intel/e1000/base/e1000_82571.c    |    4 +-
> >>   drivers/net/intel/e1000/base/e1000_82575.c    |   21 +-
> >>   drivers/net/intel/e1000/base/e1000_82575.h    |   29 -
> >>   drivers/net/intel/e1000/base/e1000_api.c      |   76 +-
> >>   drivers/net/intel/e1000/base/e1000_api.h      |    4 +-
> >>   drivers/net/intel/e1000/base/e1000_base.c     |    3 +-
> >>   drivers/net/intel/e1000/base/e1000_defines.h  |  259 +-
> >>   drivers/net/intel/e1000/base/e1000_hw.h       |   86 +-
> >>   drivers/net/intel/e1000/base/e1000_i210.c     |   14 +-
> >>   drivers/net/intel/e1000/base/e1000_i210.h     |    4 +
> >>   drivers/net/intel/e1000/base/e1000_i225.c     | 1384 ++++++
> >>   drivers/net/intel/e1000/base/e1000_i225.h     |  117 +
> >>   drivers/net/intel/e1000/base/e1000_ich8lan.c  |  224 +-
> >>   drivers/net/intel/e1000/base/e1000_ich8lan.h  |    3 +-
> >>   drivers/net/intel/e1000/base/e1000_mac.c      |   62 +-
> >>   drivers/net/intel/e1000/base/e1000_mac.h      |    2 +-
> >>   drivers/net/intel/e1000/base/e1000_nvm.c      |    7 +-
> >>   drivers/net/intel/e1000/base/e1000_osdep.h    |   33 +-
> >>   drivers/net/intel/e1000/base/e1000_phy.c      |  447 +-
> >>   drivers/net/intel/e1000/base/e1000_phy.h      |   21 +
> >>   drivers/net/intel/e1000/base/e1000_regs.h     |   48 +-
> >>   drivers/net/intel/e1000/base/e1000_vf.c       |   14 +-
> >>   drivers/net/intel/e1000/base/meson.build      |    1 +
> >>   drivers/net/intel/e1000/em_ethdev.c           |   36 +-
> >>   drivers/net/intel/e1000/igb_ethdev.c          |    1 +
> >>   drivers/net/intel/{igc => e1000}/igc_ethdev.c |  914 ++--
> >>   drivers/net/intel/{igc => e1000}/igc_ethdev.h |   32 +-
> >>   drivers/net/intel/{igc => e1000}/igc_filter.c |   84 +-
> >>   drivers/net/intel/{igc => e1000}/igc_filter.h |    0
> >>   drivers/net/intel/{igc => e1000}/igc_flow.c   |    2 +-
> >>   drivers/net/intel/{igc => e1000}/igc_flow.h   |    0
> >>   drivers/net/intel/{igc => e1000}/igc_logs.c   |    2 +-
> >>   drivers/net/intel/{igc => e1000}/igc_txrx.c   |  376 +-
> >>   drivers/net/intel/{igc => e1000}/igc_txrx.h   |    6 +-
> >>   drivers/net/intel/e1000/meson.build           |   11 +
> >>   drivers/net/intel/igc/base/README             |   29 -
> >>   drivers/net/intel/igc/base/igc_82571.h        |   36 -
> >>   drivers/net/intel/igc/base/igc_82575.h        |  351 --
> >>   drivers/net/intel/igc/base/igc_api.c          | 1853 -------
> >>   drivers/net/intel/igc/base/igc_api.h          |  111 -
> >>   drivers/net/intel/igc/base/igc_base.c         |  190 -
> >>   drivers/net/intel/igc/base/igc_base.h         |  127 -
> >>   drivers/net/intel/igc/base/igc_defines.h      | 1670 -------
> >>   drivers/net/intel/igc/base/igc_hw.h           | 1059 ----
> >>   drivers/net/intel/igc/base/igc_i225.c         | 1372 -----
> >>   drivers/net/intel/igc/base/igc_i225.h         |  110 -
> >>   drivers/net/intel/igc/base/igc_ich8lan.h      |  296 --
> >>   drivers/net/intel/igc/base/igc_mac.c          | 2100 --------
> >>   drivers/net/intel/igc/base/igc_mac.h          |   64 -
> >>   drivers/net/intel/igc/base/igc_manage.c       |  547 --
> >>   drivers/net/intel/igc/base/igc_manage.h       |   65 -
> >>   drivers/net/intel/igc/base/igc_nvm.c          | 1324 -----
> >>   drivers/net/intel/igc/base/igc_nvm.h          |   69 -
> >>   drivers/net/intel/igc/base/igc_osdep.c        |   64 -
> >>   drivers/net/intel/igc/base/igc_osdep.h        |  163 -
> >>   drivers/net/intel/igc/base/igc_phy.c          | 4420 -----------------
> >>   drivers/net/intel/igc/base/igc_phy.h          |  337 --
> >>   drivers/net/intel/igc/base/igc_regs.h         |  732 ---
> >>   drivers/net/intel/igc/base/meson.build        |   19 -
> >>   drivers/net/intel/igc/igc_logs.h              |   43 -
> >>   drivers/net/intel/igc/meson.build             |   21 -
> >>   drivers/net/meson.build                       |    1 -
> >>   64 files changed, 3300 insertions(+), 18218 deletions(-)
> >>   create mode 100644 drivers/net/intel/e1000/base/e1000_i225.c
> >>   create mode 100644 drivers/net/intel/e1000/base/e1000_i225.h
> >>   rename drivers/net/intel/{igc => e1000}/igc_ethdev.c (73%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_ethdev.h (91%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_filter.c (81%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_filter.h (100%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_flow.c (99%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_flow.h (100%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_logs.c (90%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_txrx.c (87%)
> >>   rename drivers/net/intel/{igc => e1000}/igc_txrx.h (97%)
> >>   delete mode 100644 drivers/net/intel/igc/base/README
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_82571.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_82575.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_api.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_api.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_base.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_base.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_defines.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_hw.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_i225.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_i225.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_ich8lan.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_mac.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_mac.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_manage.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_manage.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_nvm.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_nvm.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_osdep.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_osdep.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_phy.c
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_phy.h
> >>   delete mode 100644 drivers/net/intel/igc/base/igc_regs.h
> >>   delete mode 100644 drivers/net/intel/igc/base/meson.build
> >>   delete mode 100644 drivers/net/intel/igc/igc_logs.h
> >>   delete mode 100644 drivers/net/intel/igc/meson.build
> >
> > Consolidation is a good thing, there are two small issues with this
> > series though:
> > - the ABI check (as it tracks all .so) reports that librte_net_igc.so
> > disappeared: this will need some waiving, like Bruce did in his
> > series: https://patchwork.dpdk.org/project/dpdk/patch/20250130151222.944561-2-bruce.richardson@intel.com/
> > - with this merge, "users" can't select net/igc compilation anymore
> > and need to be aware that igc support now requires enabling net/e1000,
> > please update the release notes to make this visible,
> >
> > There is also a strange build failure for mingw (see github test report).
> >
>
> MinGW issues were fixed, abigail changes and release notes will come in
> (now) V3.

There are quite many warnings on this series.

The main problem is that v2 has some meson style issues:
$ ./devtools/check-meson.py
Error: Incorrect indent at drivers/net/intel/e1000/meson.build:20
Error: Incorrect indent at drivers/net/intel/e1000/meson.build:21
Error: Incorrect indent at drivers/net/intel/e1000/meson.build:22
Error: Incorrect indent at drivers/net/intel/e1000/meson.build:23
Error: Incorrect indent at drivers/net/intel/e1000/meson.build:24

And this is what blocked testing at UNH.
Please fix for v3.


There are also authorship/Sob issues, those can be seen with
./devtools/checkpatches.sh and ./devtools/check-git-log.sh.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v1 00/42] Merge Intel IGC and E1000 drivers, and update E1000 base code
  2025-02-03  8:18  3% ` David Marchand
@ 2025-02-04 15:35  0%   ` Burakov, Anatoly
  2025-02-05 10:05  0%     ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Burakov, Anatoly @ 2025-02-04 15:35 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Bruce Richardson

On 03/02/2025 9:18, David Marchand wrote:
> Hello Anatoly,
> 
> On Fri, Jan 31, 2025 at 1:59 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>>
>> Intel IGC and E1000 drivers are distinct, but they are actually generated
>> from the same base code. This patchset will merge together all e1000-derived
>> drivers into one common base, with three different ethdev driver
>> frontends (EM, IGB, and IGC).
>>
>> After the merge is done, base code is also updated to latest snapshot.
>>
>> Adam Ludkiewicz (1):
>>    net/e1000/base: add WoL definitions
>>
>> Aleksandr Loktionov (1):
>>    net/e1000/base: fix mac addr hash bit_shift
>>
>> Amir Avivi (1):
>>    net/e1000/base: fix iterator type
>>
>> Anatoly Burakov (13):
>>    net/e1000/base: add initial support for i225
>>    net/e1000/base: add link bringup support for i225
>>    net/e1000/base: add LED blink support for i225
>>    net/e1000/base: add NVM/EEPROM support for i225
>>    net/e1000/base: add LTR support in i225
>>    net/e1000/base: add eee support for i225
>>    net/e1000/base: add misc definitions for i225
>>    net/e1000: merge igc with e1000
>>    net/e1000: add missing i225 devices
>>    net/e1000: add missing hardware support
>>    net/e1000/base: correct minor formatting issues
>>    net/e1000/base: correct mPHY access logic
>>    net/e1000/base: update readme
>>
>> Barbara Skobiej (2):
>>    net/e1000/base: fix reset for 82580
>>    net/e1000/base: fix data type in MAC hash
>>
>> Carolyn Wyborny (1):
>>    net/e1000/base: skip MANC check for 82575
>>
>> Dima Ruinskiy (4):
>>    net/e1000/base: make e1000_access_phy_wakeup_reg_bm non-static
>>    net/e1000/base: make debug prints more informative
>>    net/e1000/base: hardcode bus parameters for ICH8
>>    net/e1000/base: fix unchecked return
>>
>> Evgeny Efimov (1):
>>    net/e1000/base: add EEE common API function
>>
>> Jakub Buchocki (1):
>>    net/e1000/base: fix uninitialized variable usage
>>
>> Marcin Jurczak (1):
>>    net/e1000/base: remove non-inclusive language
>>
>> Nir Efrati (6):
>>    net/e1000/base: workaround for packet loss
>>    net/e1000/base: add definition for EXFWSM register
>>    net/e1000/base: use longer ULP exit timeout on more HW
>>    net/e1000/base: remove redundant access to RO register
>>    net/e1000/base: introduce PHY ID retry mechanism
>>    net/e1000/base: add PHY read/write retry mechanism
>>
>> Pawel Malinowski (1):
>>    net/e1000/base: fix semaphore timeout value
>>
>> Piotr Kubaj (1):
>>    net/e1000/base: rename NVM version variable
>>
>> Piotr Pietruszewski (1):
>>    net/e1000/base: improve code flow in ICH8LAN
>>
>> Przemyslaw Ciesielski (1):
>>    net/e1000/base: fix static analysis warnings
>>
>> Sasha Neftin (4):
>>    net/e1000/base: add queue select definitions
>>    net/e1000/base: add profile information field
>>    net/e1000/base: add LPI counters
>>    net/e1000/base: improve NVM checksum handling
>>
>> Vitaly Lifshits (2):
>>    net/e1000: add support for more I219 devices
>>    net/e1000/base: correct disable k1 logic
>>
>>   drivers/net/intel/e1000/base/README           |    8 +-
>>   .../net/intel/e1000/base/e1000_80003es2lan.c  |   10 +-
>>   drivers/net/intel/e1000/base/e1000_82571.c    |    4 +-
>>   drivers/net/intel/e1000/base/e1000_82575.c    |   21 +-
>>   drivers/net/intel/e1000/base/e1000_82575.h    |   29 -
>>   drivers/net/intel/e1000/base/e1000_api.c      |   76 +-
>>   drivers/net/intel/e1000/base/e1000_api.h      |    4 +-
>>   drivers/net/intel/e1000/base/e1000_base.c     |    3 +-
>>   drivers/net/intel/e1000/base/e1000_defines.h  |  259 +-
>>   drivers/net/intel/e1000/base/e1000_hw.h       |   86 +-
>>   drivers/net/intel/e1000/base/e1000_i210.c     |   14 +-
>>   drivers/net/intel/e1000/base/e1000_i210.h     |    4 +
>>   drivers/net/intel/e1000/base/e1000_i225.c     | 1384 ++++++
>>   drivers/net/intel/e1000/base/e1000_i225.h     |  117 +
>>   drivers/net/intel/e1000/base/e1000_ich8lan.c  |  224 +-
>>   drivers/net/intel/e1000/base/e1000_ich8lan.h  |    3 +-
>>   drivers/net/intel/e1000/base/e1000_mac.c      |   62 +-
>>   drivers/net/intel/e1000/base/e1000_mac.h      |    2 +-
>>   drivers/net/intel/e1000/base/e1000_nvm.c      |    7 +-
>>   drivers/net/intel/e1000/base/e1000_osdep.h    |   33 +-
>>   drivers/net/intel/e1000/base/e1000_phy.c      |  447 +-
>>   drivers/net/intel/e1000/base/e1000_phy.h      |   21 +
>>   drivers/net/intel/e1000/base/e1000_regs.h     |   48 +-
>>   drivers/net/intel/e1000/base/e1000_vf.c       |   14 +-
>>   drivers/net/intel/e1000/base/meson.build      |    1 +
>>   drivers/net/intel/e1000/em_ethdev.c           |   36 +-
>>   drivers/net/intel/e1000/igb_ethdev.c          |    1 +
>>   drivers/net/intel/{igc => e1000}/igc_ethdev.c |  914 ++--
>>   drivers/net/intel/{igc => e1000}/igc_ethdev.h |   32 +-
>>   drivers/net/intel/{igc => e1000}/igc_filter.c |   84 +-
>>   drivers/net/intel/{igc => e1000}/igc_filter.h |    0
>>   drivers/net/intel/{igc => e1000}/igc_flow.c   |    2 +-
>>   drivers/net/intel/{igc => e1000}/igc_flow.h   |    0
>>   drivers/net/intel/{igc => e1000}/igc_logs.c   |    2 +-
>>   drivers/net/intel/{igc => e1000}/igc_txrx.c   |  376 +-
>>   drivers/net/intel/{igc => e1000}/igc_txrx.h   |    6 +-
>>   drivers/net/intel/e1000/meson.build           |   11 +
>>   drivers/net/intel/igc/base/README             |   29 -
>>   drivers/net/intel/igc/base/igc_82571.h        |   36 -
>>   drivers/net/intel/igc/base/igc_82575.h        |  351 --
>>   drivers/net/intel/igc/base/igc_api.c          | 1853 -------
>>   drivers/net/intel/igc/base/igc_api.h          |  111 -
>>   drivers/net/intel/igc/base/igc_base.c         |  190 -
>>   drivers/net/intel/igc/base/igc_base.h         |  127 -
>>   drivers/net/intel/igc/base/igc_defines.h      | 1670 -------
>>   drivers/net/intel/igc/base/igc_hw.h           | 1059 ----
>>   drivers/net/intel/igc/base/igc_i225.c         | 1372 -----
>>   drivers/net/intel/igc/base/igc_i225.h         |  110 -
>>   drivers/net/intel/igc/base/igc_ich8lan.h      |  296 --
>>   drivers/net/intel/igc/base/igc_mac.c          | 2100 --------
>>   drivers/net/intel/igc/base/igc_mac.h          |   64 -
>>   drivers/net/intel/igc/base/igc_manage.c       |  547 --
>>   drivers/net/intel/igc/base/igc_manage.h       |   65 -
>>   drivers/net/intel/igc/base/igc_nvm.c          | 1324 -----
>>   drivers/net/intel/igc/base/igc_nvm.h          |   69 -
>>   drivers/net/intel/igc/base/igc_osdep.c        |   64 -
>>   drivers/net/intel/igc/base/igc_osdep.h        |  163 -
>>   drivers/net/intel/igc/base/igc_phy.c          | 4420 -----------------
>>   drivers/net/intel/igc/base/igc_phy.h          |  337 --
>>   drivers/net/intel/igc/base/igc_regs.h         |  732 ---
>>   drivers/net/intel/igc/base/meson.build        |   19 -
>>   drivers/net/intel/igc/igc_logs.h              |   43 -
>>   drivers/net/intel/igc/meson.build             |   21 -
>>   drivers/net/meson.build                       |    1 -
>>   64 files changed, 3300 insertions(+), 18218 deletions(-)
>>   create mode 100644 drivers/net/intel/e1000/base/e1000_i225.c
>>   create mode 100644 drivers/net/intel/e1000/base/e1000_i225.h
>>   rename drivers/net/intel/{igc => e1000}/igc_ethdev.c (73%)
>>   rename drivers/net/intel/{igc => e1000}/igc_ethdev.h (91%)
>>   rename drivers/net/intel/{igc => e1000}/igc_filter.c (81%)
>>   rename drivers/net/intel/{igc => e1000}/igc_filter.h (100%)
>>   rename drivers/net/intel/{igc => e1000}/igc_flow.c (99%)
>>   rename drivers/net/intel/{igc => e1000}/igc_flow.h (100%)
>>   rename drivers/net/intel/{igc => e1000}/igc_logs.c (90%)
>>   rename drivers/net/intel/{igc => e1000}/igc_txrx.c (87%)
>>   rename drivers/net/intel/{igc => e1000}/igc_txrx.h (97%)
>>   delete mode 100644 drivers/net/intel/igc/base/README
>>   delete mode 100644 drivers/net/intel/igc/base/igc_82571.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_82575.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_api.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_api.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_base.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_base.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_defines.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_hw.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_i225.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_i225.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_ich8lan.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_mac.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_mac.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_manage.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_manage.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_nvm.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_nvm.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_osdep.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_osdep.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_phy.c
>>   delete mode 100644 drivers/net/intel/igc/base/igc_phy.h
>>   delete mode 100644 drivers/net/intel/igc/base/igc_regs.h
>>   delete mode 100644 drivers/net/intel/igc/base/meson.build
>>   delete mode 100644 drivers/net/intel/igc/igc_logs.h
>>   delete mode 100644 drivers/net/intel/igc/meson.build
> 
> Consolidation is a good thing, there are two small issues with this
> series though:
> - the ABI check (as it tracks all .so) reports that librte_net_igc.so
> disappeared: this will need some waiving, like Bruce did in his
> series: https://patchwork.dpdk.org/project/dpdk/patch/20250130151222.944561-2-bruce.richardson@intel.com/
> - with this merge, "users" can't select net/igc compilation anymore
> and need to be aware that igc support now requires enabling net/e1000,
> please update the release notes to make this visible,
> 
> There is also a strange build failure for mingw (see github test report).
> 

MinGW issues were fixed, abigail changes and release notes will come in 
(now) V3.

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 0%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-02-03  4:37  0%                               ` Naga Harish K, S V
@ 2025-02-04  7:15  0%                                 ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2025-02-04  7:15 UTC (permalink / raw)
  To: Naga Harish K, S V, Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Liang Ma, Mccarthy, Peter,
	Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> Sent: Monday, February 3, 2025 10:08 AM
> To: Jerin Jacob <jerinj@marvell.com>; Shijith Thotton <sthotton@marvell.com>;
> dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak, Pravin
> <pravin.pathak@intel.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Sachin Saxena <sachin.saxena@nxp.com>; Mattias R_nnblom
> <mattias.ronnblom@ericsson.com>; Liang Ma <liangma@liangbit.com>;
> Mccarthy, Peter <peter.mccarthy@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to configure
> multiple Rx queues
> 
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob <jerinj@marvell.com>
> > Sent: Thursday, January 30, 2025 10:18 PM
> > To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith Thotton
> > <sthotton@marvell.com>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > Mattias R_nnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> > Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > <amitprakashs@marvell.com>; Burakov, Anatoly
> > <anatoly.burakov@intel.com>
> > Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple
> > Rx queues
> >
> >
> > > -----Original Message-----
> > > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > > Sent: Thursday, January 30, 2025 9:01 PM
> > > To: Jerin Jacob <jerinj@marvell.com>; Shijith Thotton
> > > <sthotton@marvell.com>; dev@dpdk.org
> > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > > <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> > > Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > <anatoly.burakov@intel.com>
> > > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to
> > > configure multiple Rx queues
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob <jerinj@marvell.com>
> > > > Sent: Wednesday, January 29, 2025 1:13 PM
> > > > To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith
> > > > Thotton <sthotton@marvell.com>; dev@dpdk.org
> > > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> > > > Pathak, Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > > > <liangma@liangbit.com>; Mccarthy, Peter
> > > > <peter.mccarthy@intel.com>; Van Haaren, Harry
> > > > <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > > <anatoly.burakov@intel.com>
> > > > Subject: RE: [RFC PATCH] eventdev: adapter API to configure
> > > > multiple Rx queues
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > > > > Sent: Wednesday, January 29, 2025 10:35 AM
> > > > > To: Shijith Thotton <sthotton@marvell.com>; dev@dpdk.org
> > > > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> > > > > Pathak, Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Jerin Jacob
> > > > > <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy,
> > > > > Peter <peter.mccarthy@intel.com>; Van Haaren, Harry
> > > > > <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > > > <anatoly.burakov@intel.com>
> > > > > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to
> > > > > configure multiple Rx queues
> > > > > > >
> > > > > > >This requires a change to the
> > > > > > >rte_event_eth_rx_adapter_queue_add()
> > > > > > >stable API parameters.
> > > > > > >This is an ABI breakage and may not be possible now.
> > > > > > >It requires changes to many current applications that are
> > > > > > >using the
> > > > > > >rte_event_eth_rx_adapter_queue_add() stable API.
> > > > > > >
> > > > > >
> > > > > > What I meant by mapping was to retain the stable API
> > > > > > parameters as they
> > > > are.
> > > > > > Internally, the API can use the proposed eventdev PMD
> > > > > > operation
> > > > > > (eth_rx_adapter_queues_add) without causing an ABI break, as
> > > > > > shown
> > > > below.
> > > > > >
> > > > > > int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
> > > > > >                 int32_t rx_queue_id,
> > > > > >                 const struct rte_event_eth_rx_adapter_queue_conf *conf)
> > > > > > {
> > > > > >         if (rx_queue_id == -1)
> > > > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > > > >                         dev, &rte_eth_devices[eth_dev_id], 0,
> > > > > >                         conf, 0);
> > > > > >         else
> > > > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > > > >                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
> > > > > >                         conf, 1);
> > > > > > }
> > > > > >
> > > > > > With above change, old op (eth_rx_adapter_queue_add) can be
> > > > > > removed as both API (stable and proposed) will be using
> > > > eth_rx_adapter_queues_add.
> > > >
> > > >
> > > > Since this thread is not converging, and that looks like it is due to
> > > > confusion, I am trying to summarize my understanding and define the
> > > > next steps (if needed, we can take it to the tech board if there is
> > > > no consensus).
> > > >
> > > >
> > > > Problem statement:
> > > > ==================
> > > > 1) Implementations of rte_event_eth_rx_adapter_queue_add() in HW
> > > > typically use an administrative function to enable it. Typically,
> > > > this translates to sending a mailbox message to the PF driver etc.
> > > > So, this function takes "time" to complete in HW implementations.
> > > > 2) For SW implementations, this won't take time as there are no
> > > > other actors involved.
> > > > 3) There are customer use cases that add 300+
> > > > rte_event_eth_rx_adapter_queue_add() calls at application bootup,
> > > > which introduces significant boot time for the application.
> > > > The number of queues is a function of the number of ethdev ports,
> > > > the number of ethdev Rx queues per port and the number of event queues.
> > > >
> > > >
> > > > Expected outcome of problem statement:
> > > > ======================================
> > > > 1) In cases where the application knows the queue mapping up front
> > > > (typically at boot time), it can call a burst variant of the
> > > > rte_event_eth_rx_adapter_queue_add()
> > > > function
> > > > to amortize the cost. DPDK uses a similar scheme in control path APIs
> > > > where latency is critical, like rte_acl_add_rules() or rte_flow
> > > > via the template scheme.
> > > > 2) Solution should not break ABI or any impact to SW drivers.
> > > > 3) Avoid duplicating the code as much as possible
> > > >
> > > >
> > > > Proposed solution:
> > > > ==================
> > > > 1) Update eventdev_eth_rx_adapter_queue_add_t() PMD (Internal ABI)
> > > > API to take burst parameters
> > > > 2) Add new rte_event_eth_rx_adapter_queue*s*_add() function and
> > > > wire to use updated PMD API
> > > > 3) Use rte_event_eth_rx_adapter_queue_add() as
> > > > rte_event_eth_rx_adapter_queue*s*_add(...., 1)
> > > >
> > > > If so, I am not sure what is the cons of this approach, it will
> > > > let to have optimized applications when
> > > > a) Application knows the queue mapping at priorly (typically in
> > > > boot
> > > > time)
> > > > b) Allow HW drivers to optimize without breaking anything for SW
> > > > drivers
> > > > c) Provide applications to decide burst vs non burst selection
> > > > based on the needed and performance requirements
> > >
> > > The proposed API benefits only some hardware platforms that have
> > > optimized the "queue_add" eventdev PMD implementation for burst mode.
> > > It may not benefit SW drivers/other HW platforms.
> >
> > The spirit is to have ONE API for all drivers (SW or HW). If one driver
> > is not able to leverage a feature, that is OK as long as it is NOT
> > breaking anything. We have been accommodating a ton of capabilities (like
> > RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
> > and SW driver specific public APIs (like
> > rte_event_eth_rx_adapter_service_id_get()) to have a common API. As long
> > as they do not break each other and the application has clarity on the
> > usage (when to use the API), I don't see any issue. Do you see any
> > issue with that forward progress approach?
> >
> 
> This approach is fine, as long as it is not breaking the other platforms.

Agree

> 
> >
> > > There will not be much difference in calling the existing API
> > > (rte_event_eth_rx_adapter_queue_add()) in a loop vs using the new
> > > API for the above cases.
> >
> > That is just an implementation view, right? I have explained in the
> > problem statement why that is not the case for some drivers. (Even a SW
> > driver can leverage such a burst function using SIMD etc., if that
> > driver wants to.)
> >
> 
> Not just from the implementation point of view, but from the latency
> improvement point of view also.
> Anyway, I am fine with the new API approach.

Thanks

> 
> > >
> > > If the new proposed API benefits all platforms, then it is useful.
> >
> > See above.
> >
> >
> > > This is the point I am making from the beginning, it is not captured
> > > in the summary.

^ permalink raw reply	[relevance 0%]

* RE: [PATCH v3 1/4] drivers: merge common and net idpf drivers
  2025-01-30 15:12  2%   ` [PATCH v3 1/4] drivers: merge common and net " Bruce Richardson
@ 2025-02-03  8:36  2%     ` Shetty, Praveen
  0 siblings, 0 replies; 200+ results
From: Shetty, Praveen @ 2025-02-03  8:36 UTC (permalink / raw)
  To: Richardson, Bruce, dev

Rather than having some of the idpf code split out into the "common"
directory, used by both a net/idpf and a net/cpfl driver, we can merge all idpf code together under net/idpf and have the cpfl driver depend on "net/idpf" rather than "common/idpf".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 devtools/libabigail.abignore                  |  1 +
 doc/guides/rel_notes/release_25_03.rst        |  6 ++++
 drivers/common/idpf/meson.build               | 34 -------------------
 drivers/common/meson.build                    |  1 -
 drivers/net/intel/cpfl/meson.build            |  2 +-
 .../{common => net/intel}/idpf/base/README    |  0
 .../intel}/idpf/base/idpf_alloc.h             |  0
 .../intel}/idpf/base/idpf_controlq.c          |  0
 .../intel}/idpf/base/idpf_controlq.h          |  0
 .../intel}/idpf/base/idpf_controlq_api.h      |  0
 .../intel}/idpf/base/idpf_controlq_setup.c    |  0
 .../intel}/idpf/base/idpf_devids.h            |  0
 .../intel}/idpf/base/idpf_lan_pf_regs.h       |  0
 .../intel}/idpf/base/idpf_lan_txrx.h          |  0
 .../intel}/idpf/base/idpf_lan_vf_regs.h       |  0
 .../intel}/idpf/base/idpf_osdep.h             |  0
 .../intel}/idpf/base/idpf_prototype.h         |  0
 .../intel}/idpf/base/idpf_type.h              |  0
 .../intel}/idpf/base/meson.build              |  0
 .../intel}/idpf/base/siov_regs.h              |  0
 .../intel}/idpf/base/virtchnl2.h              |  0
 .../intel}/idpf/base/virtchnl2_lan_desc.h     |  0
 .../intel}/idpf/idpf_common_device.c          |  0
 .../intel}/idpf/idpf_common_device.h          |  0
 .../intel}/idpf/idpf_common_logs.h            |  0
 .../intel}/idpf/idpf_common_rxtx.c            |  0
 .../intel}/idpf/idpf_common_rxtx.h            |  0
 .../intel}/idpf/idpf_common_rxtx_avx512.c     |  0
 .../intel}/idpf/idpf_common_virtchnl.c        |  0
 .../intel}/idpf/idpf_common_virtchnl.h        |  0
 drivers/net/intel/idpf/meson.build            | 20 +++++++++--
 .../{common => net/intel}/idpf/version.map    |  0
 drivers/net/meson.build                       |  2 +-
 33 files changed, 27 insertions(+), 39 deletions(-)
 delete mode 100644 drivers/common/idpf/meson.build
 rename drivers/{common => net/intel}/idpf/base/README (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_alloc.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_api.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_setup.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_devids.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_pf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_txrx.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_vf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_osdep.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_prototype.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_type.h (100%)
 rename drivers/{common => net/intel}/idpf/base/meson.build (100%)
 rename drivers/{common => net/intel}/idpf/base/siov_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2_lan_desc.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_logs.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx512.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.h (100%)
 rename drivers/{common => net/intel}/idpf/version.map (100%)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..1dee6a954f 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -25,6 +25,7 @@
 ;
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
+; SKIP_LIBRARY=librte_common_idpf
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Experimental APIs exceptions ;
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index a88b04d958..79b1116f6e 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -115,6 +115,12 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
+  This change should have no impact to end applications, but,
+  when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-Denable_drivers`` option,
+  there is no longer any need to also specify the ``common/idpf`` driver.
+  Note, however, ``net/intel/cpfl`` driver now depends upon the ``net/intel/idpf`` driver.
+
 
 ABI Changes
 -----------
diff --git a/drivers/common/idpf/meson.build b/drivers/common/idpf/meson.build
deleted file mode 100644
index 46fd45c03b..0000000000
--- a/drivers/common/idpf/meson.build
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2022 Intel Corporation
-
-if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
-    subdir_done()
-endif
-
-includes += include_directories('../iavf')
-
-deps += ['mbuf']
-
-sources = files(
-        'idpf_common_device.c',
-        'idpf_common_rxtx.c',
-        'idpf_common_virtchnl.c',
-)
-
-if arch_subdir == 'x86'
-    if cc_has_avx512
-        cflags += ['-DCC_AVX512_SUPPORT']
-        avx512_args = cflags + cc_avx512_flags
-        if cc.has_argument('-march=skylake-avx512')
-            avx512_args += '-march=skylake-avx512'
-        endif
-        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
-                'idpf_common_rxtx_avx512.c',
-                dependencies: [static_rte_mbuf,],
-                include_directories: includes,
-                c_args: avx512_args)
-        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
-    endif
-endif
-
-subdir('base')
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 8734af36aa..e1e3149d8f 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -6,7 +6,6 @@ drivers = [
         'cpt',
         'dpaax',
         'iavf',
-        'idpf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/net/intel/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
index 87fcfe0bb1..1f0269d50b 100644
--- a/drivers/net/intel/cpfl/meson.build
+++ b/drivers/net/intel/cpfl/meson.build
@@ -11,7 +11,7 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-deps += ['hash', 'common_idpf']
+deps += ['hash', 'net_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
diff --git a/drivers/common/idpf/base/README b/drivers/net/intel/idpf/base/README
similarity index 100%
rename from drivers/common/idpf/base/README
rename to drivers/net/intel/idpf/base/README
diff --git a/drivers/common/idpf/base/idpf_alloc.h b/drivers/net/intel/idpf/base/idpf_alloc.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_alloc.h
rename to drivers/net/intel/idpf/base/idpf_alloc.h
diff --git a/drivers/common/idpf/base/idpf_controlq.c b/drivers/net/intel/idpf/base/idpf_controlq.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.c
rename to drivers/net/intel/idpf/base/idpf_controlq.c
diff --git a/drivers/common/idpf/base/idpf_controlq.h b/drivers/net/intel/idpf/base/idpf_controlq.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.h
rename to drivers/net/intel/idpf/base/idpf_controlq.h
diff --git a/drivers/common/idpf/base/idpf_controlq_api.h b/drivers/net/intel/idpf/base/idpf_controlq_api.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_api.h
rename to drivers/net/intel/idpf/base/idpf_controlq_api.h
diff --git a/drivers/common/idpf/base/idpf_controlq_setup.c b/drivers/net/intel/idpf/base/idpf_controlq_setup.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_setup.c
rename to drivers/net/intel/idpf/base/idpf_controlq_setup.c
diff --git a/drivers/common/idpf/base/idpf_devids.h b/drivers/net/intel/idpf/base/idpf_devids.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_devids.h
rename to drivers/net/intel/idpf/base/idpf_devids.h
diff --git a/drivers/common/idpf/base/idpf_lan_pf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_pf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
diff --git a/drivers/common/idpf/base/idpf_lan_txrx.h b/drivers/net/intel/idpf/base/idpf_lan_txrx.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_txrx.h
rename to drivers/net/intel/idpf/base/idpf_lan_txrx.h
diff --git a/drivers/common/idpf/base/idpf_lan_vf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_vf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/net/intel/idpf/base/idpf_osdep.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_osdep.h
rename to drivers/net/intel/idpf/base/idpf_osdep.h
diff --git a/drivers/common/idpf/base/idpf_prototype.h b/drivers/net/intel/idpf/base/idpf_prototype.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_prototype.h
rename to drivers/net/intel/idpf/base/idpf_prototype.h
diff --git a/drivers/common/idpf/base/idpf_type.h b/drivers/net/intel/idpf/base/idpf_type.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_type.h
rename to drivers/net/intel/idpf/base/idpf_type.h
diff --git a/drivers/common/idpf/base/meson.build b/drivers/net/intel/idpf/base/meson.build
similarity index 100%
rename from drivers/common/idpf/base/meson.build
rename to drivers/net/intel/idpf/base/meson.build
diff --git a/drivers/common/idpf/base/siov_regs.h b/drivers/net/intel/idpf/base/siov_regs.h
similarity index 100%
rename from drivers/common/idpf/base/siov_regs.h
rename to drivers/net/intel/idpf/base/siov_regs.h
diff --git a/drivers/common/idpf/base/virtchnl2.h b/drivers/net/intel/idpf/base/virtchnl2.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2.h
rename to drivers/net/intel/idpf/base/virtchnl2.h
diff --git a/drivers/common/idpf/base/virtchnl2_lan_desc.h b/drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2_lan_desc.h
rename to drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
diff --git a/drivers/common/idpf/idpf_common_device.c b/drivers/net/intel/idpf/idpf_common_device.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.c
rename to drivers/net/intel/idpf/idpf_common_device.c
diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/net/intel/idpf/idpf_common_device.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.h
rename to drivers/net/intel/idpf/idpf_common_device.h
diff --git a/drivers/common/idpf/idpf_common_logs.h b/drivers/net/intel/idpf/idpf_common_logs.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_logs.h
rename to drivers/net/intel/idpf/idpf_common_logs.h
diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/net/intel/idpf/idpf_common_rxtx.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.c
rename to drivers/net/intel/idpf/idpf_common_rxtx.c
diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/net/intel/idpf/idpf_common_rxtx.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.h
rename to drivers/net/intel/idpf/idpf_common_rxtx.h
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx512.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.c b/drivers/net/intel/idpf/idpf_common_virtchnl.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.c
rename to drivers/net/intel/idpf/idpf_common_virtchnl.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.h b/drivers/net/intel/idpf/idpf_common_virtchnl.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.h
rename to drivers/net/intel/idpf/idpf_common_virtchnl.h
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 34cbdc4da0..52405b5b35 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,13 +7,29 @@ if is_windows
     subdir_done()
 endif
 
-deps += ['common_idpf']
+includes += include_directories('../../../common/iavf')
 
 sources = files(
+        'idpf_common_device.c',
+        'idpf_common_rxtx.c',
+        'idpf_common_virtchnl.c',
+
         'idpf_ethdev.c',
         'idpf_rxtx.c',
 )
 
-if arch_subdir == 'x86'and cc_has_avx512
+if arch_subdir == 'x86' and cc_has_avx512
     cflags += ['-DCC_AVX512_SUPPORT']
+    avx512_args = cflags + cc_avx512_flags
+    if cc.has_argument('-march=skylake-avx512')
+        avx512_args += '-march=skylake-avx512'
+    endif
+    idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
+            'idpf_common_rxtx_avx512.c',
+            dependencies: static_rte_mbuf,
+            include_directories: includes,
+            c_args: avx512_args)
+    objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
 endif
+
+subdir('base')
diff --git a/drivers/common/idpf/version.map b/drivers/net/intel/idpf/version.map
similarity index 100%
rename from drivers/common/idpf/version.map
rename to drivers/net/intel/idpf/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 02a3f5a0b6..bcf6f9dc73 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,7 +24,6 @@ drivers = [
         'gve',
         'hinic',
         'hns3',
-        'intel/cpfl',
         'intel/e1000',
         'intel/fm10k',
         'intel/i40e',
@@ -34,6 +33,7 @@ drivers = [
         'intel/igc',
         'intel/ipn3ke',
         'intel/ixgbe',
+        'intel/cpfl',  # depends on idpf, so must come after it
         'ionic',
         'mana',
         'memif',
--
Looks good to me - thanks Bruce!
Acked-by: Praveen Shetty <praveen.shetty@intel.com>

2.43.0
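For reference, the AVX512 section of drivers/net/intel/idpf/meson.build after this patch can be reassembled from the hunks above roughly as follows (context lines around the hunks are approximated, not taken verbatim from the tree):

```meson
# Sketch of the post-patch drivers/net/intel/idpf/meson.build,
# reassembled from the diff hunks in this mail.
if is_windows
    subdir_done()
endif

includes += include_directories('../../../common/iavf')

sources = files(
        'idpf_common_device.c',
        'idpf_common_rxtx.c',
        'idpf_common_virtchnl.c',

        'idpf_ethdev.c',
        'idpf_rxtx.c',
)

if arch_subdir == 'x86' and cc_has_avx512
    cflags += ['-DCC_AVX512_SUPPORT']
    avx512_args = cflags + cc_avx512_flags
    if cc.has_argument('-march=skylake-avx512')
        avx512_args += '-march=skylake-avx512'
    endif
    # Build the AVX512 code as a separate static library so it can be
    # compiled with stricter -march flags than the rest of the driver.
    idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
            'idpf_common_rxtx_avx512.c',
            dependencies: static_rte_mbuf,
            include_directories: includes,
            c_args: avx512_args)
    objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
endif

subdir('base')
```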


^ permalink raw reply	[relevance 2%]

* Re: [PATCH v1 00/42] Merge Intel IGC and E1000 drivers, and update E1000 base code
  @ 2025-02-03  8:18  3% ` David Marchand
  2025-02-04 15:35  0%   ` Burakov, Anatoly
    1 sibling, 1 reply; 200+ results
From: David Marchand @ 2025-02-03  8:18 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Bruce Richardson

Hello Anatoly,

On Fri, Jan 31, 2025 at 1:59 PM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
>
> Intel IGC and E1000 drivers are distinct, but they are actually generated
> from the same base code. This patchset will merge together all e1000-derived
> drivers into one common base, with three different ethdev driver
> frontends (EM, IGB, and IGC).
>
> After the merge is done, base code is also updated to latest snapshot.
>
> Adam Ludkiewicz (1):
>   net/e1000/base: add WoL definitions
>
> Aleksandr Loktionov (1):
>   net/e1000/base: fix mac addr hash bit_shift
>
> Amir Avivi (1):
>   net/e1000/base: fix iterator type
>
> Anatoly Burakov (13):
>   net/e1000/base: add initial support for i225
>   net/e1000/base: add link bringup support for i225
>   net/e1000/base: add LED blink support for i225
>   net/e1000/base: add NVM/EEPROM support for i225
>   net/e1000/base: add LTR support in i225
>   net/e1000/base: add eee support for i225
>   net/e1000/base: add misc definitions for i225
>   net/e1000: merge igc with e1000
>   net/e1000: add missing i225 devices
>   net/e1000: add missing hardware support
>   net/e1000/base: correct minor formatting issues
>   net/e1000/base: correct mPHY access logic
>   net/e1000/base: update readme
>
> Barbara Skobiej (2):
>   net/e1000/base: fix reset for 82580
>   net/e1000/base: fix data type in MAC hash
>
> Carolyn Wyborny (1):
>   net/e1000/base: skip MANC check for 82575
>
> Dima Ruinskiy (4):
>   net/e1000/base: make e1000_access_phy_wakeup_reg_bm non-static
>   net/e1000/base: make debug prints more informative
>   net/e1000/base: hardcode bus parameters for ICH8
>   net/e1000/base: fix unchecked return
>
> Evgeny Efimov (1):
>   net/e1000/base: add EEE common API function
>
> Jakub Buchocki (1):
>   net/e1000/base: fix uninitialized variable usage
>
> Marcin Jurczak (1):
>   net/e1000/base: remove non-inclusive language
>
> Nir Efrati (6):
>   net/e1000/base: workaround for packet loss
>   net/e1000/base: add definition for EXFWSM register
>   net/e1000/base: use longer ULP exit timeout on more HW
>   net/e1000/base: remove redundant access to RO register
>   net/e1000/base: introduce PHY ID retry mechanism
>   net/e1000/base: add PHY read/write retry mechanism
>
> Pawel Malinowski (1):
>   net/e1000/base: fix semaphore timeout value
>
> Piotr Kubaj (1):
>   net/e1000/base: rename NVM version variable
>
> Piotr Pietruszewski (1):
>   net/e1000/base: improve code flow in ICH8LAN
>
> Przemyslaw Ciesielski (1):
>   net/e1000/base: fix static analysis warnings
>
> Sasha Neftin (4):
>   net/e1000/base: add queue select definitions
>   net/e1000/base: add profile information field
>   net/e1000/base: add LPI counters
>   net/e1000/base: improve NVM checksum handling
>
> Vitaly Lifshits (2):
>   net/e1000: add support for more I219 devices
>   net/e1000/base: correct disable k1 logic
>
>  drivers/net/intel/e1000/base/README           |    8 +-
>  .../net/intel/e1000/base/e1000_80003es2lan.c  |   10 +-
>  drivers/net/intel/e1000/base/e1000_82571.c    |    4 +-
>  drivers/net/intel/e1000/base/e1000_82575.c    |   21 +-
>  drivers/net/intel/e1000/base/e1000_82575.h    |   29 -
>  drivers/net/intel/e1000/base/e1000_api.c      |   76 +-
>  drivers/net/intel/e1000/base/e1000_api.h      |    4 +-
>  drivers/net/intel/e1000/base/e1000_base.c     |    3 +-
>  drivers/net/intel/e1000/base/e1000_defines.h  |  259 +-
>  drivers/net/intel/e1000/base/e1000_hw.h       |   86 +-
>  drivers/net/intel/e1000/base/e1000_i210.c     |   14 +-
>  drivers/net/intel/e1000/base/e1000_i210.h     |    4 +
>  drivers/net/intel/e1000/base/e1000_i225.c     | 1384 ++++++
>  drivers/net/intel/e1000/base/e1000_i225.h     |  117 +
>  drivers/net/intel/e1000/base/e1000_ich8lan.c  |  224 +-
>  drivers/net/intel/e1000/base/e1000_ich8lan.h  |    3 +-
>  drivers/net/intel/e1000/base/e1000_mac.c      |   62 +-
>  drivers/net/intel/e1000/base/e1000_mac.h      |    2 +-
>  drivers/net/intel/e1000/base/e1000_nvm.c      |    7 +-
>  drivers/net/intel/e1000/base/e1000_osdep.h    |   33 +-
>  drivers/net/intel/e1000/base/e1000_phy.c      |  447 +-
>  drivers/net/intel/e1000/base/e1000_phy.h      |   21 +
>  drivers/net/intel/e1000/base/e1000_regs.h     |   48 +-
>  drivers/net/intel/e1000/base/e1000_vf.c       |   14 +-
>  drivers/net/intel/e1000/base/meson.build      |    1 +
>  drivers/net/intel/e1000/em_ethdev.c           |   36 +-
>  drivers/net/intel/e1000/igb_ethdev.c          |    1 +
>  drivers/net/intel/{igc => e1000}/igc_ethdev.c |  914 ++--
>  drivers/net/intel/{igc => e1000}/igc_ethdev.h |   32 +-
>  drivers/net/intel/{igc => e1000}/igc_filter.c |   84 +-
>  drivers/net/intel/{igc => e1000}/igc_filter.h |    0
>  drivers/net/intel/{igc => e1000}/igc_flow.c   |    2 +-
>  drivers/net/intel/{igc => e1000}/igc_flow.h   |    0
>  drivers/net/intel/{igc => e1000}/igc_logs.c   |    2 +-
>  drivers/net/intel/{igc => e1000}/igc_txrx.c   |  376 +-
>  drivers/net/intel/{igc => e1000}/igc_txrx.h   |    6 +-
>  drivers/net/intel/e1000/meson.build           |   11 +
>  drivers/net/intel/igc/base/README             |   29 -
>  drivers/net/intel/igc/base/igc_82571.h        |   36 -
>  drivers/net/intel/igc/base/igc_82575.h        |  351 --
>  drivers/net/intel/igc/base/igc_api.c          | 1853 -------
>  drivers/net/intel/igc/base/igc_api.h          |  111 -
>  drivers/net/intel/igc/base/igc_base.c         |  190 -
>  drivers/net/intel/igc/base/igc_base.h         |  127 -
>  drivers/net/intel/igc/base/igc_defines.h      | 1670 -------
>  drivers/net/intel/igc/base/igc_hw.h           | 1059 ----
>  drivers/net/intel/igc/base/igc_i225.c         | 1372 -----
>  drivers/net/intel/igc/base/igc_i225.h         |  110 -
>  drivers/net/intel/igc/base/igc_ich8lan.h      |  296 --
>  drivers/net/intel/igc/base/igc_mac.c          | 2100 --------
>  drivers/net/intel/igc/base/igc_mac.h          |   64 -
>  drivers/net/intel/igc/base/igc_manage.c       |  547 --
>  drivers/net/intel/igc/base/igc_manage.h       |   65 -
>  drivers/net/intel/igc/base/igc_nvm.c          | 1324 -----
>  drivers/net/intel/igc/base/igc_nvm.h          |   69 -
>  drivers/net/intel/igc/base/igc_osdep.c        |   64 -
>  drivers/net/intel/igc/base/igc_osdep.h        |  163 -
>  drivers/net/intel/igc/base/igc_phy.c          | 4420 -----------------
>  drivers/net/intel/igc/base/igc_phy.h          |  337 --
>  drivers/net/intel/igc/base/igc_regs.h         |  732 ---
>  drivers/net/intel/igc/base/meson.build        |   19 -
>  drivers/net/intel/igc/igc_logs.h              |   43 -
>  drivers/net/intel/igc/meson.build             |   21 -
>  drivers/net/meson.build                       |    1 -
>  64 files changed, 3300 insertions(+), 18218 deletions(-)
>  create mode 100644 drivers/net/intel/e1000/base/e1000_i225.c
>  create mode 100644 drivers/net/intel/e1000/base/e1000_i225.h
>  rename drivers/net/intel/{igc => e1000}/igc_ethdev.c (73%)
>  rename drivers/net/intel/{igc => e1000}/igc_ethdev.h (91%)
>  rename drivers/net/intel/{igc => e1000}/igc_filter.c (81%)
>  rename drivers/net/intel/{igc => e1000}/igc_filter.h (100%)
>  rename drivers/net/intel/{igc => e1000}/igc_flow.c (99%)
>  rename drivers/net/intel/{igc => e1000}/igc_flow.h (100%)
>  rename drivers/net/intel/{igc => e1000}/igc_logs.c (90%)
>  rename drivers/net/intel/{igc => e1000}/igc_txrx.c (87%)
>  rename drivers/net/intel/{igc => e1000}/igc_txrx.h (97%)
>  delete mode 100644 drivers/net/intel/igc/base/README
>  delete mode 100644 drivers/net/intel/igc/base/igc_82571.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_82575.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_api.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_api.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_base.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_base.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_defines.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_hw.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_i225.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_i225.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_ich8lan.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_mac.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_mac.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_manage.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_manage.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_nvm.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_nvm.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_osdep.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_osdep.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_phy.c
>  delete mode 100644 drivers/net/intel/igc/base/igc_phy.h
>  delete mode 100644 drivers/net/intel/igc/base/igc_regs.h
>  delete mode 100644 drivers/net/intel/igc/base/meson.build
>  delete mode 100644 drivers/net/intel/igc/igc_logs.h
>  delete mode 100644 drivers/net/intel/igc/meson.build

Consolidation is a good thing, but there are two small issues with this
series:
- the ABI check (as it tracks all .so) reports that librte_net_igc.so
disappeared: this will need some waiving, like Bruce did in his
series: https://patchwork.dpdk.org/project/dpdk/patch/20250130151222.944561-2-bruce.richardson@intel.com/
- with this merge, "users" can't select net/igc compilation anymore
and need to be aware that igc support now requires enabling net/e1000;
please update the release notes to make this visible,

There is also a strange build failure for mingw (see github test report).


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-30 16:48  0%                             ` Jerin Jacob
@ 2025-02-03  4:37  0%                               ` Naga Harish K, S V
  2025-02-04  7:15  0%                                 ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Naga Harish K, S V @ 2025-02-03  4:37 UTC (permalink / raw)
  To: Jerin Jacob, Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Liang Ma, Mccarthy, Peter,
	Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Jerin Jacob <jerinj@marvell.com>
> Sent: Thursday, January 30, 2025 10:18 PM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith Thotton
> <sthotton@marvell.com>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>; Van
> Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>
> Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx
> queues
> 
> 
> > -----Original Message-----
> > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > Sent: Thursday, January 30, 2025 9:01 PM
> > To: Jerin Jacob <jerinj@marvell.com>; Shijith Thotton
> > <sthotton@marvell.com>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> > Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > <amitprakashs@marvell.com>; Burakov, Anatoly
> > <anatoly.burakov@intel.com>
> > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to configure
> > multiple Rx queues
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinj@marvell.com>
> > > Sent: Wednesday, January 29, 2025 1:13 PM
> > > To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith
> > > Thotton <sthotton@marvell.com>; dev@dpdk.org
> > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > > <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> > > Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > <anatoly.burakov@intel.com>
> > > Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple
> > > Rx queues
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > > > Sent: Wednesday, January 29, 2025 10:35 AM
> > > > To: Shijith Thotton <sthotton@marvell.com>; dev@dpdk.org
> > > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> > > > Pathak, Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Jerin Jacob
> > > > <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy,
> > > > Peter <peter.mccarthy@intel.com>; Van Haaren, Harry
> > > > <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > > <anatoly.burakov@intel.com>
> > > > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to
> > > > configure multiple Rx queues
> > > > > >
> > > > > >This requires a change to the
> > > > > >rte_event_eth_rx_adapter_queue_add()
> > > > > >stable API parameters.
> > > > > >This is an ABI breakage and may not be possible now.
> > > > > >It requires changes to many current applications that are using
> > > > > >the
> > > > > >rte_event_eth_rx_adapter_queue_add() stable API.
> > > > > >
> > > > >
> > > > > What I meant by mapping was to retain the stable API parameters
> > > > > as they
> > > are.
> > > > > Internally, the API can use the proposed eventdev PMD operation
> > > > > (eth_rx_adapter_queues_add) without causing an ABI break, as
> > > > > shown
> > > below.
> > > > >
> > > > > int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t
> eth_dev_id,
> > > > >                 int32_t rx_queue_id,
> > > > >                 const struct rte_event_eth_rx_adapter_queue_conf *conf) {
> > > > >         if (rx_queue_id == -1)
> > > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > > >                         dev, &rte_eth_devices[eth_dev_id], 0,
> > > > >                         conf, 0);
> > > > >         else
> > > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > > >                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
> > > > >                         conf, 1); }
> > > > >
> > > > > With above change, old op (eth_rx_adapter_queue_add) can be
> > > > > removed as both API (stable and proposed) will be using
> > > eth_rx_adapter_queues_add.
> > >
> > >
> > > Since this thread is not converging and looks like it is due to confusion.
> > > I am trying to summarize my understanding to define the next
> > > steps(like if needed, we need to reach tech board if there are no
> > > consensus)
> > >
> > >
> > > Problem statement:
> > > ==================
> > > 1) Implementation of rte_event_eth_rx_adapter_queue_add() in HW
> > > typically uses an administrative function to enable it. Typically,
> > > it translated to sending a mailbox to PF driver etc.
> > > So, this function takes "time" to complete in HW implementations.
> > > 2) For SW implementations, this won't take time as there is no other
> > > actors involved.
> > > 3) There are customer use cases, they add 300+
> > > rte_event_eth_rx_adapter_queue_add() on application bootup, that is
> > > introducing significant boot time for the application.
> > > Number of queues are function of number of ethdev ports, number  of
> > > ethdev Rx queues per port and number of event queues.
> > >
> > >
> > > Expected outcome of problem statement:
> > > ======================================
> > > 1) The cases where application knows queue mapping(typically at boot
> > > time case), application can call burst variant of
> > > rte_event_eth_rx_adapter_queue_add()
> > > function
> > > to amortize the cost. Similar scheme used DPDK in control path API
> > > where latency is critical, like rte_acl_add_rules() or rte_flow via
> > > template scheme.
> > > 2) Solution should not break ABI or any impact to SW drivers.
> > > 3) Avoid duplicating the code as much as possible
> > >
> > >
> > > Proposed solution:
> > > ==================
> > > 1) Update eventdev_eth_rx_adapter_queue_add_t() PMD (Internal ABI)
> > > API to take burst parameters
> > > 2) Add new rte_event_eth_rx_adapter_queue*s*_add() function and wire
> > > to use updated PMD API
> > > 3) Use rte_event_eth_rx_adapter_queue_add() as
> > > rte_event_eth_rx_adapter_queue*s*_add(...., 1)
> > >
> > > If so, I am not sure what is the cons of this approach, it will let
> > > to have optimized applications when
> > > a) Application knows the queue mapping at priorly (typically in boot
> > > time)
> > > b) Allow HW drivers to optimize without breaking anything for SW
> > > drivers
> > > c) Provide applications to decide burst vs non burst selection based
> > > on the needed and performance requirements
> >
> > The proposed API benefits only some hardware platforms that have
> > optimized the "queue_add" eventdev PMD implementation for burst mode.
> > It may not benefit SW drivers/other HW platforms.
> 
> The spirit is to have ONE API for all drivers (SW or HW). If one driver is not able
> to leverage the feature, that is OK as long as it is NOT breaking anything. We have been
> accommodating a ton of capabilities (like
> RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
> and SW-driver-specific public APIs (like
> rte_event_eth_rx_adapter_service_id_get()) to have a common API. As long as
> they do not break each other and the application has clarity on the usage (when to
> use the API), I don't see any issue. Do you see any issue with that forward
> progress approach?
> 

This approach is fine, as long as it does not break the other platforms.

> 
> > There will not be much difference in calling the existing API
> > (rte_event_eth_rx_adapter_queue_add()) in a loop vs using the new API
> > for the above cases.
> 
> That is just AN implementation view, right? I have explained in the problem
> statement that this is not the case for some drivers. (Even a SW driver can leverage
> such a burst function using SIMD etc., if it wants to.)
> 

Not just from the implementation point of view, but from a latency standpoint as well.
Anyway, I am fine with the new API approach.

> >
> > If the new proposed API benefits all platforms, then it is useful.
> 
> See above.
> 
> 
> > This is the point I am making from the beginning, it is not captured
> > in the summary.
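The mapping discussed earlier in this thread (keeping the stable single-queue API as a thin wrapper over a burst PMD operation) can be sketched in isolation roughly as follows. All type and function names here are simplified stand-ins for illustration only, not the real DPDK eventdev internals:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_event_eth_rx_adapter_queue_conf. */
struct queue_conf {
	int servicing_weight;
};

/* Hypothetical burst PMD op: a NULL rx_queue_id array together with
 * nb_rx_queues == 0 means "add all Rx queues of the port". A HW
 * driver can issue a single mailbox/admin command for the whole
 * batch here, amortizing the per-queue setup latency. */
static int last_nb_queues;

static int
pmd_queues_add(const int32_t *rx_queue_id, const struct queue_conf *conf,
		uint16_t nb_rx_queues)
{
	(void)conf;
	last_nb_queues = (rx_queue_id == NULL) ? 0 : nb_rx_queues;
	return 0;
}

/* The stable single-queue API keeps its signature and simply forwards
 * to the burst op, so no ABI break is needed. */
static int
queue_add(int32_t rx_queue_id, const struct queue_conf *conf)
{
	if (rx_queue_id == -1)
		return pmd_queues_add(NULL, conf, 0);	/* all queues */
	return pmd_queues_add(&rx_queue_id, conf, 1);	/* one queue */
}
```

A burst variant (the proposed rte_event_eth_rx_adapter_queue*s*_add()) would then call the same PMD op directly with the caller's queue array, which is where the boot-time gain for the 300+ queue_add use case comes from.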

^ permalink raw reply	[relevance 0%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-30 15:30  0%                           ` Naga Harish K, S V
@ 2025-01-30 16:48  0%                             ` Jerin Jacob
  2025-02-03  4:37  0%                               ` Naga Harish K, S V
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2025-01-30 16:48 UTC (permalink / raw)
  To: Naga Harish K, S V, Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Liang Ma, Mccarthy, Peter,
	Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly


> -----Original Message-----
> From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> Sent: Thursday, January 30, 2025 9:01 PM
> To: Jerin Jacob <jerinj@marvell.com>; Shijith Thotton <sthotton@marvell.com>;
> dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak, Pravin
> <pravin.pathak@intel.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Sachin Saxena <sachin.saxena@nxp.com>; Mattias Rönnblom
> <mattias.ronnblom@ericsson.com>; Liang Ma <liangma@liangbit.com>;
> Mccarthy, Peter <peter.mccarthy@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to configure
> multiple Rx queues
> 
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob <jerinj@marvell.com>
> > Sent: Wednesday, January 29, 2025 1:13 PM
> > To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith Thotton
> > <sthotton@marvell.com>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> > Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > <amitprakashs@marvell.com>; Burakov, Anatoly
> > <anatoly.burakov@intel.com>
> > Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple
> > Rx queues
> >
> >
> >
> > > -----Original Message-----
> > > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > > Sent: Wednesday, January 29, 2025 10:35 AM
> > > To: Shijith Thotton <sthotton@marvell.com>; dev@dpdk.org
> > > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> > > <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy,
> > > Peter <peter.mccarthy@intel.com>; Van Haaren, Harry
> > > <harry.van.haaren@intel.com>; Carrillo, Erik G
> > > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > > <amitprakashs@marvell.com>; Burakov, Anatoly
> > > <anatoly.burakov@intel.com>
> > > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to
> > > configure multiple Rx queues
> > > > >
> > > > >This requires a change to the
> > > > >rte_event_eth_rx_adapter_queue_add()
> > > > >stable API parameters.
> > > > >This is an ABI breakage and may not be possible now.
> > > > >It requires changes to many current applications that are using
> > > > >the
> > > > >rte_event_eth_rx_adapter_queue_add() stable API.
> > > > >
> > > >
> > > > What I meant by mapping was to retain the stable API parameters as
> > > > they
> > are.
> > > > Internally, the API can use the proposed eventdev PMD operation
> > > > (eth_rx_adapter_queues_add) without causing an ABI break, as shown
> > below.
> > > >
> > > > int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
> > > >                 int32_t rx_queue_id,
> > > >                 const struct rte_event_eth_rx_adapter_queue_conf *conf) {
> > > >         if (rx_queue_id == -1)
> > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > >                         dev, &rte_eth_devices[eth_dev_id], 0,
> > > >                         conf, 0);
> > > >         else
> > > >                 (*dev->dev_ops->eth_rx_adapter_queues_add)(
> > > >                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
> > > >                         conf, 1);
> > > > }
> > > >
> > > > With the above change, the old op (eth_rx_adapter_queue_add) can be
> > > > removed as both APIs (stable and proposed) will be using
> > > > eth_rx_adapter_queues_add.
> >
> >
> > Since this thread is not converging, and it looks like that is due to
> > confusion, I am trying to summarize my understanding to define the next
> > steps (if needed, we can take this to the tech board if there is no
> > consensus).
> >
> >
> > Problem statement:
> > ==================
> > 1) Implementation of rte_event_eth_rx_adapter_queue_add() in HW
> > typically uses an administrative function to enable it. Typically, it
> > translates to sending a mailbox to the PF driver etc.
> > So, this function takes "time" to complete in HW implementations.
> > 2) For SW implementations, this won't take time as there are no other
> > actors involved.
> > 3) There are customer use cases where applications issue 300+
> > rte_event_eth_rx_adapter_queue_add() calls on application bootup, which
> > introduces significant boot time for the application.
> > The number of queues is a function of the number of ethdev ports, the
> > number of ethdev Rx queues per port and the number of event queues.
> >
> >
> > Expected outcome of problem statement:
> > ======================================
> > 1) In the cases where the application knows the queue mapping
> > (typically at boot time), it can call a burst variant of the
> > rte_event_eth_rx_adapter_queue_add() function to amortize the cost.
> > A similar scheme is used in DPDK control-path APIs where latency is
> > critical, like rte_acl_add_rules() or rte_flow via the template scheme.
> > 2) The solution should not break ABI or have any impact on SW drivers.
> > 3) Avoid duplicating code as much as possible.
> >
> >
> > Proposed solution:
> > ==================
> > 1) Update eventdev_eth_rx_adapter_queue_add_t() PMD (Internal ABI) API
> > to take burst parameters
> > 2) Add new rte_event_eth_rx_adapter_queue*s*_add() function and wire
> > to use updated PMD API
> > 3) Use rte_event_eth_rx_adapter_queue_add() as
> > rte_event_eth_rx_adapter_queue*s*_add(...., 1)
> >
> > If so, I am not sure what the cons of this approach are. It will allow
> > optimized applications when:
> > a) the application knows the queue mapping a priori (typically at boot
> > time)
> > b) HW drivers can optimize without breaking anything for SW drivers
> > c) applications can decide between burst and non-burst usage based on
> > their needs and performance requirements
> 
> The proposed API benefits only some hardware platforms that have optimized
> the "queue_add" eventdev PMD implementation for burst mode.
> It may not benefit SW drivers/other HW platforms.

The spirit is to have ONE API for all drivers (SW or HW). If one driver is not able to
leverage the feature, that is OK as long as it is NOT breaking anything. We have been
accommodating a ton of capabilities (like RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
and SW-driver-specific public APIs (like rte_event_eth_rx_adapter_service_id_get()) to
have a common API. As long as they do not break each other and the application has
clarity on the usage (when to use which API), I don't see any issue. Do you see any
issue with that way of making forward progress?


> There will not be much difference in calling the existing API
> (rte_event_eth_rx_adapter_queue_add()) in a loop vs using the new API for the
> above cases.

That is just one implementation's view, right? I have explained in the problem
statement why that is not the case for some drivers. (Even a SW driver can
leverage such a burst function using SIMD etc., if it wants to.)

> 
> If the new proposed API benefits all platforms, then it is useful.

See above.


> This is the point I have been making from the beginning; it is not captured
> in the summary.

^ permalink raw reply	[relevance 0%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-29  7:43  4%                         ` Jerin Jacob
@ 2025-01-30 15:30  0%                           ` Naga Harish K, S V
  2025-01-30 16:48  0%                             ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Naga Harish K, S V @ 2025-01-30 15:30 UTC (permalink / raw)
  To: Jerin Jacob, Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Liang Ma, Mccarthy, Peter,
	Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Jerin Jacob <jerinj@marvell.com>
> Sent: Wednesday, January 29, 2025 1:13 PM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; Shijith Thotton
> <sthotton@marvell.com>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Liang Ma
> <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>; Van
> Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>
> Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx
> queues
> 
> 
> 
> > -----Original Message-----
> > From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> > Sent: Wednesday, January 29, 2025 10:35 AM
> > To: Shijith Thotton <sthotton@marvell.com>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> > Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> > Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Jerin Jacob
> > <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy, Peter
> > <peter.mccarthy@intel.com>; Van Haaren, Harry
> > <harry.van.haaren@intel.com>; Carrillo, Erik G
> > <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> > <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> > <amitprakashs@marvell.com>; Burakov, Anatoly
> > <anatoly.burakov@intel.com>
> > Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to configure
> > multiple Rx queues
> > > >
> > > >This requires a change to the rte_event_eth_rx_adapter_queue_add()
> > > >stable API parameters.
> > > >This is an ABI breakage and may not be possible now.
> > > >It requires changes to many current applications that are using the
> > > >rte_event_eth_rx_adapter_queue_add() stable API.
> > > >
> > >
> > > What I meant by mapping was to retain the stable API parameters as they
> > > are.
> > > Internally, the API can use the proposed eventdev PMD operation
> > > (eth_rx_adapter_queues_add) without causing an ABI break, as shown
> > > below.
> > >
> > > int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
> > >                 int32_t rx_queue_id,
> > >                 const struct rte_event_eth_rx_adapter_queue_conf *conf) {
> > >         if (rx_queue_id == -1)
> > >                 return dev->dev_ops->eth_rx_adapter_queues_add(
> > >                         dev, &rte_eth_devices[eth_dev_id], 0,
> > >                         conf, 0);
> > >         else
> > >                 return dev->dev_ops->eth_rx_adapter_queues_add(
> > >                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
> > >                         conf, 1);
> > > }
> > >
> > > With the above change, the old op (eth_rx_adapter_queue_add) can be
> > > removed as both APIs (stable and proposed) will be using
> > > eth_rx_adapter_queues_add.
> 
> 
> Since this thread is not converging, and it looks like that is due to
> confusion, I am trying to summarize my understanding to define the next
> steps (if needed, we can take this to the tech board if there is no
> consensus).
> 
> 
> Problem statement:
> ==================
> 1) Implementation of rte_event_eth_rx_adapter_queue_add() in HW typically
> uses an administrative function to enable it. Typically, it translates to
> sending a mailbox to the PF driver etc.
> So, this function takes "time" to complete in HW implementations.
> 2) For SW implementations, this won't take time as there are no other
> actors involved.
> 3) There are customer use cases where applications issue 300+
> rte_event_eth_rx_adapter_queue_add() calls on application bootup, which
> introduces significant boot time for the application.
> The number of queues is a function of the number of ethdev ports, the
> number of ethdev Rx queues per port and the number of event queues.
> 
> 
> Expected outcome of problem statement:
> ======================================
> 1) In the cases where the application knows the queue mapping (typically at
> boot time), it can call a burst variant of the
> rte_event_eth_rx_adapter_queue_add() function to amortize the cost.
> A similar scheme is used in DPDK control-path APIs where latency is
> critical, like rte_acl_add_rules() or rte_flow via the template scheme.
> 2) The solution should not break ABI or have any impact on SW drivers.
> 3) Avoid duplicating code as much as possible.
> 
> 
> Proposed solution:
> ==================
> 1) Update eventdev_eth_rx_adapter_queue_add_t() PMD (Internal ABI) API
> to take burst parameters
> 2) Add new rte_event_eth_rx_adapter_queue*s*_add() function and wire to
> use updated PMD API
> 3) Use rte_event_eth_rx_adapter_queue_add() as
> rte_event_eth_rx_adapter_queue*s*_add(...., 1)
> 
> If so, I am not sure what the cons of this approach are. It will allow
> optimized applications when:
> a) the application knows the queue mapping a priori (typically at boot time)
> b) HW drivers can optimize without breaking anything for SW drivers
> c) applications can decide between burst and non-burst usage based on their
> needs and performance requirements

The proposed API benefits only some hardware platforms that have optimized the "queue_add" eventdev PMD implementation for burst mode.
It may not benefit SW drivers/other HW platforms.
There will not be much difference in calling the existing API (rte_event_eth_rx_adapter_queue_add()) in a loop vs using the new API for the above cases.

If the new proposed API benefits all platforms, then it is useful.
This is the point I have been making from the beginning; it is not captured in the summary.

^ permalink raw reply	[relevance 0%]

* [PATCH v3 1/4] drivers: merge common and net idpf drivers
  @ 2025-01-30 15:12  2%   ` Bruce Richardson
  2025-02-03  8:36  2%     ` Shetty, Praveen
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2025-01-30 15:12 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, Thomas Monjalon, Jingjing Wu, Praveen Shetty,
	Konstantin Ananyev

Rather than having some of the idpf code split out into the "common"
directory, used by both a net/idpf and a net/cpfl driver, we can
merge all idpf code together under net/idpf and have the cpfl driver
depend on "net/idpf" rather than "common/idpf".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 devtools/libabigail.abignore                  |  1 +
 doc/guides/rel_notes/release_25_03.rst        |  6 ++++
 drivers/common/idpf/meson.build               | 34 -------------------
 drivers/common/meson.build                    |  1 -
 drivers/net/intel/cpfl/meson.build            |  2 +-
 .../{common => net/intel}/idpf/base/README    |  0
 .../intel}/idpf/base/idpf_alloc.h             |  0
 .../intel}/idpf/base/idpf_controlq.c          |  0
 .../intel}/idpf/base/idpf_controlq.h          |  0
 .../intel}/idpf/base/idpf_controlq_api.h      |  0
 .../intel}/idpf/base/idpf_controlq_setup.c    |  0
 .../intel}/idpf/base/idpf_devids.h            |  0
 .../intel}/idpf/base/idpf_lan_pf_regs.h       |  0
 .../intel}/idpf/base/idpf_lan_txrx.h          |  0
 .../intel}/idpf/base/idpf_lan_vf_regs.h       |  0
 .../intel}/idpf/base/idpf_osdep.h             |  0
 .../intel}/idpf/base/idpf_prototype.h         |  0
 .../intel}/idpf/base/idpf_type.h              |  0
 .../intel}/idpf/base/meson.build              |  0
 .../intel}/idpf/base/siov_regs.h              |  0
 .../intel}/idpf/base/virtchnl2.h              |  0
 .../intel}/idpf/base/virtchnl2_lan_desc.h     |  0
 .../intel}/idpf/idpf_common_device.c          |  0
 .../intel}/idpf/idpf_common_device.h          |  0
 .../intel}/idpf/idpf_common_logs.h            |  0
 .../intel}/idpf/idpf_common_rxtx.c            |  0
 .../intel}/idpf/idpf_common_rxtx.h            |  0
 .../intel}/idpf/idpf_common_rxtx_avx512.c     |  0
 .../intel}/idpf/idpf_common_virtchnl.c        |  0
 .../intel}/idpf/idpf_common_virtchnl.h        |  0
 drivers/net/intel/idpf/meson.build            | 20 +++++++++--
 .../{common => net/intel}/idpf/version.map    |  0
 drivers/net/meson.build                       |  2 +-
 33 files changed, 27 insertions(+), 39 deletions(-)
 delete mode 100644 drivers/common/idpf/meson.build
 rename drivers/{common => net/intel}/idpf/base/README (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_alloc.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_api.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_setup.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_devids.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_pf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_txrx.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_vf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_osdep.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_prototype.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_type.h (100%)
 rename drivers/{common => net/intel}/idpf/base/meson.build (100%)
 rename drivers/{common => net/intel}/idpf/base/siov_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2_lan_desc.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_logs.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx512.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.h (100%)
 rename drivers/{common => net/intel}/idpf/version.map (100%)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..1dee6a954f 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -25,6 +25,7 @@
 ;
 ; SKIP_LIBRARY=librte_common_mlx5_glue
 ; SKIP_LIBRARY=librte_net_mlx4_glue
+; SKIP_LIBRARY=librte_common_idpf
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Experimental APIs exceptions ;
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index a88b04d958..79b1116f6e 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -115,6 +115,12 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
+  This change should have no impact to end applications, but,
+  when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-Denable_drivers`` option,
+  there is no longer any need to also specify the ``common/idpf`` driver.
+  Note, however, ``net/intel/cpfl`` driver now depends upon the ``net/intel/idpf`` driver.
+
 
 ABI Changes
 -----------
diff --git a/drivers/common/idpf/meson.build b/drivers/common/idpf/meson.build
deleted file mode 100644
index 46fd45c03b..0000000000
--- a/drivers/common/idpf/meson.build
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2022 Intel Corporation
-
-if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
-    subdir_done()
-endif
-
-includes += include_directories('../iavf')
-
-deps += ['mbuf']
-
-sources = files(
-        'idpf_common_device.c',
-        'idpf_common_rxtx.c',
-        'idpf_common_virtchnl.c',
-)
-
-if arch_subdir == 'x86'
-    if cc_has_avx512
-        cflags += ['-DCC_AVX512_SUPPORT']
-        avx512_args = cflags + cc_avx512_flags
-        if cc.has_argument('-march=skylake-avx512')
-            avx512_args += '-march=skylake-avx512'
-        endif
-        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
-                'idpf_common_rxtx_avx512.c',
-                dependencies: [static_rte_mbuf,],
-                include_directories: includes,
-                c_args: avx512_args)
-        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
-    endif
-endif
-
-subdir('base')
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 8734af36aa..e1e3149d8f 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -6,7 +6,6 @@ drivers = [
         'cpt',
         'dpaax',
         'iavf',
-        'idpf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/net/intel/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
index 87fcfe0bb1..1f0269d50b 100644
--- a/drivers/net/intel/cpfl/meson.build
+++ b/drivers/net/intel/cpfl/meson.build
@@ -11,7 +11,7 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-deps += ['hash', 'common_idpf']
+deps += ['hash', 'net_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
diff --git a/drivers/common/idpf/base/README b/drivers/net/intel/idpf/base/README
similarity index 100%
rename from drivers/common/idpf/base/README
rename to drivers/net/intel/idpf/base/README
diff --git a/drivers/common/idpf/base/idpf_alloc.h b/drivers/net/intel/idpf/base/idpf_alloc.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_alloc.h
rename to drivers/net/intel/idpf/base/idpf_alloc.h
diff --git a/drivers/common/idpf/base/idpf_controlq.c b/drivers/net/intel/idpf/base/idpf_controlq.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.c
rename to drivers/net/intel/idpf/base/idpf_controlq.c
diff --git a/drivers/common/idpf/base/idpf_controlq.h b/drivers/net/intel/idpf/base/idpf_controlq.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.h
rename to drivers/net/intel/idpf/base/idpf_controlq.h
diff --git a/drivers/common/idpf/base/idpf_controlq_api.h b/drivers/net/intel/idpf/base/idpf_controlq_api.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_api.h
rename to drivers/net/intel/idpf/base/idpf_controlq_api.h
diff --git a/drivers/common/idpf/base/idpf_controlq_setup.c b/drivers/net/intel/idpf/base/idpf_controlq_setup.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_setup.c
rename to drivers/net/intel/idpf/base/idpf_controlq_setup.c
diff --git a/drivers/common/idpf/base/idpf_devids.h b/drivers/net/intel/idpf/base/idpf_devids.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_devids.h
rename to drivers/net/intel/idpf/base/idpf_devids.h
diff --git a/drivers/common/idpf/base/idpf_lan_pf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_pf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
diff --git a/drivers/common/idpf/base/idpf_lan_txrx.h b/drivers/net/intel/idpf/base/idpf_lan_txrx.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_txrx.h
rename to drivers/net/intel/idpf/base/idpf_lan_txrx.h
diff --git a/drivers/common/idpf/base/idpf_lan_vf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_vf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/net/intel/idpf/base/idpf_osdep.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_osdep.h
rename to drivers/net/intel/idpf/base/idpf_osdep.h
diff --git a/drivers/common/idpf/base/idpf_prototype.h b/drivers/net/intel/idpf/base/idpf_prototype.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_prototype.h
rename to drivers/net/intel/idpf/base/idpf_prototype.h
diff --git a/drivers/common/idpf/base/idpf_type.h b/drivers/net/intel/idpf/base/idpf_type.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_type.h
rename to drivers/net/intel/idpf/base/idpf_type.h
diff --git a/drivers/common/idpf/base/meson.build b/drivers/net/intel/idpf/base/meson.build
similarity index 100%
rename from drivers/common/idpf/base/meson.build
rename to drivers/net/intel/idpf/base/meson.build
diff --git a/drivers/common/idpf/base/siov_regs.h b/drivers/net/intel/idpf/base/siov_regs.h
similarity index 100%
rename from drivers/common/idpf/base/siov_regs.h
rename to drivers/net/intel/idpf/base/siov_regs.h
diff --git a/drivers/common/idpf/base/virtchnl2.h b/drivers/net/intel/idpf/base/virtchnl2.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2.h
rename to drivers/net/intel/idpf/base/virtchnl2.h
diff --git a/drivers/common/idpf/base/virtchnl2_lan_desc.h b/drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2_lan_desc.h
rename to drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
diff --git a/drivers/common/idpf/idpf_common_device.c b/drivers/net/intel/idpf/idpf_common_device.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.c
rename to drivers/net/intel/idpf/idpf_common_device.c
diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/net/intel/idpf/idpf_common_device.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.h
rename to drivers/net/intel/idpf/idpf_common_device.h
diff --git a/drivers/common/idpf/idpf_common_logs.h b/drivers/net/intel/idpf/idpf_common_logs.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_logs.h
rename to drivers/net/intel/idpf/idpf_common_logs.h
diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/net/intel/idpf/idpf_common_rxtx.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.c
rename to drivers/net/intel/idpf/idpf_common_rxtx.c
diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/net/intel/idpf/idpf_common_rxtx.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.h
rename to drivers/net/intel/idpf/idpf_common_rxtx.h
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx512.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.c b/drivers/net/intel/idpf/idpf_common_virtchnl.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.c
rename to drivers/net/intel/idpf/idpf_common_virtchnl.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.h b/drivers/net/intel/idpf/idpf_common_virtchnl.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.h
rename to drivers/net/intel/idpf/idpf_common_virtchnl.h
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 34cbdc4da0..52405b5b35 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,13 +7,29 @@ if is_windows
     subdir_done()
 endif
 
-deps += ['common_idpf']
+includes += include_directories('../../../common/iavf')
 
 sources = files(
+        'idpf_common_device.c',
+        'idpf_common_rxtx.c',
+        'idpf_common_virtchnl.c',
+
         'idpf_ethdev.c',
         'idpf_rxtx.c',
 )
 
-if arch_subdir == 'x86'and cc_has_avx512
+if arch_subdir == 'x86' and cc_has_avx512
     cflags += ['-DCC_AVX512_SUPPORT']
+    avx512_args = cflags + cc_avx512_flags
+    if cc.has_argument('-march=skylake-avx512')
+        avx512_args += '-march=skylake-avx512'
+    endif
+    idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
+            'idpf_common_rxtx_avx512.c',
+            dependencies: static_rte_mbuf,
+            include_directories: includes,
+            c_args: avx512_args)
+    objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
 endif
+
+subdir('base')
diff --git a/drivers/common/idpf/version.map b/drivers/net/intel/idpf/version.map
similarity index 100%
rename from drivers/common/idpf/version.map
rename to drivers/net/intel/idpf/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 02a3f5a0b6..bcf6f9dc73 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,7 +24,6 @@ drivers = [
         'gve',
         'hinic',
         'hns3',
-        'intel/cpfl',
         'intel/e1000',
         'intel/fm10k',
         'intel/i40e',
@@ -34,6 +33,7 @@ drivers = [
         'intel/igc',
         'intel/ipn3ke',
         'intel/ixgbe',
+        'intel/cpfl',  # depends on idpf, so must come after it
         'ionic',
         'mana',
         'memif',
-- 
2.43.0


^ permalink raw reply	[relevance 2%]

* [PATCH v2 1/4] drivers: merge common and net idpf drivers
  @ 2025-01-30 12:48  2%   ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2025-01-30 12:48 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, Jingjing Wu, Praveen Shetty, Konstantin Ananyev

Rather than having some of the idpf code split out into the "common"
directory, used by both a net/idpf and a net/cpfl driver, we can
merge all idpf code together under net/idpf and have the cpfl driver
depend on "net/idpf" rather than "common/idpf".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/rel_notes/release_25_03.rst        |  6 ++++
 drivers/common/idpf/meson.build               | 34 -------------------
 drivers/common/meson.build                    |  1 -
 drivers/net/intel/cpfl/meson.build            |  2 +-
 .../{common => net/intel}/idpf/base/README    |  0
 .../intel}/idpf/base/idpf_alloc.h             |  0
 .../intel}/idpf/base/idpf_controlq.c          |  0
 .../intel}/idpf/base/idpf_controlq.h          |  0
 .../intel}/idpf/base/idpf_controlq_api.h      |  0
 .../intel}/idpf/base/idpf_controlq_setup.c    |  0
 .../intel}/idpf/base/idpf_devids.h            |  0
 .../intel}/idpf/base/idpf_lan_pf_regs.h       |  0
 .../intel}/idpf/base/idpf_lan_txrx.h          |  0
 .../intel}/idpf/base/idpf_lan_vf_regs.h       |  0
 .../intel}/idpf/base/idpf_osdep.h             |  0
 .../intel}/idpf/base/idpf_prototype.h         |  0
 .../intel}/idpf/base/idpf_type.h              |  0
 .../intel}/idpf/base/meson.build              |  0
 .../intel}/idpf/base/siov_regs.h              |  0
 .../intel}/idpf/base/virtchnl2.h              |  0
 .../intel}/idpf/base/virtchnl2_lan_desc.h     |  0
 .../intel}/idpf/idpf_common_device.c          |  0
 .../intel}/idpf/idpf_common_device.h          |  0
 .../intel}/idpf/idpf_common_logs.h            |  0
 .../intel}/idpf/idpf_common_rxtx.c            |  0
 .../intel}/idpf/idpf_common_rxtx.h            |  0
 .../intel}/idpf/idpf_common_rxtx_avx512.c     |  0
 .../intel}/idpf/idpf_common_virtchnl.c        |  0
 .../intel}/idpf/idpf_common_virtchnl.h        |  0
 drivers/net/intel/idpf/meson.build            | 20 +++++++++--
 .../{common => net/intel}/idpf/version.map    |  0
 drivers/net/meson.build                       |  2 +-
 32 files changed, 26 insertions(+), 39 deletions(-)
 delete mode 100644 drivers/common/idpf/meson.build
 rename drivers/{common => net/intel}/idpf/base/README (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_alloc.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_api.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_controlq_setup.c (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_devids.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_pf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_txrx.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_lan_vf_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_osdep.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_prototype.h (100%)
 rename drivers/{common => net/intel}/idpf/base/idpf_type.h (100%)
 rename drivers/{common => net/intel}/idpf/base/meson.build (100%)
 rename drivers/{common => net/intel}/idpf/base/siov_regs.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2.h (100%)
 rename drivers/{common => net/intel}/idpf/base/virtchnl2_lan_desc.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_device.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_logs.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx.h (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_rxtx_avx512.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.c (100%)
 rename drivers/{common => net/intel}/idpf/idpf_common_virtchnl.h (100%)
 rename drivers/{common => net/intel}/idpf/version.map (100%)

diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index a88b04d958..79b1116f6e 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -115,6 +115,12 @@ API Changes
   but to enable/disable these drivers via Meson option requires use of the new paths.
   For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
 
+* The driver ``common/idpf`` has been merged into the ``net/intel/idpf`` driver.
+  This change should have no impact to end applications, but,
+  when specifying the ``idpf`` or ``cpfl`` net drivers to meson via ``-Denable_drivers`` option,
+  there is no longer any need to also specify the ``common/idpf`` driver.
+  Note, however, ``net/intel/cpfl`` driver now depends upon the ``net/intel/idpf`` driver.
+
 
 ABI Changes
 -----------
diff --git a/drivers/common/idpf/meson.build b/drivers/common/idpf/meson.build
deleted file mode 100644
index 46fd45c03b..0000000000
--- a/drivers/common/idpf/meson.build
+++ /dev/null
@@ -1,34 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2022 Intel Corporation
-
-if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
-    subdir_done()
-endif
-
-includes += include_directories('../iavf')
-
-deps += ['mbuf']
-
-sources = files(
-        'idpf_common_device.c',
-        'idpf_common_rxtx.c',
-        'idpf_common_virtchnl.c',
-)
-
-if arch_subdir == 'x86'
-    if cc_has_avx512
-        cflags += ['-DCC_AVX512_SUPPORT']
-        avx512_args = cflags + cc_avx512_flags
-        if cc.has_argument('-march=skylake-avx512')
-            avx512_args += '-march=skylake-avx512'
-        endif
-        idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
-                'idpf_common_rxtx_avx512.c',
-                dependencies: [static_rte_mbuf,],
-                include_directories: includes,
-                c_args: avx512_args)
-        objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
-    endif
-endif
-
-subdir('base')
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 8734af36aa..e1e3149d8f 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -6,7 +6,6 @@ drivers = [
         'cpt',
         'dpaax',
         'iavf',
-        'idpf',
         'ionic',
         'mvep',
         'octeontx',
diff --git a/drivers/net/intel/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
index 87fcfe0bb1..1f0269d50b 100644
--- a/drivers/net/intel/cpfl/meson.build
+++ b/drivers/net/intel/cpfl/meson.build
@@ -11,7 +11,7 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-deps += ['hash', 'common_idpf']
+deps += ['hash', 'net_idpf']
 
 sources = files(
         'cpfl_ethdev.c',
diff --git a/drivers/common/idpf/base/README b/drivers/net/intel/idpf/base/README
similarity index 100%
rename from drivers/common/idpf/base/README
rename to drivers/net/intel/idpf/base/README
diff --git a/drivers/common/idpf/base/idpf_alloc.h b/drivers/net/intel/idpf/base/idpf_alloc.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_alloc.h
rename to drivers/net/intel/idpf/base/idpf_alloc.h
diff --git a/drivers/common/idpf/base/idpf_controlq.c b/drivers/net/intel/idpf/base/idpf_controlq.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.c
rename to drivers/net/intel/idpf/base/idpf_controlq.c
diff --git a/drivers/common/idpf/base/idpf_controlq.h b/drivers/net/intel/idpf/base/idpf_controlq.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq.h
rename to drivers/net/intel/idpf/base/idpf_controlq.h
diff --git a/drivers/common/idpf/base/idpf_controlq_api.h b/drivers/net/intel/idpf/base/idpf_controlq_api.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_api.h
rename to drivers/net/intel/idpf/base/idpf_controlq_api.h
diff --git a/drivers/common/idpf/base/idpf_controlq_setup.c b/drivers/net/intel/idpf/base/idpf_controlq_setup.c
similarity index 100%
rename from drivers/common/idpf/base/idpf_controlq_setup.c
rename to drivers/net/intel/idpf/base/idpf_controlq_setup.c
diff --git a/drivers/common/idpf/base/idpf_devids.h b/drivers/net/intel/idpf/base/idpf_devids.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_devids.h
rename to drivers/net/intel/idpf/base/idpf_devids.h
diff --git a/drivers/common/idpf/base/idpf_lan_pf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_pf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_pf_regs.h
diff --git a/drivers/common/idpf/base/idpf_lan_txrx.h b/drivers/net/intel/idpf/base/idpf_lan_txrx.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_txrx.h
rename to drivers/net/intel/idpf/base/idpf_lan_txrx.h
diff --git a/drivers/common/idpf/base/idpf_lan_vf_regs.h b/drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_lan_vf_regs.h
rename to drivers/net/intel/idpf/base/idpf_lan_vf_regs.h
diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/net/intel/idpf/base/idpf_osdep.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_osdep.h
rename to drivers/net/intel/idpf/base/idpf_osdep.h
diff --git a/drivers/common/idpf/base/idpf_prototype.h b/drivers/net/intel/idpf/base/idpf_prototype.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_prototype.h
rename to drivers/net/intel/idpf/base/idpf_prototype.h
diff --git a/drivers/common/idpf/base/idpf_type.h b/drivers/net/intel/idpf/base/idpf_type.h
similarity index 100%
rename from drivers/common/idpf/base/idpf_type.h
rename to drivers/net/intel/idpf/base/idpf_type.h
diff --git a/drivers/common/idpf/base/meson.build b/drivers/net/intel/idpf/base/meson.build
similarity index 100%
rename from drivers/common/idpf/base/meson.build
rename to drivers/net/intel/idpf/base/meson.build
diff --git a/drivers/common/idpf/base/siov_regs.h b/drivers/net/intel/idpf/base/siov_regs.h
similarity index 100%
rename from drivers/common/idpf/base/siov_regs.h
rename to drivers/net/intel/idpf/base/siov_regs.h
diff --git a/drivers/common/idpf/base/virtchnl2.h b/drivers/net/intel/idpf/base/virtchnl2.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2.h
rename to drivers/net/intel/idpf/base/virtchnl2.h
diff --git a/drivers/common/idpf/base/virtchnl2_lan_desc.h b/drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
similarity index 100%
rename from drivers/common/idpf/base/virtchnl2_lan_desc.h
rename to drivers/net/intel/idpf/base/virtchnl2_lan_desc.h
diff --git a/drivers/common/idpf/idpf_common_device.c b/drivers/net/intel/idpf/idpf_common_device.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.c
rename to drivers/net/intel/idpf/idpf_common_device.c
diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/net/intel/idpf/idpf_common_device.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_device.h
rename to drivers/net/intel/idpf/idpf_common_device.h
diff --git a/drivers/common/idpf/idpf_common_logs.h b/drivers/net/intel/idpf/idpf_common_logs.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_logs.h
rename to drivers/net/intel/idpf/idpf_common_logs.h
diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/net/intel/idpf/idpf_common_rxtx.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.c
rename to drivers/net/intel/idpf/idpf_common_rxtx.c
diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/net/intel/idpf/idpf_common_rxtx.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx.h
rename to drivers/net/intel/idpf/idpf_common_rxtx.h
diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_rxtx_avx512.c
rename to drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.c b/drivers/net/intel/idpf/idpf_common_virtchnl.c
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.c
rename to drivers/net/intel/idpf/idpf_common_virtchnl.c
diff --git a/drivers/common/idpf/idpf_common_virtchnl.h b/drivers/net/intel/idpf/idpf_common_virtchnl.h
similarity index 100%
rename from drivers/common/idpf/idpf_common_virtchnl.h
rename to drivers/net/intel/idpf/idpf_common_virtchnl.h
diff --git a/drivers/net/intel/idpf/meson.build b/drivers/net/intel/idpf/meson.build
index 34cbdc4da0..52405b5b35 100644
--- a/drivers/net/intel/idpf/meson.build
+++ b/drivers/net/intel/idpf/meson.build
@@ -7,13 +7,29 @@ if is_windows
     subdir_done()
 endif
 
-deps += ['common_idpf']
+includes += include_directories('../../../common/iavf')
 
 sources = files(
+        'idpf_common_device.c',
+        'idpf_common_rxtx.c',
+        'idpf_common_virtchnl.c',
+
         'idpf_ethdev.c',
         'idpf_rxtx.c',
 )
 
-if arch_subdir == 'x86'and cc_has_avx512
+if arch_subdir == 'x86' and cc_has_avx512
     cflags += ['-DCC_AVX512_SUPPORT']
+    avx512_args = cflags + cc_avx512_flags
+    if cc.has_argument('-march=skylake-avx512')
+        avx512_args += '-march=skylake-avx512'
+    endif
+    idpf_common_avx512_lib = static_library('idpf_common_avx512_lib',
+            'idpf_common_rxtx_avx512.c',
+            dependencies: static_rte_mbuf,
+            include_directories: includes,
+            c_args: avx512_args)
+    objs += idpf_common_avx512_lib.extract_objects('idpf_common_rxtx_avx512.c')
 endif
+
+subdir('base')
diff --git a/drivers/common/idpf/version.map b/drivers/net/intel/idpf/version.map
similarity index 100%
rename from drivers/common/idpf/version.map
rename to drivers/net/intel/idpf/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 02a3f5a0b6..bcf6f9dc73 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,7 +24,6 @@ drivers = [
         'gve',
         'hinic',
         'hns3',
-        'intel/cpfl',
         'intel/e1000',
         'intel/fm10k',
         'intel/i40e',
@@ -34,6 +33,7 @@ drivers = [
         'intel/igc',
         'intel/ipn3ke',
         'intel/ixgbe',
+        'intel/cpfl',  # depends on idpf, so must come after it
         'ionic',
         'mana',
         'memif',
-- 
2.43.0


^ permalink raw reply	[relevance 2%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-29  5:04  0%                       ` Naga Harish K, S V
@ 2025-01-29  7:43  4%                         ` Jerin Jacob
  2025-01-30 15:30  0%                           ` Naga Harish K, S V
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2025-01-29  7:43 UTC (permalink / raw)
  To: Naga Harish K, S V, Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Liang Ma, Mccarthy, Peter,
	Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Naga Harish K, S V <s.v.naga.harish.k@intel.com>
> Sent: Wednesday, January 29, 2025 10:35 AM
> To: Shijith Thotton <sthotton@marvell.com>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak, Pravin
> <pravin.pathak@intel.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Sachin Saxena <sachin.saxena@nxp.com>; Mattias Rönnblom
> <mattias.ronnblom@ericsson.com>; Jerin Jacob <jerinj@marvell.com>; Liang
> Ma <liangma@liangbit.com>; Mccarthy, Peter <peter.mccarthy@intel.com>;
> Van Haaren, Harry <harry.van.haaren@intel.com>; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; Gujjar, Abhinandan S
> <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: [EXTERNAL] RE: [RFC PATCH] eventdev: adapter API to configure
> multiple Rx queues
> > >
> > >This requires a change to the rte_event_eth_rx_adapter_queue_add()
> > >stable API parameters.
> > >This is an ABI breakage and may not be possible now.
> > >It requires changes to many current applications that are using the
> > >rte_event_eth_rx_adapter_queue_add() stable API.
> > >
> >
> > What I meant by mapping was to retain the stable API parameters as they are.
> > Internally, the API can use the proposed eventdev PMD operation
> > (eth_rx_adapter_queues_add) without causing an ABI break, as shown below.
> >
> > int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
> >                 int32_t rx_queue_id,
> >                 const struct rte_event_eth_rx_adapter_queue_conf *conf) {
> >         if (rx_queue_id == -1)
> >                 return (*dev->dev_ops->eth_rx_adapter_queues_add)(
> >                         dev, &rte_eth_devices[eth_dev_id], NULL,
> >                         conf, 0);
> >         else
> >                 return (*dev->dev_ops->eth_rx_adapter_queues_add)(
> >                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
> >                         conf, 1);
> > }
> >
> > With above change, old op (eth_rx_adapter_queue_add) can be removed as
> > both API (stable and proposed) will be using eth_rx_adapter_queues_add.


Since this thread is not converging, and it looks like that is due to confusion,
I am trying to summarize my understanding to define the next steps (if needed, we can take this to the tech board if there is no consensus).


Problem statement:
==================
1) Implementation of rte_event_eth_rx_adapter_queue_add() in HW typically uses an administrative
function to enable it; typically, it translates to sending a mailbox message to the PF driver, etc.
So, this function takes "time" to complete in HW implementations.
2) For SW implementations, this won't take time, as there are no other actors involved.
3) There are customer use cases that make 300+ rte_event_eth_rx_adapter_queue_add() calls at
application bootup, which introduces significant boot time for the application.
The number of queues is a function of the number of ethdev ports, the number of ethdev Rx queues
per port, and the number of event queues.


Expected outcome of problem statement:
======================================
1) In cases where the application knows the queue mapping (typically at boot time),
it can call a burst variant of rte_event_eth_rx_adapter_queue_add() to amortize the cost.
A similar scheme is used in DPDK control path APIs where latency is critical,
such as rte_acl_add_rules() or rte_flow via the template scheme.
2) The solution should not break ABI or have any impact on SW drivers.
3) Avoid duplicating code as much as possible.


Proposed solution:
==================
1) Update the eventdev_eth_rx_adapter_queue_add_t() PMD (internal ABI) API to take burst parameters.
2) Add a new rte_event_eth_rx_adapter_queue*s*_add() function and wire it to use the updated PMD API.
3) Implement rte_event_eth_rx_adapter_queue_add() as rte_event_eth_rx_adapter_queue*s*_add(...., 1).

If so, I am not sure what the cons of this approach are; it allows optimized applications when:
a) The application knows the queue mapping in advance (typically at boot time)
b) HW drivers can optimize without breaking anything for SW drivers
c) Applications can decide between burst and non-burst based on their needs and performance requirements

^ permalink raw reply	[relevance 4%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-24 10:00  3%                     ` Shijith Thotton
@ 2025-01-29  5:04  0%                       ` Naga Harish K, S V
  2025-01-29  7:43  4%                         ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Naga Harish K, S V @ 2025-01-29  5:04 UTC (permalink / raw)
  To: Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Jerin Jacob, Liang Ma, Mccarthy,
	Peter, Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Shijith Thotton <sthotton@marvell.com>
> Sent: Friday, January 24, 2025 3:30 PM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Jerin Jacob
> <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy, Peter
> <peter.mccarthy@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>
> Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx
> queues
> 
> >> >> >> >> >>> This RFC introduces a new API,
> >> >> >> >> >>> rte_event_eth_rx_adapter_queues_add(),
> >> >> >> >> >>> designed to enhance the flexibility of configuring
> >> >> >> >> >>> multiple Rx queues in eventdev Rx adapter.
> >> >> >> >> >>>
> >> >> >> >> >>> The existing rte_event_eth_rx_adapter_queue_add() API
> >> >> >> >> >>> supports adding multiple queues by specifying
> >> >> >> >> >>> rx_queue_id = -1, but it lacks the ability to
> >> >> >> >> >apply
> >> >> >> >> >>> specific configurations to each of the added queues.
> >> >> >> >> >>>
> >> >> >> >> >>
> >> >> >> >> >>The application can still use the existing
> >> >> >> >> >>rte_event_eth_rx_adapter_queue_add() API in a loop with
> >> >> >> >> >>different configurations for different queues.
> >> >> >> >> >>
> >> >> >> >> >>The proposed API is not enabling new features that cannot
> >> >> >> >> >>be achieved with the existing API.
> >> >> >> >> >>Adding new APIs without much usefulness causes unnecessary
> >> >> >> >> >>complexity/confusion for users.
> >> >> >> >> >>
> >> >>
> >> >> The eth_rx_adapter_queue_add eventdev PMD operation can be
> updated
> >> to
> >> >> support burst mode. Internally, both the new and existing APIs can
> >> >> utilize this updated operation. This enables applications to use
> >> >> either API and achieve
> >> >the
> >> >> same results while adding a single queue. For adding multiple RX
> >> >> queues to
> >> >the
> >> >> adapter, the new API can be used as it is not supported by the old API.
> >> >>
> >> >
> >> >Not all platforms implement the eventdev PMD operation for
> >> >eth_rx_adapter_queue_add, so this does not apply to all platforms.
> >> >
> >>
> >> Yes, but there are hardware PMDs that implement
> >eth_rx_adapter_queue_add
> >> op, and I am looking for a solution that works for both cases.
> >>
> >> The idea is to use the new eventdev PMD operation
> >> (eth_rx_adapter_queues_add) within the
> >> rte_event_eth_rx_adapter_queue_add() API. The parameters of this API
> >> can be easily mapped to and supported by the new PMD operation.
> >>
> >
> >This requires a change to the rte_event_eth_rx_adapter_queue_add()
> >stable API parameters.
> >This is an ABI breakage and may not be possible now.
> >It requires changes to many current applications that are using the
> >rte_event_eth_rx_adapter_queue_add() stable API.
> >
> 
> What I meant by mapping was to retain the stable API parameters as they are.
> Internally, the API can use the proposed eventdev PMD operation
> (eth_rx_adapter_queues_add) without causing an ABI break, as shown below.
> 
> int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
>                 int32_t rx_queue_id,
>                 const struct rte_event_eth_rx_adapter_queue_conf *conf) {
>         if (rx_queue_id == -1)
>                 return (*dev->dev_ops->eth_rx_adapter_queues_add)(
>                         dev, &rte_eth_devices[eth_dev_id], NULL,
>                         conf, 0);
>         else
>                 return (*dev->dev_ops->eth_rx_adapter_queues_add)(
>                         dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
>                         conf, 1);
> }
> 
> With above change, old op (eth_rx_adapter_queue_add) can be removed as
> both API (stable and proposed) will be using eth_rx_adapter_queues_add.
> 

The whole idea is not to have the proposed API, as it does not add any new feature but is just a combination of existing APIs.
This has already been discussed in the previous threads.
The internal implementation details are not a concern.

> >> typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
> >>     const struct rte_eventdev *dev,
> >>     const struct rte_eth_dev *eth_dev,
> >>     int32_t rx_queue_id[],
> >>     const struct rte_event_eth_rx_adapter_queue_conf queue_conf[],
> >>     uint16_t nb_rx_queues);
> >>
> >> With this, the old PMD op (eth_rx_adapter_queue_add) can be removed.
> >>
> >> >> >> >> >
> >> >> >> >> >The new API was introduced because the existing API does
> >> >> >> >> >not support adding multiple queues with specific configurations.
> >> >> >> >> >It serves as a burst variant of the existing API, like many
> >> >> >> >> >other APIs in
> >> >> DPDK.
> >> >> >> >> >
> >> >> >> >
> >> >> >> >The other burst APIs may be there for dataplane
> >> >> >> >functionalities, but may not be for the control plane functionalities.
> >> >> >> >
> >> >> >>
> >> >> >> rte_acl_add_rules() is an example of burst API in control path.
> >> >> >>
> >> >> >
> >> >> >I mean, In general, burst APIs are for data-plane functions.
> >> >> >This may be one of the rare cases where a burst API is in the control
> path.
> >> >> >
> >> >> >> >> >For better clarity, the API can be renamed to
> >> >> >> >> >rte_event_eth_rx_adapter_queue_add_burst() if needed.
> >> >> >> >> >
> >> >> >> >> >In hardware, adding each queue individually incurs
> >> >> >> >> >significant overheads, such as mailbox operations. A burst
> >> >> >> >> >API helps to amortize this overhead. Since real- world
> >> >> >> >> >applications often call the API with specific queue_ids,
> >> >> >> >> >the burst API can provide considerable
> >> >> benefits.
> >> >> >> >> >Testing shows a 75% reduction in time when adding multiple
> >> >> >> >> >queues to the RX adapter using the burst API on our platform.
> >> >> >> >> >
> >> >> >> >
> >> >> >> > As batching helps for a particular hardware device, this may
> >> >> >> >not be applicable for all platforms/cases.
> >> >> >> >	Since queue_add is a control plane operation, latency may not
> >be
> >> >> >> >a concern.
> >> >> >>
> >> >> >> In certain use cases, these APIs can be considered semi-fast path.
> >> >> >> For
> >> >> >instance,
> >> >> >> in an application that hotplugs a port on demand, configuring
> >> >> >> all available queues simultaneously can significantly reduce latency.
> >> >> >>
> >> >> >
> >> >> >As said earlier, this latency reduction (when trying to add
> >> >> >multiple RX queues to the Event Ethernet Rx adapter) may not
> >> >> >apply to all
> >> >> platforms/cases.
> >> >> >This API is not for configuring queues but for adding the queues
> >> >> >to the RX adapter.
> >> >> >
> >> >> >> >How to specify a particular set(specific queue_ids) of
> >> >> >> >rx_queues that has a non- zero start index with the new proposed
> API?
> >> >> >>
> >> >> >> In the proposed API,
> >> >> >> int rte_event_eth_rx_adapter_queues_add(
> >> >> >>                         uint8_t id, uint16_t eth_dev_id, int32_t rx_queue_id[],
> >> >> >>                         const struct rte_event_eth_rx_adapter_queue_conf
> conf[],
> >> >> >>                         uint16_t nb_rx_queues); rx_queues_id is
> >> >> >> an array containing the receive queues ids, which can start
> >> >> >> from a non-zero value. The array index is used solely to locate
> >> >> >> the corresponding queue_conf. For example, rx_queues_id[i] will
> >> >> >> use
> >conf[i].
> >> >> >>
> >> >> >
> >> >> >Ok
> >> >> >
> >> >> >> >	Since this is still not possible with the proposed API, the
> >> >> >> >existing queue_add API needs to be used with specific
> >> >> >> >queue_ids and their configurations.
> >> >> >> >
> >> >> >> >> >I can modify the old API implementation to act as a wrapper
> >> >> >> >> >around the burst API, with number of queues equal to 1. If
> >> >> >> >> >concerns remain, we can explore deprecation as an alternative.
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> Please let me know if you have any suggestions/feedback on
> >> >> >> >> what I said above.
> >> >> >> >
> >> >> >> >Still feel the new proposed API can be avoided as it looks
> >> >> >> >like a different combination of existing API instead of adding
> >> >> >> >some new
> >> >features.
> >> >> >> >
> >> >> >> >> If not, I can go ahead and send v1.
> >> >> >> >>
> >> >> >> >> >>> The proposed API, rte_event_eth_rx_adapter_queues_add,
> >> >> >> >> >>> addresses this limitation by:
> >> >> >> >> >>>
> >> >> >> >> >>> - Enabling users to specify an array of rx_queue_id
> >> >> >> >> >>> values
> >> alongside
> >> >> >> >> >>>   individual configurations for each queue.
> >> >> >> >> >>>
> >> >> >> >> >>> - Supporting a nb_rx_queues argument to define the
> >> >> >> >> >>> number of queues
> >> >> >> >to
> >> >> >> >> >>>   configure. When set to 0, the API applies a common
> >> >> >> >> >>> configuration
> >> >> to
> >> >> >> >> >>>   all queues, similar to the existing rx_queue_id = -1 behavior.
> >> >> >> >> >>>
> >> >> >> >> >>> This enhancement allows for more granular control when
> >> >> >> >> >>> configuring
> >> >> >> >> >multiple
> >> >> >> >> >>> Rx queues. Additionally, the API can act as a
> >> >> >> >> >>> replacement for the older API, offering both flexibility
> >> >> >> >> >>> and improved
> >> functionality.
> >> >> >> >> >>>
> >> >> >> >> >>> Signed-off-by: Shijith Thotton <sthotton@marvell.com>
> >> >> >> >> >>> ---
> >> >> >> >> >>>  lib/eventdev/eventdev_pmd.h             | 34
> >> >> >> >> +++++++++++++++++++++++++
> >> >> >> >> >>>  lib/eventdev/rte_event_eth_rx_adapter.h | 34
> >> >> >> >> >>> +++++++++++++++++++++++++
> >> >> >> >> >>>  2 files changed, 68 insertions(+)
> >> >> >> >> >>>
> >> >> >> >> >>> diff --git a/lib/eventdev/eventdev_pmd.h
> >> >> >> >> >>> b/lib/eventdev/eventdev_pmd.h index
> >> 36148f8d86..2e458a9779
> >> >> >> >> 100644
> >> >> >> >> >>> --- a/lib/eventdev/eventdev_pmd.h
> >> >> >> >> >>> +++ b/lib/eventdev/eventdev_pmd.h
> >> >> >> >> >>> @@ -25,6 +25,7 @@
> >> >> >> >> >>>  #include <rte_mbuf_dyn.h>
> >> >> >> >> >>>
> >> >> >> >> >>>  #include "event_timer_adapter_pmd.h"
> >> >> >> >> >>> +#include "rte_event_eth_rx_adapter.h"
> >> >> >> >> >>>  #include "rte_eventdev.h"
> >> >> >> >> >>>
> >> >> >> >> >>>  #ifdef __cplusplus
> >> >> >> >> >>> @@ -708,6 +709,37 @@ typedef int
> >> >> >> >> >>> (*eventdev_eth_rx_adapter_queue_add_t)(
> >> >> >> >> >>>  		int32_t rx_queue_id,
> >> >> >> >> >>>  		const struct
> rte_event_eth_rx_adapter_queue_conf
> >> >> >> >> >>> *queue_conf);
> >> >> >> >> >>>
> >> >> >> >> >>> +/**
> >> >> >> >> >>> + * Add ethernet Rx queues to event device. This
> >> >> >> >> >>> +callback is invoked if
> >> >> >> >> >>> + * the caps returned from
> >> >> >> >> >>> +rte_eventdev_eth_rx_adapter_caps_get(,
> >> >> >> >> >>> +eth_port_id)
> >> >> >> >> >>> + * has RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT
> >set.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param dev
> >> >> >> >> >>> + *   Event device pointer
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param eth_dev
> >> >> >> >> >>> + *   Ethernet device pointer
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param rx_queue_id
> >> >> >> >> >>> + *   Ethernet device receive queue index array
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param queue_conf
> >> >> >> >> >>> + *   Additional configuration structure array
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param nb_rx_queues
> >> >> >> >> >>> + *   Number of ethernet device receive queues
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @return
> >> >> >> >> >>> + *   - 0: Success, ethernet receive queues added successfully.
> >> >> >> >> >>> + *   - <0: Error code returned by the driver function.
> >> >> >> >> >>> + */
> >> >> >> >> >>> +typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
> >> >> >> >> >>> +		const struct rte_eventdev *dev,
> >> >> >> >> >>> +		const struct rte_eth_dev *eth_dev,
> >> >> >> >> >>> +		int32_t rx_queue_id[],
> >> >> >> >> >>> +		const struct
> rte_event_eth_rx_adapter_queue_conf
> >> >> >> >> >>> queue_conf[],
> >> >> >> >> >>> +		uint16_t nb_rx_queues);
> >> >> >> >> >>> +
> >> >> >> >> >>>  /**
> >> >> >> >> >>>   * Delete ethernet Rx queues from event device. This
> >> >> >> >> >>> callback is
> >> >> >invoked
> >> >> >> if
> >> >> >> >> >>>   * the caps returned from
> >> >> >> >> >>> eventdev_eth_rx_adapter_caps_get(,
> >> >> >> >> >eth_port_id)
> >> >> >> >> >>> @@ -1578,6 +1610,8 @@ struct eventdev_ops {
> >> >> >> >> >>>  	/**< Get ethernet Rx adapter capabilities */
> >> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_add_t
> >> >> >eth_rx_adapter_queue_add;
> >> >> >> >> >>>  	/**< Add Rx queues to ethernet Rx adapter */
> >> >> >> >> >>> +	eventdev_eth_rx_adapter_queues_add_t
> >> >> >> >> >>> eth_rx_adapter_queues_add;
> >> >> >> >> >>> +	/**< Add Rx queues to ethernet Rx adapter */
> >> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_del_t
> >> >> >eth_rx_adapter_queue_del;
> >> >> >> >> >>>  	/**< Delete Rx queues from ethernet Rx adapter */
> >> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_conf_get_t
> >> >> >> >> >>> eth_rx_adapter_queue_conf_get; diff --git
> >> >> >> >> >>> a/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >> >>> b/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >> >>> index 9237e198a7..9a5c560b67 100644
> >> >> >> >> >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >> >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >> >>> @@ -553,6 +553,40 @@ int
> >> >> >> >> rte_event_eth_rx_adapter_queue_add(uint8_t
> >> >> >> >> >>> id,
> >> >> >> >> >>>  			int32_t rx_queue_id,
> >> >> >> >> >>>  			const struct
> >> >> >rte_event_eth_rx_adapter_queue_conf
> >> >> >> >> >>> *conf);
> >> >> >> >> >>>
> >> >> >> >> >>> +/**
> >> >> >> >> >>> + * Add multiple receive queues to an event adapter.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param id
> >> >> >> >> >>> + *  Adapter identifier.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param eth_dev_id
> >> >> >> >> >>> + *  Port identifier of Ethernet device.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param rx_queue_id
> >> >> >> >> >>> + *  Array of Ethernet device receive queue indices.
> >> >> >> >> >>> + *  If nb_rx_queues is 0, then rx_queue_id is ignored.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param conf
> >> >> >> >> >>> + *  Array of additional configuration structures of
> >> >> >> >> >>> +type
> >> >> >> >> >>> + *  *rte_event_eth_rx_adapter_queue_conf*. conf[i] is
> >> >> >> >> >>> +used for
> >> >> >> >> >>> rx_queue_id[i].
> >> >> >> >> >>> + *  If nb_rx_queues is 0, then conf[0] is used for all Rx queues.
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @param nb_rx_queues
> >> >> >> >> >>> + *  Number of receive queues to add.
> >> >> >> >> >>> + *  If nb_rx_queues is 0, then all Rx queues configured
> >> >> >> >> >>> +for
> >> >> >> >> >>> + *  the device are added with the same configuration in
> conf[0].
> >> >> >> >> >>> + * @see RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ
> >> >> >> >> >>> + *
> >> >> >> >> >>> + * @return
> >> >> >> >> >>> + *  - 0: Success, Receive queues added correctly.
> >> >> >> >> >>> + *  - <0: Error code on failure.
> >> >> >> >> >>> + */
> >> >> >> >> >>> +__rte_experimental
> >> >> >> >> >>> +int rte_event_eth_rx_adapter_queues_add(
> >> >> >> >> >>> +			uint8_t id, uint16_t eth_dev_id,
> int32_t
> >> >> >> >> >>> rx_queue_id[],
> >> >> >> >> >>> +			const struct
> >> >> >> rte_event_eth_rx_adapter_queue_conf
> >> >> >> >> >>> conf[],
> >> >> >> >> >>> +			uint16_t nb_rx_queues);
> >> >> >> >> >>> +
> >> >> >> >> >>>  /**
> >> >> >> >> >>>   * Delete receive queue from an event adapter.
> >> >> >> >> >>>   *
> >> >> >> >> >>> --
> >> >> >> >> >>> 2.25.1


^ permalink raw reply	[relevance 0%]

* [PATCH v6 01/25] net: move intel drivers to intel subdirectory
  @ 2025-01-24 16:28  1%   ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2025-01-24 16:28 UTC (permalink / raw)
  To: dev
  Cc: david.marchand, anatoly.burakov, vladimir.medvedkin, ian.stokes,
	praveen.shetty, Bruce Richardson

Consolidate all Intel HW NIC drivers into drivers/net/intel. This
matches the layout used for drivers in the kernel, and potentially
enables easier sharing among drivers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 MAINTAINERS                                   | 20 +++++++++----------
 devtools/check-git-log.sh                     |  9 +++++++++
 doc/api/doxy-api.conf.in                      |  6 +++---
 doc/guides/nics/ice.rst                       |  2 +-
 doc/guides/rel_notes/release_25_03.rst        |  7 +++++++
 drivers/meson.build                           |  6 +++++-
 drivers/net/{ => intel}/cpfl/cpfl_actions.h   |  0
 drivers/net/{ => intel}/cpfl/cpfl_controlq.c  |  0
 drivers/net/{ => intel}/cpfl/cpfl_controlq.h  |  0
 drivers/net/{ => intel}/cpfl/cpfl_cpchnl.h    |  0
 drivers/net/{ => intel}/cpfl/cpfl_ethdev.c    |  0
 drivers/net/{ => intel}/cpfl/cpfl_ethdev.h    |  0
 drivers/net/{ => intel}/cpfl/cpfl_flow.c      |  0
 drivers/net/{ => intel}/cpfl/cpfl_flow.h      |  0
 .../{ => intel}/cpfl/cpfl_flow_engine_fxp.c   |  0
 .../net/{ => intel}/cpfl/cpfl_flow_parser.c   |  0
 .../net/{ => intel}/cpfl/cpfl_flow_parser.h   |  0
 drivers/net/{ => intel}/cpfl/cpfl_fxp_rule.c  |  0
 drivers/net/{ => intel}/cpfl/cpfl_fxp_rule.h  |  0
 drivers/net/{ => intel}/cpfl/cpfl_logs.h      |  0
 .../net/{ => intel}/cpfl/cpfl_representor.c   |  0
 .../net/{ => intel}/cpfl/cpfl_representor.h   |  0
 drivers/net/{ => intel}/cpfl/cpfl_rules.c     |  0
 drivers/net/{ => intel}/cpfl/cpfl_rules.h     |  0
 drivers/net/{ => intel}/cpfl/cpfl_rxtx.c      |  0
 drivers/net/{ => intel}/cpfl/cpfl_rxtx.h      |  0
 .../{ => intel}/cpfl/cpfl_rxtx_vec_common.h   |  0
 drivers/net/{ => intel}/cpfl/cpfl_vchnl.c     |  0
 drivers/net/{ => intel}/cpfl/meson.build      |  0
 drivers/net/{ => intel}/e1000/base/README     |  0
 .../e1000/base/e1000_80003es2lan.c            |  0
 .../e1000/base/e1000_80003es2lan.h            |  0
 .../net/{ => intel}/e1000/base/e1000_82540.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82541.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82541.h  |  0
 .../net/{ => intel}/e1000/base/e1000_82542.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82543.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82543.h  |  0
 .../net/{ => intel}/e1000/base/e1000_82571.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82571.h  |  0
 .../net/{ => intel}/e1000/base/e1000_82575.c  |  0
 .../net/{ => intel}/e1000/base/e1000_82575.h  |  0
 .../net/{ => intel}/e1000/base/e1000_api.c    |  0
 .../net/{ => intel}/e1000/base/e1000_api.h    |  0
 .../net/{ => intel}/e1000/base/e1000_base.c   |  0
 .../net/{ => intel}/e1000/base/e1000_base.h   |  0
 .../{ => intel}/e1000/base/e1000_defines.h    |  0
 drivers/net/{ => intel}/e1000/base/e1000_hw.h |  0
 .../net/{ => intel}/e1000/base/e1000_i210.c   |  0
 .../net/{ => intel}/e1000/base/e1000_i210.h   |  0
 .../{ => intel}/e1000/base/e1000_ich8lan.c    |  0
 .../{ => intel}/e1000/base/e1000_ich8lan.h    |  0
 .../net/{ => intel}/e1000/base/e1000_mac.c    |  0
 .../net/{ => intel}/e1000/base/e1000_mac.h    |  0
 .../net/{ => intel}/e1000/base/e1000_manage.c |  0
 .../net/{ => intel}/e1000/base/e1000_manage.h |  0
 .../net/{ => intel}/e1000/base/e1000_mbx.c    |  0
 .../net/{ => intel}/e1000/base/e1000_mbx.h    |  0
 .../net/{ => intel}/e1000/base/e1000_nvm.c    |  0
 .../net/{ => intel}/e1000/base/e1000_nvm.h    |  0
 .../net/{ => intel}/e1000/base/e1000_osdep.c  |  0
 .../net/{ => intel}/e1000/base/e1000_osdep.h  |  0
 .../net/{ => intel}/e1000/base/e1000_phy.c    |  0
 .../net/{ => intel}/e1000/base/e1000_phy.h    |  0
 .../net/{ => intel}/e1000/base/e1000_regs.h   |  0
 drivers/net/{ => intel}/e1000/base/e1000_vf.c |  0
 drivers/net/{ => intel}/e1000/base/e1000_vf.h |  0
 .../net/{ => intel}/e1000/base/meson.build    |  0
 drivers/net/{ => intel}/e1000/e1000_ethdev.h  |  0
 drivers/net/{ => intel}/e1000/e1000_logs.c    |  0
 drivers/net/{ => intel}/e1000/e1000_logs.h    |  0
 drivers/net/{ => intel}/e1000/em_ethdev.c     |  0
 drivers/net/{ => intel}/e1000/em_rxtx.c       |  0
 drivers/net/{ => intel}/e1000/igb_ethdev.c    |  0
 drivers/net/{ => intel}/e1000/igb_flow.c      |  0
 drivers/net/{ => intel}/e1000/igb_pf.c        |  0
 drivers/net/{ => intel}/e1000/igb_regs.h      |  0
 drivers/net/{ => intel}/e1000/igb_rxtx.c      |  0
 drivers/net/{ => intel}/e1000/meson.build     |  0
 .../net/{ => intel}/fm10k/base/fm10k_api.c    |  0
 .../net/{ => intel}/fm10k/base/fm10k_api.h    |  0
 .../net/{ => intel}/fm10k/base/fm10k_common.c |  0
 .../net/{ => intel}/fm10k/base/fm10k_common.h |  0
 .../net/{ => intel}/fm10k/base/fm10k_mbx.c    |  0
 .../net/{ => intel}/fm10k/base/fm10k_mbx.h    |  0
 .../net/{ => intel}/fm10k/base/fm10k_osdep.h  |  0
 drivers/net/{ => intel}/fm10k/base/fm10k_pf.c |  0
 drivers/net/{ => intel}/fm10k/base/fm10k_pf.h |  0
 .../net/{ => intel}/fm10k/base/fm10k_tlv.c    |  0
 .../net/{ => intel}/fm10k/base/fm10k_tlv.h    |  0
 .../net/{ => intel}/fm10k/base/fm10k_type.h   |  0
 drivers/net/{ => intel}/fm10k/base/fm10k_vf.c |  0
 drivers/net/{ => intel}/fm10k/base/fm10k_vf.h |  0
 .../net/{ => intel}/fm10k/base/meson.build    |  0
 drivers/net/{ => intel}/fm10k/fm10k.h         |  0
 drivers/net/{ => intel}/fm10k/fm10k_ethdev.c  |  0
 drivers/net/{ => intel}/fm10k/fm10k_logs.h    |  0
 drivers/net/{ => intel}/fm10k/fm10k_rxtx.c    |  0
 .../net/{ => intel}/fm10k/fm10k_rxtx_vec.c    |  0
 drivers/net/{ => intel}/fm10k/meson.build     |  0
 drivers/net/{ => intel}/i40e/base/README      |  0
 .../net/{ => intel}/i40e/base/i40e_adminq.c   |  0
 .../net/{ => intel}/i40e/base/i40e_adminq.h   |  0
 .../{ => intel}/i40e/base/i40e_adminq_cmd.h   |  0
 .../net/{ => intel}/i40e/base/i40e_alloc.h    |  0
 .../net/{ => intel}/i40e/base/i40e_common.c   |  0
 drivers/net/{ => intel}/i40e/base/i40e_dcb.c  |  0
 drivers/net/{ => intel}/i40e/base/i40e_dcb.h  |  0
 .../net/{ => intel}/i40e/base/i40e_devids.h   |  0
 drivers/net/{ => intel}/i40e/base/i40e_diag.c |  0
 drivers/net/{ => intel}/i40e/base/i40e_diag.h |  0
 drivers/net/{ => intel}/i40e/base/i40e_hmc.c  |  0
 drivers/net/{ => intel}/i40e/base/i40e_hmc.h  |  0
 .../net/{ => intel}/i40e/base/i40e_lan_hmc.c  |  0
 .../net/{ => intel}/i40e/base/i40e_lan_hmc.h  |  0
 drivers/net/{ => intel}/i40e/base/i40e_nvm.c  |  0
 .../net/{ => intel}/i40e/base/i40e_osdep.h    |  0
 .../{ => intel}/i40e/base/i40e_prototype.h    |  0
 .../net/{ => intel}/i40e/base/i40e_register.h |  0
 .../net/{ => intel}/i40e/base/i40e_status.h   |  0
 drivers/net/{ => intel}/i40e/base/i40e_type.h |  0
 drivers/net/{ => intel}/i40e/base/meson.build |  0
 drivers/net/{ => intel}/i40e/base/virtchnl.h  |  0
 drivers/net/{ => intel}/i40e/i40e_ethdev.c    |  0
 drivers/net/{ => intel}/i40e/i40e_ethdev.h    |  0
 drivers/net/{ => intel}/i40e/i40e_fdir.c      |  0
 drivers/net/{ => intel}/i40e/i40e_flow.c      |  0
 drivers/net/{ => intel}/i40e/i40e_hash.c      |  0
 drivers/net/{ => intel}/i40e/i40e_hash.h      |  0
 drivers/net/{ => intel}/i40e/i40e_logs.h      |  0
 drivers/net/{ => intel}/i40e/i40e_pf.c        |  0
 drivers/net/{ => intel}/i40e/i40e_pf.h        |  0
 .../i40e/i40e_recycle_mbufs_vec_common.c      |  0
 drivers/net/{ => intel}/i40e/i40e_regs.h      |  0
 drivers/net/{ => intel}/i40e/i40e_rxtx.c      |  0
 drivers/net/{ => intel}/i40e/i40e_rxtx.h      |  0
 .../{ => intel}/i40e/i40e_rxtx_common_avx.h   |  0
 .../{ => intel}/i40e/i40e_rxtx_vec_altivec.c  |  0
 .../net/{ => intel}/i40e/i40e_rxtx_vec_avx2.c |  0
 .../{ => intel}/i40e/i40e_rxtx_vec_avx512.c   |  0
 .../{ => intel}/i40e/i40e_rxtx_vec_common.h   |  0
 .../net/{ => intel}/i40e/i40e_rxtx_vec_neon.c |  0
 .../net/{ => intel}/i40e/i40e_rxtx_vec_sse.c  |  0
 drivers/net/{ => intel}/i40e/i40e_testpmd.c   |  0
 drivers/net/{ => intel}/i40e/i40e_tm.c        |  0
 .../{ => intel}/i40e/i40e_vf_representor.c    |  0
 drivers/net/{ => intel}/i40e/meson.build      |  0
 drivers/net/{ => intel}/i40e/rte_pmd_i40e.c   |  0
 drivers/net/{ => intel}/i40e/rte_pmd_i40e.h   |  0
 drivers/net/{ => intel}/i40e/version.map      |  0
 drivers/net/{ => intel}/iavf/iavf.h           |  0
 drivers/net/{ => intel}/iavf/iavf_ethdev.c    |  0
 drivers/net/{ => intel}/iavf/iavf_fdir.c      |  0
 drivers/net/{ => intel}/iavf/iavf_fsub.c      |  0
 .../net/{ => intel}/iavf/iavf_generic_flow.c  |  0
 .../net/{ => intel}/iavf/iavf_generic_flow.h  |  0
 drivers/net/{ => intel}/iavf/iavf_hash.c      |  0
 .../net/{ => intel}/iavf/iavf_ipsec_crypto.c  |  0
 .../net/{ => intel}/iavf/iavf_ipsec_crypto.h  |  0
 .../iavf/iavf_ipsec_crypto_capabilities.h     |  0
 drivers/net/{ => intel}/iavf/iavf_log.h       |  0
 drivers/net/{ => intel}/iavf/iavf_rxtx.c      |  0
 drivers/net/{ => intel}/iavf/iavf_rxtx.h      |  0
 .../net/{ => intel}/iavf/iavf_rxtx_vec_avx2.c |  0
 .../{ => intel}/iavf/iavf_rxtx_vec_avx512.c   |  0
 .../{ => intel}/iavf/iavf_rxtx_vec_common.h   |  0
 .../net/{ => intel}/iavf/iavf_rxtx_vec_neon.c |  0
 .../net/{ => intel}/iavf/iavf_rxtx_vec_sse.c  |  0
 drivers/net/{ => intel}/iavf/iavf_testpmd.c   |  0
 drivers/net/{ => intel}/iavf/iavf_tm.c        |  0
 drivers/net/{ => intel}/iavf/iavf_vchnl.c     |  0
 drivers/net/{ => intel}/iavf/meson.build      |  9 +++------
 drivers/net/{ => intel}/iavf/rte_pmd_iavf.h   |  0
 drivers/net/{ => intel}/iavf/version.map      |  0
 drivers/net/{ => intel}/ice/base/README       |  0
 drivers/net/{ => intel}/ice/base/ice_acl.c    |  0
 drivers/net/{ => intel}/ice/base/ice_acl.h    |  0
 .../net/{ => intel}/ice/base/ice_acl_ctrl.c   |  0
 .../net/{ => intel}/ice/base/ice_adminq_cmd.h |  0
 drivers/net/{ => intel}/ice/base/ice_alloc.h  |  0
 drivers/net/{ => intel}/ice/base/ice_bitops.h |  0
 .../net/{ => intel}/ice/base/ice_bst_tcam.c   |  0
 .../net/{ => intel}/ice/base/ice_bst_tcam.h   |  0
 .../net/{ => intel}/ice/base/ice_cgu_regs.h   |  0
 drivers/net/{ => intel}/ice/base/ice_common.c |  0
 drivers/net/{ => intel}/ice/base/ice_common.h |  0
 .../net/{ => intel}/ice/base/ice_controlq.c   |  0
 .../net/{ => intel}/ice/base/ice_controlq.h   |  0
 drivers/net/{ => intel}/ice/base/ice_dcb.c    |  0
 drivers/net/{ => intel}/ice/base/ice_dcb.h    |  0
 drivers/net/{ => intel}/ice/base/ice_ddp.c    |  0
 drivers/net/{ => intel}/ice/base/ice_ddp.h    |  0
 drivers/net/{ => intel}/ice/base/ice_defs.h   |  0
 drivers/net/{ => intel}/ice/base/ice_devids.h |  0
 drivers/net/{ => intel}/ice/base/ice_fdir.c   |  0
 drivers/net/{ => intel}/ice/base/ice_fdir.h   |  0
 .../net/{ => intel}/ice/base/ice_flex_pipe.c  |  0
 .../net/{ => intel}/ice/base/ice_flex_pipe.h  |  0
 .../net/{ => intel}/ice/base/ice_flex_type.h  |  0
 drivers/net/{ => intel}/ice/base/ice_flg_rd.c |  0
 drivers/net/{ => intel}/ice/base/ice_flg_rd.h |  0
 drivers/net/{ => intel}/ice/base/ice_flow.c   |  0
 drivers/net/{ => intel}/ice/base/ice_flow.h   |  0
 drivers/net/{ => intel}/ice/base/ice_fwlog.c  |  0
 drivers/net/{ => intel}/ice/base/ice_fwlog.h  |  0
 .../net/{ => intel}/ice/base/ice_hw_autogen.h |  0
 drivers/net/{ => intel}/ice/base/ice_imem.c   |  0
 drivers/net/{ => intel}/ice/base/ice_imem.h   |  0
 .../net/{ => intel}/ice/base/ice_lan_tx_rx.h  |  0
 .../net/{ => intel}/ice/base/ice_metainit.c   |  0
 .../net/{ => intel}/ice/base/ice_metainit.h   |  0
 drivers/net/{ => intel}/ice/base/ice_mk_grp.c |  0
 drivers/net/{ => intel}/ice/base/ice_mk_grp.h |  0
 drivers/net/{ => intel}/ice/base/ice_nvm.c    |  0
 drivers/net/{ => intel}/ice/base/ice_nvm.h    |  0
 drivers/net/{ => intel}/ice/base/ice_osdep.h  |  0
 drivers/net/{ => intel}/ice/base/ice_parser.c |  0
 drivers/net/{ => intel}/ice/base/ice_parser.h |  0
 .../net/{ => intel}/ice/base/ice_parser_rt.c  |  0
 .../net/{ => intel}/ice/base/ice_parser_rt.h  |  0
 .../{ => intel}/ice/base/ice_parser_util.h    |  0
 drivers/net/{ => intel}/ice/base/ice_pg_cam.c |  0
 drivers/net/{ => intel}/ice/base/ice_pg_cam.h |  0
 .../net/{ => intel}/ice/base/ice_phy_regs.h   |  0
 .../net/{ => intel}/ice/base/ice_proto_grp.c  |  0
 .../net/{ => intel}/ice/base/ice_proto_grp.h  |  0
 .../{ => intel}/ice/base/ice_protocol_type.h  |  0
 .../net/{ => intel}/ice/base/ice_ptp_consts.h |  0
 drivers/net/{ => intel}/ice/base/ice_ptp_hw.c |  0
 drivers/net/{ => intel}/ice/base/ice_ptp_hw.h |  0
 .../net/{ => intel}/ice/base/ice_ptype_mk.c   |  0
 .../net/{ => intel}/ice/base/ice_ptype_mk.h   |  0
 .../net/{ => intel}/ice/base/ice_sbq_cmd.h    |  0
 drivers/net/{ => intel}/ice/base/ice_sched.c  |  0
 drivers/net/{ => intel}/ice/base/ice_sched.h  |  0
 drivers/net/{ => intel}/ice/base/ice_status.h |  0
 drivers/net/{ => intel}/ice/base/ice_switch.c |  0
 drivers/net/{ => intel}/ice/base/ice_switch.h |  0
 drivers/net/{ => intel}/ice/base/ice_tmatch.h |  0
 drivers/net/{ => intel}/ice/base/ice_type.h   |  0
 drivers/net/{ => intel}/ice/base/ice_vf_mbx.c |  0
 drivers/net/{ => intel}/ice/base/ice_vf_mbx.h |  0
 .../net/{ => intel}/ice/base/ice_vlan_mode.c  |  0
 .../net/{ => intel}/ice/base/ice_vlan_mode.h  |  0
 drivers/net/{ => intel}/ice/base/ice_xlt_kb.c |  0
 drivers/net/{ => intel}/ice/base/ice_xlt_kb.h |  0
 drivers/net/{ => intel}/ice/base/meson.build  |  0
 drivers/net/{ => intel}/ice/ice_acl_filter.c  |  0
 drivers/net/{ => intel}/ice/ice_dcf.c         |  0
 drivers/net/{ => intel}/ice/ice_dcf.h         |  0
 drivers/net/{ => intel}/ice/ice_dcf_ethdev.c  |  0
 drivers/net/{ => intel}/ice/ice_dcf_ethdev.h  |  0
 drivers/net/{ => intel}/ice/ice_dcf_parent.c  |  0
 drivers/net/{ => intel}/ice/ice_dcf_sched.c   |  0
 .../{ => intel}/ice/ice_dcf_vf_representor.c  |  0
 drivers/net/{ => intel}/ice/ice_diagnose.c    |  0
 drivers/net/{ => intel}/ice/ice_ethdev.c      |  0
 drivers/net/{ => intel}/ice/ice_ethdev.h      |  0
 drivers/net/{ => intel}/ice/ice_fdir_filter.c |  0
 .../net/{ => intel}/ice/ice_generic_flow.c    |  0
 .../net/{ => intel}/ice/ice_generic_flow.h    |  0
 drivers/net/{ => intel}/ice/ice_hash.c        |  0
 drivers/net/{ => intel}/ice/ice_logs.h        |  0
 drivers/net/{ => intel}/ice/ice_rxtx.c        |  0
 drivers/net/{ => intel}/ice/ice_rxtx.h        |  0
 .../net/{ => intel}/ice/ice_rxtx_common_avx.h |  0
 .../net/{ => intel}/ice/ice_rxtx_vec_avx2.c   |  0
 .../net/{ => intel}/ice/ice_rxtx_vec_avx512.c |  0
 .../net/{ => intel}/ice/ice_rxtx_vec_common.h |  0
 .../net/{ => intel}/ice/ice_rxtx_vec_sse.c    |  0
 .../net/{ => intel}/ice/ice_switch_filter.c   |  0
 drivers/net/{ => intel}/ice/ice_testpmd.c     |  0
 drivers/net/{ => intel}/ice/ice_tm.c          |  0
 drivers/net/{ => intel}/ice/meson.build       |  2 +-
 drivers/net/{ => intel}/ice/version.map       |  0
 drivers/net/{ => intel}/idpf/idpf_ethdev.c    |  0
 drivers/net/{ => intel}/idpf/idpf_ethdev.h    |  0
 drivers/net/{ => intel}/idpf/idpf_logs.h      |  0
 drivers/net/{ => intel}/idpf/idpf_rxtx.c      |  0
 drivers/net/{ => intel}/idpf/idpf_rxtx.h      |  0
 .../{ => intel}/idpf/idpf_rxtx_vec_common.h   |  0
 drivers/net/{ => intel}/idpf/meson.build      |  0
 drivers/net/{ => intel}/igc/base/README       |  0
 drivers/net/{ => intel}/igc/base/igc_82571.h  |  0
 drivers/net/{ => intel}/igc/base/igc_82575.h  |  0
 drivers/net/{ => intel}/igc/base/igc_api.c    |  0
 drivers/net/{ => intel}/igc/base/igc_api.h    |  0
 drivers/net/{ => intel}/igc/base/igc_base.c   |  0
 drivers/net/{ => intel}/igc/base/igc_base.h   |  0
 .../net/{ => intel}/igc/base/igc_defines.h    |  0
 drivers/net/{ => intel}/igc/base/igc_hw.h     |  0
 drivers/net/{ => intel}/igc/base/igc_i225.c   |  0
 drivers/net/{ => intel}/igc/base/igc_i225.h   |  0
 .../net/{ => intel}/igc/base/igc_ich8lan.h    |  0
 drivers/net/{ => intel}/igc/base/igc_mac.c    |  0
 drivers/net/{ => intel}/igc/base/igc_mac.h    |  0
 drivers/net/{ => intel}/igc/base/igc_manage.c |  0
 drivers/net/{ => intel}/igc/base/igc_manage.h |  0
 drivers/net/{ => intel}/igc/base/igc_nvm.c    |  0
 drivers/net/{ => intel}/igc/base/igc_nvm.h    |  0
 drivers/net/{ => intel}/igc/base/igc_osdep.c  |  0
 drivers/net/{ => intel}/igc/base/igc_osdep.h  |  0
 drivers/net/{ => intel}/igc/base/igc_phy.c    |  0
 drivers/net/{ => intel}/igc/base/igc_phy.h    |  0
 drivers/net/{ => intel}/igc/base/igc_regs.h   |  0
 drivers/net/{ => intel}/igc/base/meson.build  |  0
 drivers/net/{ => intel}/igc/igc_ethdev.c      |  0
 drivers/net/{ => intel}/igc/igc_ethdev.h      |  0
 drivers/net/{ => intel}/igc/igc_filter.c      |  0
 drivers/net/{ => intel}/igc/igc_filter.h      |  0
 drivers/net/{ => intel}/igc/igc_flow.c        |  0
 drivers/net/{ => intel}/igc/igc_flow.h        |  0
 drivers/net/{ => intel}/igc/igc_logs.c        |  0
 drivers/net/{ => intel}/igc/igc_logs.h        |  0
 drivers/net/{ => intel}/igc/igc_txrx.c        |  0
 drivers/net/{ => intel}/igc/igc_txrx.h        |  0
 drivers/net/{ => intel}/igc/meson.build       |  0
 .../net/{ => intel}/ipn3ke/ipn3ke_ethdev.c    |  0
 .../net/{ => intel}/ipn3ke/ipn3ke_ethdev.h    |  0
 drivers/net/{ => intel}/ipn3ke/ipn3ke_flow.c  |  0
 drivers/net/{ => intel}/ipn3ke/ipn3ke_flow.h  |  0
 drivers/net/{ => intel}/ipn3ke/ipn3ke_logs.h  |  0
 .../{ => intel}/ipn3ke/ipn3ke_rawdev_api.h    |  0
 .../{ => intel}/ipn3ke/ipn3ke_representor.c   |  0
 drivers/net/{ => intel}/ipn3ke/ipn3ke_tm.c    |  0
 drivers/net/{ => intel}/ipn3ke/meson.build    |  2 +-
 drivers/net/{ => intel}/ipn3ke/version.map    |  0
 drivers/net/{ => intel}/ixgbe/base/README     |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_82598.c  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_82598.h  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_82599.c  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_82599.h  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_api.c    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_api.h    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_common.c |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_common.h |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_dcb.c    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_dcb.h    |  0
 .../{ => intel}/ixgbe/base/ixgbe_dcb_82598.c  |  0
 .../{ => intel}/ixgbe/base/ixgbe_dcb_82598.h  |  0
 .../{ => intel}/ixgbe/base/ixgbe_dcb_82599.c  |  0
 .../{ => intel}/ixgbe/base/ixgbe_dcb_82599.h  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_e610.c   |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_e610.h   |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_hv_vf.c  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_hv_vf.h  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_mbx.c    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_mbx.h    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_osdep.c  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_osdep.h  |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_phy.c    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_phy.h    |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_type.h   |  0
 .../{ => intel}/ixgbe/base/ixgbe_type_e610.h  |  0
 drivers/net/{ => intel}/ixgbe/base/ixgbe_vf.c |  0
 drivers/net/{ => intel}/ixgbe/base/ixgbe_vf.h |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_x540.c   |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_x540.h   |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_x550.c   |  0
 .../net/{ => intel}/ixgbe/base/ixgbe_x550.h   |  0
 .../net/{ => intel}/ixgbe/base/meson.build    |  0
 .../{ => intel}/ixgbe/ixgbe_82599_bypass.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_bypass.c  |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_bypass.h  |  0
 .../net/{ => intel}/ixgbe/ixgbe_bypass_api.h  |  0
 .../{ => intel}/ixgbe/ixgbe_bypass_defines.h  |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_ethdev.c  |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_ethdev.h  |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_fdir.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_flow.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_ipsec.c   |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_ipsec.h   |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_logs.h    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_pf.c      |  0
 .../ixgbe/ixgbe_recycle_mbufs_vec_common.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_regs.h    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_rxtx.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_rxtx.h    |  0
 .../{ => intel}/ixgbe/ixgbe_rxtx_vec_common.h |  0
 .../{ => intel}/ixgbe/ixgbe_rxtx_vec_neon.c   |  0
 .../{ => intel}/ixgbe/ixgbe_rxtx_vec_sse.c    |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_testpmd.c |  0
 drivers/net/{ => intel}/ixgbe/ixgbe_tm.c      |  0
 .../{ => intel}/ixgbe/ixgbe_vf_representor.c  |  0
 drivers/net/{ => intel}/ixgbe/meson.build     |  0
 drivers/net/{ => intel}/ixgbe/rte_pmd_ixgbe.c |  0
 drivers/net/{ => intel}/ixgbe/rte_pmd_ixgbe.h |  0
 drivers/net/{ => intel}/ixgbe/version.map     |  0
 drivers/net/meson.build                       | 20 +++++++++----------
 drivers/raw/ifpga/meson.build                 |  2 --
 usertools/dpdk-rss-flows.py                   |  4 ++--
 391 files changed, 52 insertions(+), 37 deletions(-)
 rename drivers/net/{ => intel}/cpfl/cpfl_actions.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_controlq.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_controlq.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_cpchnl.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_ethdev.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_ethdev.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_flow.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_flow.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_flow_engine_fxp.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_flow_parser.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_flow_parser.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_fxp_rule.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_fxp_rule.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_logs.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_representor.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_representor.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_rules.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_rules.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_rxtx.c (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_rxtx.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/cpfl/cpfl_vchnl.c (100%)
 rename drivers/net/{ => intel}/cpfl/meson.build (100%)
 rename drivers/net/{ => intel}/e1000/base/README (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_80003es2lan.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_80003es2lan.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82540.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82541.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82541.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82542.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82543.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82543.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82571.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82571.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82575.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_82575.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_api.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_api.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_base.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_base.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_defines.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_hw.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_i210.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_i210.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_ich8lan.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_ich8lan.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_mac.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_mac.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_manage.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_manage.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_mbx.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_mbx.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_nvm.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_nvm.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_osdep.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_osdep.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_phy.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_phy.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_regs.h (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_vf.c (100%)
 rename drivers/net/{ => intel}/e1000/base/e1000_vf.h (100%)
 rename drivers/net/{ => intel}/e1000/base/meson.build (100%)
 rename drivers/net/{ => intel}/e1000/e1000_ethdev.h (100%)
 rename drivers/net/{ => intel}/e1000/e1000_logs.c (100%)
 rename drivers/net/{ => intel}/e1000/e1000_logs.h (100%)
 rename drivers/net/{ => intel}/e1000/em_ethdev.c (100%)
 rename drivers/net/{ => intel}/e1000/em_rxtx.c (100%)
 rename drivers/net/{ => intel}/e1000/igb_ethdev.c (100%)
 rename drivers/net/{ => intel}/e1000/igb_flow.c (100%)
 rename drivers/net/{ => intel}/e1000/igb_pf.c (100%)
 rename drivers/net/{ => intel}/e1000/igb_regs.h (100%)
 rename drivers/net/{ => intel}/e1000/igb_rxtx.c (100%)
 rename drivers/net/{ => intel}/e1000/meson.build (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_api.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_api.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_common.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_common.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_mbx.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_mbx.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_osdep.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_pf.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_pf.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_tlv.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_tlv.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_type.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_vf.c (100%)
 rename drivers/net/{ => intel}/fm10k/base/fm10k_vf.h (100%)
 rename drivers/net/{ => intel}/fm10k/base/meson.build (100%)
 rename drivers/net/{ => intel}/fm10k/fm10k.h (100%)
 rename drivers/net/{ => intel}/fm10k/fm10k_ethdev.c (100%)
 rename drivers/net/{ => intel}/fm10k/fm10k_logs.h (100%)
 rename drivers/net/{ => intel}/fm10k/fm10k_rxtx.c (100%)
 rename drivers/net/{ => intel}/fm10k/fm10k_rxtx_vec.c (100%)
 rename drivers/net/{ => intel}/fm10k/meson.build (100%)
 rename drivers/net/{ => intel}/i40e/base/README (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_adminq.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_adminq.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_adminq_cmd.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_alloc.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_common.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_dcb.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_dcb.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_devids.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_diag.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_diag.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_hmc.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_hmc.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_lan_hmc.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_lan_hmc.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_nvm.c (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_osdep.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_prototype.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_register.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_status.h (100%)
 rename drivers/net/{ => intel}/i40e/base/i40e_type.h (100%)
 rename drivers/net/{ => intel}/i40e/base/meson.build (100%)
 rename drivers/net/{ => intel}/i40e/base/virtchnl.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_ethdev.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_ethdev.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_fdir.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_flow.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_hash.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_hash.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_logs.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_pf.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_pf.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_recycle_mbufs_vec_common.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_regs.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_common_avx.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_altivec.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_avx2.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_avx512.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_neon.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_rxtx_vec_sse.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_testpmd.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_tm.c (100%)
 rename drivers/net/{ => intel}/i40e/i40e_vf_representor.c (100%)
 rename drivers/net/{ => intel}/i40e/meson.build (100%)
 rename drivers/net/{ => intel}/i40e/rte_pmd_i40e.c (100%)
 rename drivers/net/{ => intel}/i40e/rte_pmd_i40e.h (100%)
 rename drivers/net/{ => intel}/i40e/version.map (100%)
 rename drivers/net/{ => intel}/iavf/iavf.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_ethdev.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_fdir.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_fsub.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_generic_flow.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_generic_flow.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_hash.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_ipsec_crypto.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_ipsec_crypto.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_ipsec_crypto_capabilities.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_log.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx_vec_avx2.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx_vec_avx512.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx_vec_neon.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_rxtx_vec_sse.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_testpmd.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_tm.c (100%)
 rename drivers/net/{ => intel}/iavf/iavf_vchnl.c (100%)
 rename drivers/net/{ => intel}/iavf/meson.build (84%)
 rename drivers/net/{ => intel}/iavf/rte_pmd_iavf.h (100%)
 rename drivers/net/{ => intel}/iavf/version.map (100%)
 rename drivers/net/{ => intel}/ice/base/README (100%)
 rename drivers/net/{ => intel}/ice/base/ice_acl.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_acl.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_acl_ctrl.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_adminq_cmd.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_alloc.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_bitops.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_bst_tcam.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_bst_tcam.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_cgu_regs.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_common.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_common.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_controlq.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_controlq.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_dcb.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_dcb.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ddp.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ddp.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_defs.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_devids.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_fdir.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_fdir.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flex_pipe.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flex_pipe.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flex_type.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flg_rd.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flg_rd.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flow.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_flow.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_fwlog.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_fwlog.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_hw_autogen.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_imem.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_imem.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_lan_tx_rx.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_metainit.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_metainit.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_mk_grp.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_mk_grp.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_nvm.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_nvm.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_osdep.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_parser.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_parser.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_parser_rt.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_parser_rt.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_parser_util.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_pg_cam.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_pg_cam.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_phy_regs.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_proto_grp.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_proto_grp.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_protocol_type.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ptp_consts.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ptp_hw.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ptp_hw.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ptype_mk.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_ptype_mk.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_sbq_cmd.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_sched.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_sched.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_status.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_switch.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_switch.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_tmatch.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_type.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_vf_mbx.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_vf_mbx.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_vlan_mode.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_vlan_mode.h (100%)
 rename drivers/net/{ => intel}/ice/base/ice_xlt_kb.c (100%)
 rename drivers/net/{ => intel}/ice/base/ice_xlt_kb.h (100%)
 rename drivers/net/{ => intel}/ice/base/meson.build (100%)
 rename drivers/net/{ => intel}/ice/ice_acl_filter.c (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf.c (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf.h (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf_ethdev.c (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf_ethdev.h (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf_parent.c (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf_sched.c (100%)
 rename drivers/net/{ => intel}/ice/ice_dcf_vf_representor.c (100%)
 rename drivers/net/{ => intel}/ice/ice_diagnose.c (100%)
 rename drivers/net/{ => intel}/ice/ice_ethdev.c (100%)
 rename drivers/net/{ => intel}/ice/ice_ethdev.h (100%)
 rename drivers/net/{ => intel}/ice/ice_fdir_filter.c (100%)
 rename drivers/net/{ => intel}/ice/ice_generic_flow.c (100%)
 rename drivers/net/{ => intel}/ice/ice_generic_flow.h (100%)
 rename drivers/net/{ => intel}/ice/ice_hash.c (100%)
 rename drivers/net/{ => intel}/ice/ice_logs.h (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx.c (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx.h (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx_common_avx.h (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx_vec_avx2.c (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx_vec_avx512.c (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/ice/ice_rxtx_vec_sse.c (100%)
 rename drivers/net/{ => intel}/ice/ice_switch_filter.c (100%)
 rename drivers/net/{ => intel}/ice/ice_testpmd.c (100%)
 rename drivers/net/{ => intel}/ice/ice_tm.c (100%)
 rename drivers/net/{ => intel}/ice/meson.build (96%)
 rename drivers/net/{ => intel}/ice/version.map (100%)
 rename drivers/net/{ => intel}/idpf/idpf_ethdev.c (100%)
 rename drivers/net/{ => intel}/idpf/idpf_ethdev.h (100%)
 rename drivers/net/{ => intel}/idpf/idpf_logs.h (100%)
 rename drivers/net/{ => intel}/idpf/idpf_rxtx.c (100%)
 rename drivers/net/{ => intel}/idpf/idpf_rxtx.h (100%)
 rename drivers/net/{ => intel}/idpf/idpf_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/idpf/meson.build (100%)
 rename drivers/net/{ => intel}/igc/base/README (100%)
 rename drivers/net/{ => intel}/igc/base/igc_82571.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_82575.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_api.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_api.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_base.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_base.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_defines.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_hw.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_i225.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_i225.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_ich8lan.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_mac.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_mac.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_manage.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_manage.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_nvm.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_nvm.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_osdep.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_osdep.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_phy.c (100%)
 rename drivers/net/{ => intel}/igc/base/igc_phy.h (100%)
 rename drivers/net/{ => intel}/igc/base/igc_regs.h (100%)
 rename drivers/net/{ => intel}/igc/base/meson.build (100%)
 rename drivers/net/{ => intel}/igc/igc_ethdev.c (100%)
 rename drivers/net/{ => intel}/igc/igc_ethdev.h (100%)
 rename drivers/net/{ => intel}/igc/igc_filter.c (100%)
 rename drivers/net/{ => intel}/igc/igc_filter.h (100%)
 rename drivers/net/{ => intel}/igc/igc_flow.c (100%)
 rename drivers/net/{ => intel}/igc/igc_flow.h (100%)
 rename drivers/net/{ => intel}/igc/igc_logs.c (100%)
 rename drivers/net/{ => intel}/igc/igc_logs.h (100%)
 rename drivers/net/{ => intel}/igc/igc_txrx.c (100%)
 rename drivers/net/{ => intel}/igc/igc_txrx.h (100%)
 rename drivers/net/{ => intel}/igc/meson.build (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_ethdev.c (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_ethdev.h (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_flow.c (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_flow.h (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_logs.h (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_rawdev_api.h (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_representor.c (100%)
 rename drivers/net/{ => intel}/ipn3ke/ipn3ke_tm.c (100%)
 rename drivers/net/{ => intel}/ipn3ke/meson.build (91%)
 rename drivers/net/{ => intel}/ipn3ke/version.map (100%)
 rename drivers/net/{ => intel}/ixgbe/base/README (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_82598.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_82598.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_82599.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_82599.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_api.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_api.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_common.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_common.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb_82598.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb_82598.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb_82599.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_dcb_82599.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_e610.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_e610.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_hv_vf.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_hv_vf.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_mbx.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_mbx.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_osdep.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_osdep.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_phy.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_phy.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_type.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_type_e610.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_vf.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_vf.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_x540.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_x540.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_x550.c (100%)
 rename drivers/net/{ => intel}/ixgbe/base/ixgbe_x550.h (100%)
 rename drivers/net/{ => intel}/ixgbe/base/meson.build (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_82599_bypass.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_bypass.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_bypass.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_bypass_api.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_bypass_defines.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_ethdev.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_ethdev.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_fdir.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_flow.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_ipsec.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_ipsec.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_logs.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_pf.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_recycle_mbufs_vec_common.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_regs.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_rxtx.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_rxtx.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_rxtx_vec_common.h (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_rxtx_vec_neon.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_rxtx_vec_sse.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_testpmd.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_tm.c (100%)
 rename drivers/net/{ => intel}/ixgbe/ixgbe_vf_representor.c (100%)
 rename drivers/net/{ => intel}/ixgbe/meson.build (100%)
 rename drivers/net/{ => intel}/ixgbe/rte_pmd_ixgbe.c (100%)
 rename drivers/net/{ => intel}/ixgbe/rte_pmd_ixgbe.h (100%)
 rename drivers/net/{ => intel}/ixgbe/version.map (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index b86cdd266b..5f23ee558c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -779,7 +779,7 @@ F: doc/guides/nics/features/hinic.ini
 
 Intel e1000
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/e1000/
+F: drivers/net/intel/e1000/
 F: doc/guides/nics/e1000em.rst
 F: doc/guides/nics/intel_vf.rst
 F: doc/guides/nics/features/e1000.ini
@@ -789,7 +789,7 @@ Intel ixgbe
 M: Anatoly Burakov <anatoly.burakov@intel.com>
 M: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/ixgbe/
+F: drivers/net/intel/ixgbe/
 F: doc/guides/nics/ixgbe.rst
 F: doc/guides/nics/intel_vf.rst
 F: doc/guides/nics/features/ixgbe*.ini
@@ -798,14 +798,14 @@ Intel i40e
 M: Ian Stokes <ian.stokes@intel.com>
 M: Bruce Richardson <bruce.richardson@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/i40e/
+F: drivers/net/intel/i40e/
 F: doc/guides/nics/i40e.rst
 F: doc/guides/nics/intel_vf.rst
 F: doc/guides/nics/features/i40e*.ini
 
 Intel fm10k
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/fm10k/
+F: drivers/net/intel/fm10k/
 F: doc/guides/nics/fm10k.rst
 F: doc/guides/nics/features/fm10k*.ini
 
@@ -813,7 +813,7 @@ Intel iavf
 M: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
 M: Ian Stokes <ian.stokes@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/iavf/
+F: drivers/net/intel/iavf/
 F: drivers/common/iavf/
 F: doc/guides/nics/features/iavf*.ini
 
@@ -821,7 +821,7 @@ Intel ice
 M: Bruce Richardson <bruce.richardson@intel.com>
 M: Anatoly Burakov <anatoly.burakov@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/ice/
+F: drivers/net/intel/ice/
 F: doc/guides/nics/ice.rst
 F: doc/guides/nics/features/ice.ini
 
@@ -829,7 +829,7 @@ Intel idpf
 M: Jingjing Wu <jingjing.wu@intel.com>
 M: Praveen Shetty <praveen.shetty@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/idpf/
+F: drivers/net/intel/idpf/
 F: drivers/common/idpf/
 F: doc/guides/nics/idpf.rst
 F: doc/guides/nics/features/idpf.ini
@@ -837,20 +837,20 @@ F: doc/guides/nics/features/idpf.ini
 Intel cpfl - EXPERIMENTAL
 M: Praveen Shetty <praveen.shetty@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/cpfl/
+F: drivers/net/intel/cpfl/
 F: doc/guides/nics/cpfl.rst
 F: doc/guides/nics/features/cpfl.ini
 
 Intel igc
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/igc/
+F: drivers/net/intel/igc/
 F: doc/guides/nics/igc.rst
 F: doc/guides/nics/features/igc.ini
 
 Intel ipn3ke
 M: Rosen Xu <rosen.xu@intel.com>
 T: git://dpdk.org/next/dpdk-next-net-intel
-F: drivers/net/ipn3ke/
+F: drivers/net/intel/ipn3ke/
 F: doc/guides/nics/ipn3ke.rst
 F: doc/guides/nics/features/ipn3ke.ini
 
diff --git a/devtools/check-git-log.sh b/devtools/check-git-log.sh
index 2ee7f2db64..e51a82f172 100755
--- a/devtools/check-git-log.sh
+++ b/devtools/check-git-log.sh
@@ -80,6 +80,15 @@ bad=$(for commit in $commits ; do
 		continue
 	drv=$(echo "$files" | grep '^drivers/' | cut -d "/" -f 2,3 | sort -u)
 	drvgrp=$(echo "$drv" | cut -d "/" -f 1 | uniq)
+	if [ "$drv" = "net/intel" ] ; then
+		drvgrp=$drv
+		drv=$(echo "$files" | grep '^drivers/' | cut -d "/" -f 2,4 | sort -u)
+		if [ $(echo "$drv" | wc -l) -ne 1 ] ; then
+			drv='net/intel'
+		elif [ "$drv" = "net/common" ] ; then
+			drv='net/intel/common'
+		fi
+	fi
 	if [ $(echo "$drvgrp" | wc -l) -gt 1 ] ; then
 		echo "$headline" | grep -v '^drivers:'
 	elif [ $(echo "$drv" | wc -l) -gt 1 ] ; then
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index d23352d300..ebe9c211a9 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -19,9 +19,9 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/drivers/net/cnxk \
                           @TOPDIR@/drivers/net/dpaa \
                           @TOPDIR@/drivers/net/dpaa2 \
-                          @TOPDIR@/drivers/net/i40e \
-                          @TOPDIR@/drivers/net/iavf \
-                          @TOPDIR@/drivers/net/ixgbe \
+                          @TOPDIR@/drivers/net/intel/i40e \
+                          @TOPDIR@/drivers/net/intel/iavf \
+                          @TOPDIR@/drivers/net/intel/ixgbe \
                           @TOPDIR@/drivers/net/mlx5 \
                           @TOPDIR@/drivers/net/softnic \
                           @TOPDIR@/drivers/raw/dpaa2_cmdif \
diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index 5ec837346b..2603ef42d6 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -291,7 +291,7 @@ Runtime Configuration
 
     -a 0000:88:00.0,hw_debug_mask=0x80 --log-level=pmd.net.ice.driver:8
 
-  These ICE_DBG_XXX are defined in ``drivers/net/ice/base/ice_type.h``.
+  These ICE_DBG_XXX are defined in ``drivers/net/intel/ice/base/ice_type.h``.
 
 - ``1PPS out support``
 
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 85986ffa61..7b0d981c16 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -95,6 +95,13 @@ API Changes
 * eal: The ``__rte_packed`` macro for packing data is replaced with
   ``__rte_packed_begin`` / ``__rte_packed_end``.
 
+* build: The Intel networking drivers:
+  cpfl, e1000, fm10k, i40e, iavf, ice, idpf, igc, ipn3ke and ixgbe,
+  have been moved from ``drivers/net`` to a new ``drivers/net/intel`` directory.
+  The resulting build output, including the driver filenames, is the same,
+  but to enable/disable these drivers via meson option requires use of the new paths.
+  For example, ``-Denable_drivers=/net/i40e`` becomes ``-Denable_drivers=/net/intel/i40e``.
+
 
 ABI Changes
 -----------
diff --git a/drivers/meson.build b/drivers/meson.build
index 495e21b54a..89545e618e 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -47,7 +47,7 @@ enable_drivers = run_command(list_dir_globs, enable_drivers, check: true).stdout
 require_drivers = true
 if enable_drivers.length() == 0
     require_drivers = false
-    enable_drivers = run_command(list_dir_globs, '*/*', check: true).stdout().split()
+    enable_drivers = run_command(list_dir_globs, '*/*,*/*/*', check: true).stdout().split()
 endif
 
 # these drivers must always be enabled, otherwise the build breaks
@@ -143,6 +143,10 @@ foreach subpath:subdirs
         testpmd_sources = []
         require_iova_in_mbuf = true
 
+        if name.contains('/')
+            name = name.split('/')[1]
+        endif
+
         if not enable_drivers.contains(drv_path)
             build = false
             reason = 'not in enabled drivers build config'
diff --git a/drivers/net/cpfl/cpfl_actions.h b/drivers/net/intel/cpfl/cpfl_actions.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_actions.h
rename to drivers/net/intel/cpfl/cpfl_actions.h
diff --git a/drivers/net/cpfl/cpfl_controlq.c b/drivers/net/intel/cpfl/cpfl_controlq.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_controlq.c
rename to drivers/net/intel/cpfl/cpfl_controlq.c
diff --git a/drivers/net/cpfl/cpfl_controlq.h b/drivers/net/intel/cpfl/cpfl_controlq.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_controlq.h
rename to drivers/net/intel/cpfl/cpfl_controlq.h
diff --git a/drivers/net/cpfl/cpfl_cpchnl.h b/drivers/net/intel/cpfl/cpfl_cpchnl.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_cpchnl.h
rename to drivers/net/intel/cpfl/cpfl_cpchnl.h
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/intel/cpfl/cpfl_ethdev.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_ethdev.c
rename to drivers/net/intel/cpfl/cpfl_ethdev.c
diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/intel/cpfl/cpfl_ethdev.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_ethdev.h
rename to drivers/net/intel/cpfl/cpfl_ethdev.h
diff --git a/drivers/net/cpfl/cpfl_flow.c b/drivers/net/intel/cpfl/cpfl_flow.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_flow.c
rename to drivers/net/intel/cpfl/cpfl_flow.c
diff --git a/drivers/net/cpfl/cpfl_flow.h b/drivers/net/intel/cpfl/cpfl_flow.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_flow.h
rename to drivers/net/intel/cpfl/cpfl_flow.h
diff --git a/drivers/net/cpfl/cpfl_flow_engine_fxp.c b/drivers/net/intel/cpfl/cpfl_flow_engine_fxp.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_flow_engine_fxp.c
rename to drivers/net/intel/cpfl/cpfl_flow_engine_fxp.c
diff --git a/drivers/net/cpfl/cpfl_flow_parser.c b/drivers/net/intel/cpfl/cpfl_flow_parser.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_flow_parser.c
rename to drivers/net/intel/cpfl/cpfl_flow_parser.c
diff --git a/drivers/net/cpfl/cpfl_flow_parser.h b/drivers/net/intel/cpfl/cpfl_flow_parser.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_flow_parser.h
rename to drivers/net/intel/cpfl/cpfl_flow_parser.h
diff --git a/drivers/net/cpfl/cpfl_fxp_rule.c b/drivers/net/intel/cpfl/cpfl_fxp_rule.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_fxp_rule.c
rename to drivers/net/intel/cpfl/cpfl_fxp_rule.c
diff --git a/drivers/net/cpfl/cpfl_fxp_rule.h b/drivers/net/intel/cpfl/cpfl_fxp_rule.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_fxp_rule.h
rename to drivers/net/intel/cpfl/cpfl_fxp_rule.h
diff --git a/drivers/net/cpfl/cpfl_logs.h b/drivers/net/intel/cpfl/cpfl_logs.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_logs.h
rename to drivers/net/intel/cpfl/cpfl_logs.h
diff --git a/drivers/net/cpfl/cpfl_representor.c b/drivers/net/intel/cpfl/cpfl_representor.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_representor.c
rename to drivers/net/intel/cpfl/cpfl_representor.c
diff --git a/drivers/net/cpfl/cpfl_representor.h b/drivers/net/intel/cpfl/cpfl_representor.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_representor.h
rename to drivers/net/intel/cpfl/cpfl_representor.h
diff --git a/drivers/net/cpfl/cpfl_rules.c b/drivers/net/intel/cpfl/cpfl_rules.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_rules.c
rename to drivers/net/intel/cpfl/cpfl_rules.c
diff --git a/drivers/net/cpfl/cpfl_rules.h b/drivers/net/intel/cpfl/cpfl_rules.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_rules.h
rename to drivers/net/intel/cpfl/cpfl_rules.h
diff --git a/drivers/net/cpfl/cpfl_rxtx.c b/drivers/net/intel/cpfl/cpfl_rxtx.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_rxtx.c
rename to drivers/net/intel/cpfl/cpfl_rxtx.c
diff --git a/drivers/net/cpfl/cpfl_rxtx.h b/drivers/net/intel/cpfl/cpfl_rxtx.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_rxtx.h
rename to drivers/net/intel/cpfl/cpfl_rxtx.h
diff --git a/drivers/net/cpfl/cpfl_rxtx_vec_common.h b/drivers/net/intel/cpfl/cpfl_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/cpfl/cpfl_rxtx_vec_common.h
rename to drivers/net/intel/cpfl/cpfl_rxtx_vec_common.h
diff --git a/drivers/net/cpfl/cpfl_vchnl.c b/drivers/net/intel/cpfl/cpfl_vchnl.c
similarity index 100%
rename from drivers/net/cpfl/cpfl_vchnl.c
rename to drivers/net/intel/cpfl/cpfl_vchnl.c
diff --git a/drivers/net/cpfl/meson.build b/drivers/net/intel/cpfl/meson.build
similarity index 100%
rename from drivers/net/cpfl/meson.build
rename to drivers/net/intel/cpfl/meson.build
diff --git a/drivers/net/e1000/base/README b/drivers/net/intel/e1000/base/README
similarity index 100%
rename from drivers/net/e1000/base/README
rename to drivers/net/intel/e1000/base/README
diff --git a/drivers/net/e1000/base/e1000_80003es2lan.c b/drivers/net/intel/e1000/base/e1000_80003es2lan.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_80003es2lan.c
rename to drivers/net/intel/e1000/base/e1000_80003es2lan.c
diff --git a/drivers/net/e1000/base/e1000_80003es2lan.h b/drivers/net/intel/e1000/base/e1000_80003es2lan.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_80003es2lan.h
rename to drivers/net/intel/e1000/base/e1000_80003es2lan.h
diff --git a/drivers/net/e1000/base/e1000_82540.c b/drivers/net/intel/e1000/base/e1000_82540.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82540.c
rename to drivers/net/intel/e1000/base/e1000_82540.c
diff --git a/drivers/net/e1000/base/e1000_82541.c b/drivers/net/intel/e1000/base/e1000_82541.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82541.c
rename to drivers/net/intel/e1000/base/e1000_82541.c
diff --git a/drivers/net/e1000/base/e1000_82541.h b/drivers/net/intel/e1000/base/e1000_82541.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_82541.h
rename to drivers/net/intel/e1000/base/e1000_82541.h
diff --git a/drivers/net/e1000/base/e1000_82542.c b/drivers/net/intel/e1000/base/e1000_82542.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82542.c
rename to drivers/net/intel/e1000/base/e1000_82542.c
diff --git a/drivers/net/e1000/base/e1000_82543.c b/drivers/net/intel/e1000/base/e1000_82543.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82543.c
rename to drivers/net/intel/e1000/base/e1000_82543.c
diff --git a/drivers/net/e1000/base/e1000_82543.h b/drivers/net/intel/e1000/base/e1000_82543.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_82543.h
rename to drivers/net/intel/e1000/base/e1000_82543.h
diff --git a/drivers/net/e1000/base/e1000_82571.c b/drivers/net/intel/e1000/base/e1000_82571.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82571.c
rename to drivers/net/intel/e1000/base/e1000_82571.c
diff --git a/drivers/net/e1000/base/e1000_82571.h b/drivers/net/intel/e1000/base/e1000_82571.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_82571.h
rename to drivers/net/intel/e1000/base/e1000_82571.h
diff --git a/drivers/net/e1000/base/e1000_82575.c b/drivers/net/intel/e1000/base/e1000_82575.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_82575.c
rename to drivers/net/intel/e1000/base/e1000_82575.c
diff --git a/drivers/net/e1000/base/e1000_82575.h b/drivers/net/intel/e1000/base/e1000_82575.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_82575.h
rename to drivers/net/intel/e1000/base/e1000_82575.h
diff --git a/drivers/net/e1000/base/e1000_api.c b/drivers/net/intel/e1000/base/e1000_api.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_api.c
rename to drivers/net/intel/e1000/base/e1000_api.c
diff --git a/drivers/net/e1000/base/e1000_api.h b/drivers/net/intel/e1000/base/e1000_api.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_api.h
rename to drivers/net/intel/e1000/base/e1000_api.h
diff --git a/drivers/net/e1000/base/e1000_base.c b/drivers/net/intel/e1000/base/e1000_base.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_base.c
rename to drivers/net/intel/e1000/base/e1000_base.c
diff --git a/drivers/net/e1000/base/e1000_base.h b/drivers/net/intel/e1000/base/e1000_base.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_base.h
rename to drivers/net/intel/e1000/base/e1000_base.h
diff --git a/drivers/net/e1000/base/e1000_defines.h b/drivers/net/intel/e1000/base/e1000_defines.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_defines.h
rename to drivers/net/intel/e1000/base/e1000_defines.h
diff --git a/drivers/net/e1000/base/e1000_hw.h b/drivers/net/intel/e1000/base/e1000_hw.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_hw.h
rename to drivers/net/intel/e1000/base/e1000_hw.h
diff --git a/drivers/net/e1000/base/e1000_i210.c b/drivers/net/intel/e1000/base/e1000_i210.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_i210.c
rename to drivers/net/intel/e1000/base/e1000_i210.c
diff --git a/drivers/net/e1000/base/e1000_i210.h b/drivers/net/intel/e1000/base/e1000_i210.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_i210.h
rename to drivers/net/intel/e1000/base/e1000_i210.h
diff --git a/drivers/net/e1000/base/e1000_ich8lan.c b/drivers/net/intel/e1000/base/e1000_ich8lan.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_ich8lan.c
rename to drivers/net/intel/e1000/base/e1000_ich8lan.c
diff --git a/drivers/net/e1000/base/e1000_ich8lan.h b/drivers/net/intel/e1000/base/e1000_ich8lan.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_ich8lan.h
rename to drivers/net/intel/e1000/base/e1000_ich8lan.h
diff --git a/drivers/net/e1000/base/e1000_mac.c b/drivers/net/intel/e1000/base/e1000_mac.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_mac.c
rename to drivers/net/intel/e1000/base/e1000_mac.c
diff --git a/drivers/net/e1000/base/e1000_mac.h b/drivers/net/intel/e1000/base/e1000_mac.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_mac.h
rename to drivers/net/intel/e1000/base/e1000_mac.h
diff --git a/drivers/net/e1000/base/e1000_manage.c b/drivers/net/intel/e1000/base/e1000_manage.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_manage.c
rename to drivers/net/intel/e1000/base/e1000_manage.c
diff --git a/drivers/net/e1000/base/e1000_manage.h b/drivers/net/intel/e1000/base/e1000_manage.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_manage.h
rename to drivers/net/intel/e1000/base/e1000_manage.h
diff --git a/drivers/net/e1000/base/e1000_mbx.c b/drivers/net/intel/e1000/base/e1000_mbx.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_mbx.c
rename to drivers/net/intel/e1000/base/e1000_mbx.c
diff --git a/drivers/net/e1000/base/e1000_mbx.h b/drivers/net/intel/e1000/base/e1000_mbx.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_mbx.h
rename to drivers/net/intel/e1000/base/e1000_mbx.h
diff --git a/drivers/net/e1000/base/e1000_nvm.c b/drivers/net/intel/e1000/base/e1000_nvm.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_nvm.c
rename to drivers/net/intel/e1000/base/e1000_nvm.c
diff --git a/drivers/net/e1000/base/e1000_nvm.h b/drivers/net/intel/e1000/base/e1000_nvm.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_nvm.h
rename to drivers/net/intel/e1000/base/e1000_nvm.h
diff --git a/drivers/net/e1000/base/e1000_osdep.c b/drivers/net/intel/e1000/base/e1000_osdep.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_osdep.c
rename to drivers/net/intel/e1000/base/e1000_osdep.c
diff --git a/drivers/net/e1000/base/e1000_osdep.h b/drivers/net/intel/e1000/base/e1000_osdep.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_osdep.h
rename to drivers/net/intel/e1000/base/e1000_osdep.h
diff --git a/drivers/net/e1000/base/e1000_phy.c b/drivers/net/intel/e1000/base/e1000_phy.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_phy.c
rename to drivers/net/intel/e1000/base/e1000_phy.c
diff --git a/drivers/net/e1000/base/e1000_phy.h b/drivers/net/intel/e1000/base/e1000_phy.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_phy.h
rename to drivers/net/intel/e1000/base/e1000_phy.h
diff --git a/drivers/net/e1000/base/e1000_regs.h b/drivers/net/intel/e1000/base/e1000_regs.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_regs.h
rename to drivers/net/intel/e1000/base/e1000_regs.h
diff --git a/drivers/net/e1000/base/e1000_vf.c b/drivers/net/intel/e1000/base/e1000_vf.c
similarity index 100%
rename from drivers/net/e1000/base/e1000_vf.c
rename to drivers/net/intel/e1000/base/e1000_vf.c
diff --git a/drivers/net/e1000/base/e1000_vf.h b/drivers/net/intel/e1000/base/e1000_vf.h
similarity index 100%
rename from drivers/net/e1000/base/e1000_vf.h
rename to drivers/net/intel/e1000/base/e1000_vf.h
diff --git a/drivers/net/e1000/base/meson.build b/drivers/net/intel/e1000/base/meson.build
similarity index 100%
rename from drivers/net/e1000/base/meson.build
rename to drivers/net/intel/e1000/base/meson.build
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/intel/e1000/e1000_ethdev.h
similarity index 100%
rename from drivers/net/e1000/e1000_ethdev.h
rename to drivers/net/intel/e1000/e1000_ethdev.h
diff --git a/drivers/net/e1000/e1000_logs.c b/drivers/net/intel/e1000/e1000_logs.c
similarity index 100%
rename from drivers/net/e1000/e1000_logs.c
rename to drivers/net/intel/e1000/e1000_logs.c
diff --git a/drivers/net/e1000/e1000_logs.h b/drivers/net/intel/e1000/e1000_logs.h
similarity index 100%
rename from drivers/net/e1000/e1000_logs.h
rename to drivers/net/intel/e1000/e1000_logs.h
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/intel/e1000/em_ethdev.c
similarity index 100%
rename from drivers/net/e1000/em_ethdev.c
rename to drivers/net/intel/e1000/em_ethdev.c
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/intel/e1000/em_rxtx.c
similarity index 100%
rename from drivers/net/e1000/em_rxtx.c
rename to drivers/net/intel/e1000/em_rxtx.c
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/intel/e1000/igb_ethdev.c
similarity index 100%
rename from drivers/net/e1000/igb_ethdev.c
rename to drivers/net/intel/e1000/igb_ethdev.c
diff --git a/drivers/net/e1000/igb_flow.c b/drivers/net/intel/e1000/igb_flow.c
similarity index 100%
rename from drivers/net/e1000/igb_flow.c
rename to drivers/net/intel/e1000/igb_flow.c
diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/intel/e1000/igb_pf.c
similarity index 100%
rename from drivers/net/e1000/igb_pf.c
rename to drivers/net/intel/e1000/igb_pf.c
diff --git a/drivers/net/e1000/igb_regs.h b/drivers/net/intel/e1000/igb_regs.h
similarity index 100%
rename from drivers/net/e1000/igb_regs.h
rename to drivers/net/intel/e1000/igb_regs.h
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/intel/e1000/igb_rxtx.c
similarity index 100%
rename from drivers/net/e1000/igb_rxtx.c
rename to drivers/net/intel/e1000/igb_rxtx.c
diff --git a/drivers/net/e1000/meson.build b/drivers/net/intel/e1000/meson.build
similarity index 100%
rename from drivers/net/e1000/meson.build
rename to drivers/net/intel/e1000/meson.build
diff --git a/drivers/net/fm10k/base/fm10k_api.c b/drivers/net/intel/fm10k/base/fm10k_api.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_api.c
rename to drivers/net/intel/fm10k/base/fm10k_api.c
diff --git a/drivers/net/fm10k/base/fm10k_api.h b/drivers/net/intel/fm10k/base/fm10k_api.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_api.h
rename to drivers/net/intel/fm10k/base/fm10k_api.h
diff --git a/drivers/net/fm10k/base/fm10k_common.c b/drivers/net/intel/fm10k/base/fm10k_common.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_common.c
rename to drivers/net/intel/fm10k/base/fm10k_common.c
diff --git a/drivers/net/fm10k/base/fm10k_common.h b/drivers/net/intel/fm10k/base/fm10k_common.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_common.h
rename to drivers/net/intel/fm10k/base/fm10k_common.h
diff --git a/drivers/net/fm10k/base/fm10k_mbx.c b/drivers/net/intel/fm10k/base/fm10k_mbx.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_mbx.c
rename to drivers/net/intel/fm10k/base/fm10k_mbx.c
diff --git a/drivers/net/fm10k/base/fm10k_mbx.h b/drivers/net/intel/fm10k/base/fm10k_mbx.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_mbx.h
rename to drivers/net/intel/fm10k/base/fm10k_mbx.h
diff --git a/drivers/net/fm10k/base/fm10k_osdep.h b/drivers/net/intel/fm10k/base/fm10k_osdep.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_osdep.h
rename to drivers/net/intel/fm10k/base/fm10k_osdep.h
diff --git a/drivers/net/fm10k/base/fm10k_pf.c b/drivers/net/intel/fm10k/base/fm10k_pf.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_pf.c
rename to drivers/net/intel/fm10k/base/fm10k_pf.c
diff --git a/drivers/net/fm10k/base/fm10k_pf.h b/drivers/net/intel/fm10k/base/fm10k_pf.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_pf.h
rename to drivers/net/intel/fm10k/base/fm10k_pf.h
diff --git a/drivers/net/fm10k/base/fm10k_tlv.c b/drivers/net/intel/fm10k/base/fm10k_tlv.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_tlv.c
rename to drivers/net/intel/fm10k/base/fm10k_tlv.c
diff --git a/drivers/net/fm10k/base/fm10k_tlv.h b/drivers/net/intel/fm10k/base/fm10k_tlv.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_tlv.h
rename to drivers/net/intel/fm10k/base/fm10k_tlv.h
diff --git a/drivers/net/fm10k/base/fm10k_type.h b/drivers/net/intel/fm10k/base/fm10k_type.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_type.h
rename to drivers/net/intel/fm10k/base/fm10k_type.h
diff --git a/drivers/net/fm10k/base/fm10k_vf.c b/drivers/net/intel/fm10k/base/fm10k_vf.c
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_vf.c
rename to drivers/net/intel/fm10k/base/fm10k_vf.c
diff --git a/drivers/net/fm10k/base/fm10k_vf.h b/drivers/net/intel/fm10k/base/fm10k_vf.h
similarity index 100%
rename from drivers/net/fm10k/base/fm10k_vf.h
rename to drivers/net/intel/fm10k/base/fm10k_vf.h
diff --git a/drivers/net/fm10k/base/meson.build b/drivers/net/intel/fm10k/base/meson.build
similarity index 100%
rename from drivers/net/fm10k/base/meson.build
rename to drivers/net/intel/fm10k/base/meson.build
diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/intel/fm10k/fm10k.h
similarity index 100%
rename from drivers/net/fm10k/fm10k.h
rename to drivers/net/intel/fm10k/fm10k.h
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/intel/fm10k/fm10k_ethdev.c
similarity index 100%
rename from drivers/net/fm10k/fm10k_ethdev.c
rename to drivers/net/intel/fm10k/fm10k_ethdev.c
diff --git a/drivers/net/fm10k/fm10k_logs.h b/drivers/net/intel/fm10k/fm10k_logs.h
similarity index 100%
rename from drivers/net/fm10k/fm10k_logs.h
rename to drivers/net/intel/fm10k/fm10k_logs.h
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/intel/fm10k/fm10k_rxtx.c
similarity index 100%
rename from drivers/net/fm10k/fm10k_rxtx.c
rename to drivers/net/intel/fm10k/fm10k_rxtx.c
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/intel/fm10k/fm10k_rxtx_vec.c
similarity index 100%
rename from drivers/net/fm10k/fm10k_rxtx_vec.c
rename to drivers/net/intel/fm10k/fm10k_rxtx_vec.c
diff --git a/drivers/net/fm10k/meson.build b/drivers/net/intel/fm10k/meson.build
similarity index 100%
rename from drivers/net/fm10k/meson.build
rename to drivers/net/intel/fm10k/meson.build
diff --git a/drivers/net/i40e/base/README b/drivers/net/intel/i40e/base/README
similarity index 100%
rename from drivers/net/i40e/base/README
rename to drivers/net/intel/i40e/base/README
diff --git a/drivers/net/i40e/base/i40e_adminq.c b/drivers/net/intel/i40e/base/i40e_adminq.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_adminq.c
rename to drivers/net/intel/i40e/base/i40e_adminq.c
diff --git a/drivers/net/i40e/base/i40e_adminq.h b/drivers/net/intel/i40e/base/i40e_adminq.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_adminq.h
rename to drivers/net/intel/i40e/base/i40e_adminq.h
diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/intel/i40e/base/i40e_adminq_cmd.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_adminq_cmd.h
rename to drivers/net/intel/i40e/base/i40e_adminq_cmd.h
diff --git a/drivers/net/i40e/base/i40e_alloc.h b/drivers/net/intel/i40e/base/i40e_alloc.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_alloc.h
rename to drivers/net/intel/i40e/base/i40e_alloc.h
diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/intel/i40e/base/i40e_common.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_common.c
rename to drivers/net/intel/i40e/base/i40e_common.c
diff --git a/drivers/net/i40e/base/i40e_dcb.c b/drivers/net/intel/i40e/base/i40e_dcb.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_dcb.c
rename to drivers/net/intel/i40e/base/i40e_dcb.c
diff --git a/drivers/net/i40e/base/i40e_dcb.h b/drivers/net/intel/i40e/base/i40e_dcb.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_dcb.h
rename to drivers/net/intel/i40e/base/i40e_dcb.h
diff --git a/drivers/net/i40e/base/i40e_devids.h b/drivers/net/intel/i40e/base/i40e_devids.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_devids.h
rename to drivers/net/intel/i40e/base/i40e_devids.h
diff --git a/drivers/net/i40e/base/i40e_diag.c b/drivers/net/intel/i40e/base/i40e_diag.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_diag.c
rename to drivers/net/intel/i40e/base/i40e_diag.c
diff --git a/drivers/net/i40e/base/i40e_diag.h b/drivers/net/intel/i40e/base/i40e_diag.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_diag.h
rename to drivers/net/intel/i40e/base/i40e_diag.h
diff --git a/drivers/net/i40e/base/i40e_hmc.c b/drivers/net/intel/i40e/base/i40e_hmc.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_hmc.c
rename to drivers/net/intel/i40e/base/i40e_hmc.c
diff --git a/drivers/net/i40e/base/i40e_hmc.h b/drivers/net/intel/i40e/base/i40e_hmc.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_hmc.h
rename to drivers/net/intel/i40e/base/i40e_hmc.h
diff --git a/drivers/net/i40e/base/i40e_lan_hmc.c b/drivers/net/intel/i40e/base/i40e_lan_hmc.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_lan_hmc.c
rename to drivers/net/intel/i40e/base/i40e_lan_hmc.c
diff --git a/drivers/net/i40e/base/i40e_lan_hmc.h b/drivers/net/intel/i40e/base/i40e_lan_hmc.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_lan_hmc.h
rename to drivers/net/intel/i40e/base/i40e_lan_hmc.h
diff --git a/drivers/net/i40e/base/i40e_nvm.c b/drivers/net/intel/i40e/base/i40e_nvm.c
similarity index 100%
rename from drivers/net/i40e/base/i40e_nvm.c
rename to drivers/net/intel/i40e/base/i40e_nvm.c
diff --git a/drivers/net/i40e/base/i40e_osdep.h b/drivers/net/intel/i40e/base/i40e_osdep.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_osdep.h
rename to drivers/net/intel/i40e/base/i40e_osdep.h
diff --git a/drivers/net/i40e/base/i40e_prototype.h b/drivers/net/intel/i40e/base/i40e_prototype.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_prototype.h
rename to drivers/net/intel/i40e/base/i40e_prototype.h
diff --git a/drivers/net/i40e/base/i40e_register.h b/drivers/net/intel/i40e/base/i40e_register.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_register.h
rename to drivers/net/intel/i40e/base/i40e_register.h
diff --git a/drivers/net/i40e/base/i40e_status.h b/drivers/net/intel/i40e/base/i40e_status.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_status.h
rename to drivers/net/intel/i40e/base/i40e_status.h
diff --git a/drivers/net/i40e/base/i40e_type.h b/drivers/net/intel/i40e/base/i40e_type.h
similarity index 100%
rename from drivers/net/i40e/base/i40e_type.h
rename to drivers/net/intel/i40e/base/i40e_type.h
diff --git a/drivers/net/i40e/base/meson.build b/drivers/net/intel/i40e/base/meson.build
similarity index 100%
rename from drivers/net/i40e/base/meson.build
rename to drivers/net/intel/i40e/base/meson.build
diff --git a/drivers/net/i40e/base/virtchnl.h b/drivers/net/intel/i40e/base/virtchnl.h
similarity index 100%
rename from drivers/net/i40e/base/virtchnl.h
rename to drivers/net/intel/i40e/base/virtchnl.h
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/intel/i40e/i40e_ethdev.c
similarity index 100%
rename from drivers/net/i40e/i40e_ethdev.c
rename to drivers/net/intel/i40e/i40e_ethdev.c
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/intel/i40e/i40e_ethdev.h
similarity index 100%
rename from drivers/net/i40e/i40e_ethdev.h
rename to drivers/net/intel/i40e/i40e_ethdev.h
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/intel/i40e/i40e_fdir.c
similarity index 100%
rename from drivers/net/i40e/i40e_fdir.c
rename to drivers/net/intel/i40e/i40e_fdir.c
diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/intel/i40e/i40e_flow.c
similarity index 100%
rename from drivers/net/i40e/i40e_flow.c
rename to drivers/net/intel/i40e/i40e_flow.c
diff --git a/drivers/net/i40e/i40e_hash.c b/drivers/net/intel/i40e/i40e_hash.c
similarity index 100%
rename from drivers/net/i40e/i40e_hash.c
rename to drivers/net/intel/i40e/i40e_hash.c
diff --git a/drivers/net/i40e/i40e_hash.h b/drivers/net/intel/i40e/i40e_hash.h
similarity index 100%
rename from drivers/net/i40e/i40e_hash.h
rename to drivers/net/intel/i40e/i40e_hash.h
diff --git a/drivers/net/i40e/i40e_logs.h b/drivers/net/intel/i40e/i40e_logs.h
similarity index 100%
rename from drivers/net/i40e/i40e_logs.h
rename to drivers/net/intel/i40e/i40e_logs.h
diff --git a/drivers/net/i40e/i40e_pf.c b/drivers/net/intel/i40e/i40e_pf.c
similarity index 100%
rename from drivers/net/i40e/i40e_pf.c
rename to drivers/net/intel/i40e/i40e_pf.c
diff --git a/drivers/net/i40e/i40e_pf.h b/drivers/net/intel/i40e/i40e_pf.h
similarity index 100%
rename from drivers/net/i40e/i40e_pf.h
rename to drivers/net/intel/i40e/i40e_pf.h
diff --git a/drivers/net/i40e/i40e_recycle_mbufs_vec_common.c b/drivers/net/intel/i40e/i40e_recycle_mbufs_vec_common.c
similarity index 100%
rename from drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
rename to drivers/net/intel/i40e/i40e_recycle_mbufs_vec_common.c
diff --git a/drivers/net/i40e/i40e_regs.h b/drivers/net/intel/i40e/i40e_regs.h
similarity index 100%
rename from drivers/net/i40e/i40e_regs.h
rename to drivers/net/intel/i40e/i40e_regs.h
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/intel/i40e/i40e_rxtx.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx.c
rename to drivers/net/intel/i40e/i40e_rxtx.c
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/intel/i40e/i40e_rxtx.h
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx.h
rename to drivers/net/intel/i40e/i40e_rxtx.h
diff --git a/drivers/net/i40e/i40e_rxtx_common_avx.h b/drivers/net/intel/i40e/i40e_rxtx_common_avx.h
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_common_avx.h
rename to drivers/net/intel/i40e/i40e_rxtx_common_avx.h
diff --git a/drivers/net/i40e/i40e_rxtx_vec_altivec.c b/drivers/net/intel/i40e/i40e_rxtx_vec_altivec.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_altivec.c
rename to drivers/net/intel/i40e/i40e_rxtx_vec_altivec.c
diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx2.c b/drivers/net/intel/i40e/i40e_rxtx_vec_avx2.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_avx2.c
rename to drivers/net/intel/i40e/i40e_rxtx_vec_avx2.c
diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c b/drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_avx512.c
rename to drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/intel/i40e/i40e_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_common.h
rename to drivers/net/intel/i40e/i40e_rxtx_vec_common.h
diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/intel/i40e/i40e_rxtx_vec_neon.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_neon.c
rename to drivers/net/intel/i40e/i40e_rxtx_vec_neon.c
diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c b/drivers/net/intel/i40e/i40e_rxtx_vec_sse.c
similarity index 100%
rename from drivers/net/i40e/i40e_rxtx_vec_sse.c
rename to drivers/net/intel/i40e/i40e_rxtx_vec_sse.c
diff --git a/drivers/net/i40e/i40e_testpmd.c b/drivers/net/intel/i40e/i40e_testpmd.c
similarity index 100%
rename from drivers/net/i40e/i40e_testpmd.c
rename to drivers/net/intel/i40e/i40e_testpmd.c
diff --git a/drivers/net/i40e/i40e_tm.c b/drivers/net/intel/i40e/i40e_tm.c
similarity index 100%
rename from drivers/net/i40e/i40e_tm.c
rename to drivers/net/intel/i40e/i40e_tm.c
diff --git a/drivers/net/i40e/i40e_vf_representor.c b/drivers/net/intel/i40e/i40e_vf_representor.c
similarity index 100%
rename from drivers/net/i40e/i40e_vf_representor.c
rename to drivers/net/intel/i40e/i40e_vf_representor.c
diff --git a/drivers/net/i40e/meson.build b/drivers/net/intel/i40e/meson.build
similarity index 100%
rename from drivers/net/i40e/meson.build
rename to drivers/net/intel/i40e/meson.build
diff --git a/drivers/net/i40e/rte_pmd_i40e.c b/drivers/net/intel/i40e/rte_pmd_i40e.c
similarity index 100%
rename from drivers/net/i40e/rte_pmd_i40e.c
rename to drivers/net/intel/i40e/rte_pmd_i40e.c
diff --git a/drivers/net/i40e/rte_pmd_i40e.h b/drivers/net/intel/i40e/rte_pmd_i40e.h
similarity index 100%
rename from drivers/net/i40e/rte_pmd_i40e.h
rename to drivers/net/intel/i40e/rte_pmd_i40e.h
diff --git a/drivers/net/i40e/version.map b/drivers/net/intel/i40e/version.map
similarity index 100%
rename from drivers/net/i40e/version.map
rename to drivers/net/intel/i40e/version.map
diff --git a/drivers/net/iavf/iavf.h b/drivers/net/intel/iavf/iavf.h
similarity index 100%
rename from drivers/net/iavf/iavf.h
rename to drivers/net/intel/iavf/iavf.h
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/intel/iavf/iavf_ethdev.c
similarity index 100%
rename from drivers/net/iavf/iavf_ethdev.c
rename to drivers/net/intel/iavf/iavf_ethdev.c
diff --git a/drivers/net/iavf/iavf_fdir.c b/drivers/net/intel/iavf/iavf_fdir.c
similarity index 100%
rename from drivers/net/iavf/iavf_fdir.c
rename to drivers/net/intel/iavf/iavf_fdir.c
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/intel/iavf/iavf_fsub.c
similarity index 100%
rename from drivers/net/iavf/iavf_fsub.c
rename to drivers/net/intel/iavf/iavf_fsub.c
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/intel/iavf/iavf_generic_flow.c
similarity index 100%
rename from drivers/net/iavf/iavf_generic_flow.c
rename to drivers/net/intel/iavf/iavf_generic_flow.c
diff --git a/drivers/net/iavf/iavf_generic_flow.h b/drivers/net/intel/iavf/iavf_generic_flow.h
similarity index 100%
rename from drivers/net/iavf/iavf_generic_flow.h
rename to drivers/net/intel/iavf/iavf_generic_flow.h
diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/intel/iavf/iavf_hash.c
similarity index 100%
rename from drivers/net/iavf/iavf_hash.c
rename to drivers/net/intel/iavf/iavf_hash.c
diff --git a/drivers/net/iavf/iavf_ipsec_crypto.c b/drivers/net/intel/iavf/iavf_ipsec_crypto.c
similarity index 100%
rename from drivers/net/iavf/iavf_ipsec_crypto.c
rename to drivers/net/intel/iavf/iavf_ipsec_crypto.c
diff --git a/drivers/net/iavf/iavf_ipsec_crypto.h b/drivers/net/intel/iavf/iavf_ipsec_crypto.h
similarity index 100%
rename from drivers/net/iavf/iavf_ipsec_crypto.h
rename to drivers/net/intel/iavf/iavf_ipsec_crypto.h
diff --git a/drivers/net/iavf/iavf_ipsec_crypto_capabilities.h b/drivers/net/intel/iavf/iavf_ipsec_crypto_capabilities.h
similarity index 100%
rename from drivers/net/iavf/iavf_ipsec_crypto_capabilities.h
rename to drivers/net/intel/iavf/iavf_ipsec_crypto_capabilities.h
diff --git a/drivers/net/iavf/iavf_log.h b/drivers/net/intel/iavf/iavf_log.h
similarity index 100%
rename from drivers/net/iavf/iavf_log.h
rename to drivers/net/intel/iavf/iavf_log.h
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/intel/iavf/iavf_rxtx.c
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx.c
rename to drivers/net/intel/iavf/iavf_rxtx.c
diff --git a/drivers/net/iavf/iavf_rxtx.h b/drivers/net/intel/iavf/iavf_rxtx.h
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx.h
rename to drivers/net/intel/iavf/iavf_rxtx.h
diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx2.c b/drivers/net/intel/iavf/iavf_rxtx_vec_avx2.c
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx_vec_avx2.c
rename to drivers/net/intel/iavf/iavf_rxtx_vec_avx2.c
diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx512.c b/drivers/net/intel/iavf/iavf_rxtx_vec_avx512.c
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx_vec_avx512.c
rename to drivers/net/intel/iavf/iavf_rxtx_vec_avx512.c
diff --git a/drivers/net/iavf/iavf_rxtx_vec_common.h b/drivers/net/intel/iavf/iavf_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx_vec_common.h
rename to drivers/net/intel/iavf/iavf_rxtx_vec_common.h
diff --git a/drivers/net/iavf/iavf_rxtx_vec_neon.c b/drivers/net/intel/iavf/iavf_rxtx_vec_neon.c
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx_vec_neon.c
rename to drivers/net/intel/iavf/iavf_rxtx_vec_neon.c
diff --git a/drivers/net/iavf/iavf_rxtx_vec_sse.c b/drivers/net/intel/iavf/iavf_rxtx_vec_sse.c
similarity index 100%
rename from drivers/net/iavf/iavf_rxtx_vec_sse.c
rename to drivers/net/intel/iavf/iavf_rxtx_vec_sse.c
diff --git a/drivers/net/iavf/iavf_testpmd.c b/drivers/net/intel/iavf/iavf_testpmd.c
similarity index 100%
rename from drivers/net/iavf/iavf_testpmd.c
rename to drivers/net/intel/iavf/iavf_testpmd.c
diff --git a/drivers/net/iavf/iavf_tm.c b/drivers/net/intel/iavf/iavf_tm.c
similarity index 100%
rename from drivers/net/iavf/iavf_tm.c
rename to drivers/net/intel/iavf/iavf_tm.c
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/intel/iavf/iavf_vchnl.c
similarity index 100%
rename from drivers/net/iavf/iavf_vchnl.c
rename to drivers/net/intel/iavf/iavf_vchnl.c
diff --git a/drivers/net/iavf/meson.build b/drivers/net/intel/iavf/meson.build
similarity index 84%
rename from drivers/net/iavf/meson.build
rename to drivers/net/intel/iavf/meson.build
index b48bb83438..d9b605f55a 100644
--- a/drivers/net/iavf/meson.build
+++ b/drivers/net/intel/iavf/meson.build
@@ -5,8 +5,6 @@ if dpdk_conf.get('RTE_IOVA_IN_MBUF') == 0
     subdir_done()
 endif
 
-includes += include_directories('../../common/iavf')
-
 testpmd_sources = files('iavf_testpmd.c')
 
 deps += ['common_iavf', 'security', 'cryptodev']
@@ -23,7 +21,7 @@ sources = files(
         'iavf_fsub.c',
 )
 
-if arch_subdir == 'x86'
+if arch_subdir == 'x86' and is_variable('static_rte_common_iavf')
     sources += files('iavf_rxtx_vec_sse.c')
 
     if is_windows and cc.get_id() != 'clang'
@@ -32,7 +30,7 @@ if arch_subdir == 'x86'
 
     iavf_avx2_lib = static_library('iavf_avx2_lib',
             'iavf_rxtx_vec_avx2.c',
-            dependencies: [static_rte_ethdev, static_rte_kvargs, static_rte_hash],
+            dependencies: [static_rte_ethdev, static_rte_common_iavf],
             include_directories: includes,
             c_args: [cflags, '-mavx2'])
     objs += iavf_avx2_lib.extract_objects('iavf_rxtx_vec_avx2.c')
@@ -45,8 +43,7 @@ if arch_subdir == 'x86'
         endif
         iavf_avx512_lib = static_library('iavf_avx512_lib',
                 'iavf_rxtx_vec_avx512.c',
-                dependencies: [static_rte_ethdev,
-                    static_rte_kvargs, static_rte_hash],
+                dependencies: [static_rte_ethdev, static_rte_common_iavf],
                 include_directories: includes,
                 c_args: avx512_args)
         objs += iavf_avx512_lib.extract_objects('iavf_rxtx_vec_avx512.c')
diff --git a/drivers/net/iavf/rte_pmd_iavf.h b/drivers/net/intel/iavf/rte_pmd_iavf.h
similarity index 100%
rename from drivers/net/iavf/rte_pmd_iavf.h
rename to drivers/net/intel/iavf/rte_pmd_iavf.h
diff --git a/drivers/net/iavf/version.map b/drivers/net/intel/iavf/version.map
similarity index 100%
rename from drivers/net/iavf/version.map
rename to drivers/net/intel/iavf/version.map
diff --git a/drivers/net/ice/base/README b/drivers/net/intel/ice/base/README
similarity index 100%
rename from drivers/net/ice/base/README
rename to drivers/net/intel/ice/base/README
diff --git a/drivers/net/ice/base/ice_acl.c b/drivers/net/intel/ice/base/ice_acl.c
similarity index 100%
rename from drivers/net/ice/base/ice_acl.c
rename to drivers/net/intel/ice/base/ice_acl.c
diff --git a/drivers/net/ice/base/ice_acl.h b/drivers/net/intel/ice/base/ice_acl.h
similarity index 100%
rename from drivers/net/ice/base/ice_acl.h
rename to drivers/net/intel/ice/base/ice_acl.h
diff --git a/drivers/net/ice/base/ice_acl_ctrl.c b/drivers/net/intel/ice/base/ice_acl_ctrl.c
similarity index 100%
rename from drivers/net/ice/base/ice_acl_ctrl.c
rename to drivers/net/intel/ice/base/ice_acl_ctrl.c
diff --git a/drivers/net/ice/base/ice_adminq_cmd.h b/drivers/net/intel/ice/base/ice_adminq_cmd.h
similarity index 100%
rename from drivers/net/ice/base/ice_adminq_cmd.h
rename to drivers/net/intel/ice/base/ice_adminq_cmd.h
diff --git a/drivers/net/ice/base/ice_alloc.h b/drivers/net/intel/ice/base/ice_alloc.h
similarity index 100%
rename from drivers/net/ice/base/ice_alloc.h
rename to drivers/net/intel/ice/base/ice_alloc.h
diff --git a/drivers/net/ice/base/ice_bitops.h b/drivers/net/intel/ice/base/ice_bitops.h
similarity index 100%
rename from drivers/net/ice/base/ice_bitops.h
rename to drivers/net/intel/ice/base/ice_bitops.h
diff --git a/drivers/net/ice/base/ice_bst_tcam.c b/drivers/net/intel/ice/base/ice_bst_tcam.c
similarity index 100%
rename from drivers/net/ice/base/ice_bst_tcam.c
rename to drivers/net/intel/ice/base/ice_bst_tcam.c
diff --git a/drivers/net/ice/base/ice_bst_tcam.h b/drivers/net/intel/ice/base/ice_bst_tcam.h
similarity index 100%
rename from drivers/net/ice/base/ice_bst_tcam.h
rename to drivers/net/intel/ice/base/ice_bst_tcam.h
diff --git a/drivers/net/ice/base/ice_cgu_regs.h b/drivers/net/intel/ice/base/ice_cgu_regs.h
similarity index 100%
rename from drivers/net/ice/base/ice_cgu_regs.h
rename to drivers/net/intel/ice/base/ice_cgu_regs.h
diff --git a/drivers/net/ice/base/ice_common.c b/drivers/net/intel/ice/base/ice_common.c
similarity index 100%
rename from drivers/net/ice/base/ice_common.c
rename to drivers/net/intel/ice/base/ice_common.c
diff --git a/drivers/net/ice/base/ice_common.h b/drivers/net/intel/ice/base/ice_common.h
similarity index 100%
rename from drivers/net/ice/base/ice_common.h
rename to drivers/net/intel/ice/base/ice_common.h
diff --git a/drivers/net/ice/base/ice_controlq.c b/drivers/net/intel/ice/base/ice_controlq.c
similarity index 100%
rename from drivers/net/ice/base/ice_controlq.c
rename to drivers/net/intel/ice/base/ice_controlq.c
diff --git a/drivers/net/ice/base/ice_controlq.h b/drivers/net/intel/ice/base/ice_controlq.h
similarity index 100%
rename from drivers/net/ice/base/ice_controlq.h
rename to drivers/net/intel/ice/base/ice_controlq.h
diff --git a/drivers/net/ice/base/ice_dcb.c b/drivers/net/intel/ice/base/ice_dcb.c
similarity index 100%
rename from drivers/net/ice/base/ice_dcb.c
rename to drivers/net/intel/ice/base/ice_dcb.c
diff --git a/drivers/net/ice/base/ice_dcb.h b/drivers/net/intel/ice/base/ice_dcb.h
similarity index 100%
rename from drivers/net/ice/base/ice_dcb.h
rename to drivers/net/intel/ice/base/ice_dcb.h
diff --git a/drivers/net/ice/base/ice_ddp.c b/drivers/net/intel/ice/base/ice_ddp.c
similarity index 100%
rename from drivers/net/ice/base/ice_ddp.c
rename to drivers/net/intel/ice/base/ice_ddp.c
diff --git a/drivers/net/ice/base/ice_ddp.h b/drivers/net/intel/ice/base/ice_ddp.h
similarity index 100%
rename from drivers/net/ice/base/ice_ddp.h
rename to drivers/net/intel/ice/base/ice_ddp.h
diff --git a/drivers/net/ice/base/ice_defs.h b/drivers/net/intel/ice/base/ice_defs.h
similarity index 100%
rename from drivers/net/ice/base/ice_defs.h
rename to drivers/net/intel/ice/base/ice_defs.h
diff --git a/drivers/net/ice/base/ice_devids.h b/drivers/net/intel/ice/base/ice_devids.h
similarity index 100%
rename from drivers/net/ice/base/ice_devids.h
rename to drivers/net/intel/ice/base/ice_devids.h
diff --git a/drivers/net/ice/base/ice_fdir.c b/drivers/net/intel/ice/base/ice_fdir.c
similarity index 100%
rename from drivers/net/ice/base/ice_fdir.c
rename to drivers/net/intel/ice/base/ice_fdir.c
diff --git a/drivers/net/ice/base/ice_fdir.h b/drivers/net/intel/ice/base/ice_fdir.h
similarity index 100%
rename from drivers/net/ice/base/ice_fdir.h
rename to drivers/net/intel/ice/base/ice_fdir.h
diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/intel/ice/base/ice_flex_pipe.c
similarity index 100%
rename from drivers/net/ice/base/ice_flex_pipe.c
rename to drivers/net/intel/ice/base/ice_flex_pipe.c
diff --git a/drivers/net/ice/base/ice_flex_pipe.h b/drivers/net/intel/ice/base/ice_flex_pipe.h
similarity index 100%
rename from drivers/net/ice/base/ice_flex_pipe.h
rename to drivers/net/intel/ice/base/ice_flex_pipe.h
diff --git a/drivers/net/ice/base/ice_flex_type.h b/drivers/net/intel/ice/base/ice_flex_type.h
similarity index 100%
rename from drivers/net/ice/base/ice_flex_type.h
rename to drivers/net/intel/ice/base/ice_flex_type.h
diff --git a/drivers/net/ice/base/ice_flg_rd.c b/drivers/net/intel/ice/base/ice_flg_rd.c
similarity index 100%
rename from drivers/net/ice/base/ice_flg_rd.c
rename to drivers/net/intel/ice/base/ice_flg_rd.c
diff --git a/drivers/net/ice/base/ice_flg_rd.h b/drivers/net/intel/ice/base/ice_flg_rd.h
similarity index 100%
rename from drivers/net/ice/base/ice_flg_rd.h
rename to drivers/net/intel/ice/base/ice_flg_rd.h
diff --git a/drivers/net/ice/base/ice_flow.c b/drivers/net/intel/ice/base/ice_flow.c
similarity index 100%
rename from drivers/net/ice/base/ice_flow.c
rename to drivers/net/intel/ice/base/ice_flow.c
diff --git a/drivers/net/ice/base/ice_flow.h b/drivers/net/intel/ice/base/ice_flow.h
similarity index 100%
rename from drivers/net/ice/base/ice_flow.h
rename to drivers/net/intel/ice/base/ice_flow.h
diff --git a/drivers/net/ice/base/ice_fwlog.c b/drivers/net/intel/ice/base/ice_fwlog.c
similarity index 100%
rename from drivers/net/ice/base/ice_fwlog.c
rename to drivers/net/intel/ice/base/ice_fwlog.c
diff --git a/drivers/net/ice/base/ice_fwlog.h b/drivers/net/intel/ice/base/ice_fwlog.h
similarity index 100%
rename from drivers/net/ice/base/ice_fwlog.h
rename to drivers/net/intel/ice/base/ice_fwlog.h
diff --git a/drivers/net/ice/base/ice_hw_autogen.h b/drivers/net/intel/ice/base/ice_hw_autogen.h
similarity index 100%
rename from drivers/net/ice/base/ice_hw_autogen.h
rename to drivers/net/intel/ice/base/ice_hw_autogen.h
diff --git a/drivers/net/ice/base/ice_imem.c b/drivers/net/intel/ice/base/ice_imem.c
similarity index 100%
rename from drivers/net/ice/base/ice_imem.c
rename to drivers/net/intel/ice/base/ice_imem.c
diff --git a/drivers/net/ice/base/ice_imem.h b/drivers/net/intel/ice/base/ice_imem.h
similarity index 100%
rename from drivers/net/ice/base/ice_imem.h
rename to drivers/net/intel/ice/base/ice_imem.h
diff --git a/drivers/net/ice/base/ice_lan_tx_rx.h b/drivers/net/intel/ice/base/ice_lan_tx_rx.h
similarity index 100%
rename from drivers/net/ice/base/ice_lan_tx_rx.h
rename to drivers/net/intel/ice/base/ice_lan_tx_rx.h
diff --git a/drivers/net/ice/base/ice_metainit.c b/drivers/net/intel/ice/base/ice_metainit.c
similarity index 100%
rename from drivers/net/ice/base/ice_metainit.c
rename to drivers/net/intel/ice/base/ice_metainit.c
diff --git a/drivers/net/ice/base/ice_metainit.h b/drivers/net/intel/ice/base/ice_metainit.h
similarity index 100%
rename from drivers/net/ice/base/ice_metainit.h
rename to drivers/net/intel/ice/base/ice_metainit.h
diff --git a/drivers/net/ice/base/ice_mk_grp.c b/drivers/net/intel/ice/base/ice_mk_grp.c
similarity index 100%
rename from drivers/net/ice/base/ice_mk_grp.c
rename to drivers/net/intel/ice/base/ice_mk_grp.c
diff --git a/drivers/net/ice/base/ice_mk_grp.h b/drivers/net/intel/ice/base/ice_mk_grp.h
similarity index 100%
rename from drivers/net/ice/base/ice_mk_grp.h
rename to drivers/net/intel/ice/base/ice_mk_grp.h
diff --git a/drivers/net/ice/base/ice_nvm.c b/drivers/net/intel/ice/base/ice_nvm.c
similarity index 100%
rename from drivers/net/ice/base/ice_nvm.c
rename to drivers/net/intel/ice/base/ice_nvm.c
diff --git a/drivers/net/ice/base/ice_nvm.h b/drivers/net/intel/ice/base/ice_nvm.h
similarity index 100%
rename from drivers/net/ice/base/ice_nvm.h
rename to drivers/net/intel/ice/base/ice_nvm.h
diff --git a/drivers/net/ice/base/ice_osdep.h b/drivers/net/intel/ice/base/ice_osdep.h
similarity index 100%
rename from drivers/net/ice/base/ice_osdep.h
rename to drivers/net/intel/ice/base/ice_osdep.h
diff --git a/drivers/net/ice/base/ice_parser.c b/drivers/net/intel/ice/base/ice_parser.c
similarity index 100%
rename from drivers/net/ice/base/ice_parser.c
rename to drivers/net/intel/ice/base/ice_parser.c
diff --git a/drivers/net/ice/base/ice_parser.h b/drivers/net/intel/ice/base/ice_parser.h
similarity index 100%
rename from drivers/net/ice/base/ice_parser.h
rename to drivers/net/intel/ice/base/ice_parser.h
diff --git a/drivers/net/ice/base/ice_parser_rt.c b/drivers/net/intel/ice/base/ice_parser_rt.c
similarity index 100%
rename from drivers/net/ice/base/ice_parser_rt.c
rename to drivers/net/intel/ice/base/ice_parser_rt.c
diff --git a/drivers/net/ice/base/ice_parser_rt.h b/drivers/net/intel/ice/base/ice_parser_rt.h
similarity index 100%
rename from drivers/net/ice/base/ice_parser_rt.h
rename to drivers/net/intel/ice/base/ice_parser_rt.h
diff --git a/drivers/net/ice/base/ice_parser_util.h b/drivers/net/intel/ice/base/ice_parser_util.h
similarity index 100%
rename from drivers/net/ice/base/ice_parser_util.h
rename to drivers/net/intel/ice/base/ice_parser_util.h
diff --git a/drivers/net/ice/base/ice_pg_cam.c b/drivers/net/intel/ice/base/ice_pg_cam.c
similarity index 100%
rename from drivers/net/ice/base/ice_pg_cam.c
rename to drivers/net/intel/ice/base/ice_pg_cam.c
diff --git a/drivers/net/ice/base/ice_pg_cam.h b/drivers/net/intel/ice/base/ice_pg_cam.h
similarity index 100%
rename from drivers/net/ice/base/ice_pg_cam.h
rename to drivers/net/intel/ice/base/ice_pg_cam.h
diff --git a/drivers/net/ice/base/ice_phy_regs.h b/drivers/net/intel/ice/base/ice_phy_regs.h
similarity index 100%
rename from drivers/net/ice/base/ice_phy_regs.h
rename to drivers/net/intel/ice/base/ice_phy_regs.h
diff --git a/drivers/net/ice/base/ice_proto_grp.c b/drivers/net/intel/ice/base/ice_proto_grp.c
similarity index 100%
rename from drivers/net/ice/base/ice_proto_grp.c
rename to drivers/net/intel/ice/base/ice_proto_grp.c
diff --git a/drivers/net/ice/base/ice_proto_grp.h b/drivers/net/intel/ice/base/ice_proto_grp.h
similarity index 100%
rename from drivers/net/ice/base/ice_proto_grp.h
rename to drivers/net/intel/ice/base/ice_proto_grp.h
diff --git a/drivers/net/ice/base/ice_protocol_type.h b/drivers/net/intel/ice/base/ice_protocol_type.h
similarity index 100%
rename from drivers/net/ice/base/ice_protocol_type.h
rename to drivers/net/intel/ice/base/ice_protocol_type.h
diff --git a/drivers/net/ice/base/ice_ptp_consts.h b/drivers/net/intel/ice/base/ice_ptp_consts.h
similarity index 100%
rename from drivers/net/ice/base/ice_ptp_consts.h
rename to drivers/net/intel/ice/base/ice_ptp_consts.h
diff --git a/drivers/net/ice/base/ice_ptp_hw.c b/drivers/net/intel/ice/base/ice_ptp_hw.c
similarity index 100%
rename from drivers/net/ice/base/ice_ptp_hw.c
rename to drivers/net/intel/ice/base/ice_ptp_hw.c
diff --git a/drivers/net/ice/base/ice_ptp_hw.h b/drivers/net/intel/ice/base/ice_ptp_hw.h
similarity index 100%
rename from drivers/net/ice/base/ice_ptp_hw.h
rename to drivers/net/intel/ice/base/ice_ptp_hw.h
diff --git a/drivers/net/ice/base/ice_ptype_mk.c b/drivers/net/intel/ice/base/ice_ptype_mk.c
similarity index 100%
rename from drivers/net/ice/base/ice_ptype_mk.c
rename to drivers/net/intel/ice/base/ice_ptype_mk.c
diff --git a/drivers/net/ice/base/ice_ptype_mk.h b/drivers/net/intel/ice/base/ice_ptype_mk.h
similarity index 100%
rename from drivers/net/ice/base/ice_ptype_mk.h
rename to drivers/net/intel/ice/base/ice_ptype_mk.h
diff --git a/drivers/net/ice/base/ice_sbq_cmd.h b/drivers/net/intel/ice/base/ice_sbq_cmd.h
similarity index 100%
rename from drivers/net/ice/base/ice_sbq_cmd.h
rename to drivers/net/intel/ice/base/ice_sbq_cmd.h
diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/intel/ice/base/ice_sched.c
similarity index 100%
rename from drivers/net/ice/base/ice_sched.c
rename to drivers/net/intel/ice/base/ice_sched.c
diff --git a/drivers/net/ice/base/ice_sched.h b/drivers/net/intel/ice/base/ice_sched.h
similarity index 100%
rename from drivers/net/ice/base/ice_sched.h
rename to drivers/net/intel/ice/base/ice_sched.h
diff --git a/drivers/net/ice/base/ice_status.h b/drivers/net/intel/ice/base/ice_status.h
similarity index 100%
rename from drivers/net/ice/base/ice_status.h
rename to drivers/net/intel/ice/base/ice_status.h
diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/intel/ice/base/ice_switch.c
similarity index 100%
rename from drivers/net/ice/base/ice_switch.c
rename to drivers/net/intel/ice/base/ice_switch.c
diff --git a/drivers/net/ice/base/ice_switch.h b/drivers/net/intel/ice/base/ice_switch.h
similarity index 100%
rename from drivers/net/ice/base/ice_switch.h
rename to drivers/net/intel/ice/base/ice_switch.h
diff --git a/drivers/net/ice/base/ice_tmatch.h b/drivers/net/intel/ice/base/ice_tmatch.h
similarity index 100%
rename from drivers/net/ice/base/ice_tmatch.h
rename to drivers/net/intel/ice/base/ice_tmatch.h
diff --git a/drivers/net/ice/base/ice_type.h b/drivers/net/intel/ice/base/ice_type.h
similarity index 100%
rename from drivers/net/ice/base/ice_type.h
rename to drivers/net/intel/ice/base/ice_type.h
diff --git a/drivers/net/ice/base/ice_vf_mbx.c b/drivers/net/intel/ice/base/ice_vf_mbx.c
similarity index 100%
rename from drivers/net/ice/base/ice_vf_mbx.c
rename to drivers/net/intel/ice/base/ice_vf_mbx.c
diff --git a/drivers/net/ice/base/ice_vf_mbx.h b/drivers/net/intel/ice/base/ice_vf_mbx.h
similarity index 100%
rename from drivers/net/ice/base/ice_vf_mbx.h
rename to drivers/net/intel/ice/base/ice_vf_mbx.h
diff --git a/drivers/net/ice/base/ice_vlan_mode.c b/drivers/net/intel/ice/base/ice_vlan_mode.c
similarity index 100%
rename from drivers/net/ice/base/ice_vlan_mode.c
rename to drivers/net/intel/ice/base/ice_vlan_mode.c
diff --git a/drivers/net/ice/base/ice_vlan_mode.h b/drivers/net/intel/ice/base/ice_vlan_mode.h
similarity index 100%
rename from drivers/net/ice/base/ice_vlan_mode.h
rename to drivers/net/intel/ice/base/ice_vlan_mode.h
diff --git a/drivers/net/ice/base/ice_xlt_kb.c b/drivers/net/intel/ice/base/ice_xlt_kb.c
similarity index 100%
rename from drivers/net/ice/base/ice_xlt_kb.c
rename to drivers/net/intel/ice/base/ice_xlt_kb.c
diff --git a/drivers/net/ice/base/ice_xlt_kb.h b/drivers/net/intel/ice/base/ice_xlt_kb.h
similarity index 100%
rename from drivers/net/ice/base/ice_xlt_kb.h
rename to drivers/net/intel/ice/base/ice_xlt_kb.h
diff --git a/drivers/net/ice/base/meson.build b/drivers/net/intel/ice/base/meson.build
similarity index 100%
rename from drivers/net/ice/base/meson.build
rename to drivers/net/intel/ice/base/meson.build
diff --git a/drivers/net/ice/ice_acl_filter.c b/drivers/net/intel/ice/ice_acl_filter.c
similarity index 100%
rename from drivers/net/ice/ice_acl_filter.c
rename to drivers/net/intel/ice/ice_acl_filter.c
diff --git a/drivers/net/ice/ice_dcf.c b/drivers/net/intel/ice/ice_dcf.c
similarity index 100%
rename from drivers/net/ice/ice_dcf.c
rename to drivers/net/intel/ice/ice_dcf.c
diff --git a/drivers/net/ice/ice_dcf.h b/drivers/net/intel/ice/ice_dcf.h
similarity index 100%
rename from drivers/net/ice/ice_dcf.h
rename to drivers/net/intel/ice/ice_dcf.h
diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/intel/ice/ice_dcf_ethdev.c
similarity index 100%
rename from drivers/net/ice/ice_dcf_ethdev.c
rename to drivers/net/intel/ice/ice_dcf_ethdev.c
diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/intel/ice/ice_dcf_ethdev.h
similarity index 100%
rename from drivers/net/ice/ice_dcf_ethdev.h
rename to drivers/net/intel/ice/ice_dcf_ethdev.h
diff --git a/drivers/net/ice/ice_dcf_parent.c b/drivers/net/intel/ice/ice_dcf_parent.c
similarity index 100%
rename from drivers/net/ice/ice_dcf_parent.c
rename to drivers/net/intel/ice/ice_dcf_parent.c
diff --git a/drivers/net/ice/ice_dcf_sched.c b/drivers/net/intel/ice/ice_dcf_sched.c
similarity index 100%
rename from drivers/net/ice/ice_dcf_sched.c
rename to drivers/net/intel/ice/ice_dcf_sched.c
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/intel/ice/ice_dcf_vf_representor.c
similarity index 100%
rename from drivers/net/ice/ice_dcf_vf_representor.c
rename to drivers/net/intel/ice/ice_dcf_vf_representor.c
diff --git a/drivers/net/ice/ice_diagnose.c b/drivers/net/intel/ice/ice_diagnose.c
similarity index 100%
rename from drivers/net/ice/ice_diagnose.c
rename to drivers/net/intel/ice/ice_diagnose.c
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/intel/ice/ice_ethdev.c
similarity index 100%
rename from drivers/net/ice/ice_ethdev.c
rename to drivers/net/intel/ice/ice_ethdev.c
diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/intel/ice/ice_ethdev.h
similarity index 100%
rename from drivers/net/ice/ice_ethdev.h
rename to drivers/net/intel/ice/ice_ethdev.h
diff --git a/drivers/net/ice/ice_fdir_filter.c b/drivers/net/intel/ice/ice_fdir_filter.c
similarity index 100%
rename from drivers/net/ice/ice_fdir_filter.c
rename to drivers/net/intel/ice/ice_fdir_filter.c
diff --git a/drivers/net/ice/ice_generic_flow.c b/drivers/net/intel/ice/ice_generic_flow.c
similarity index 100%
rename from drivers/net/ice/ice_generic_flow.c
rename to drivers/net/intel/ice/ice_generic_flow.c
diff --git a/drivers/net/ice/ice_generic_flow.h b/drivers/net/intel/ice/ice_generic_flow.h
similarity index 100%
rename from drivers/net/ice/ice_generic_flow.h
rename to drivers/net/intel/ice/ice_generic_flow.h
diff --git a/drivers/net/ice/ice_hash.c b/drivers/net/intel/ice/ice_hash.c
similarity index 100%
rename from drivers/net/ice/ice_hash.c
rename to drivers/net/intel/ice/ice_hash.c
diff --git a/drivers/net/ice/ice_logs.h b/drivers/net/intel/ice/ice_logs.h
similarity index 100%
rename from drivers/net/ice/ice_logs.h
rename to drivers/net/intel/ice/ice_logs.h
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
similarity index 100%
rename from drivers/net/ice/ice_rxtx.c
rename to drivers/net/intel/ice/ice_rxtx.c
diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/intel/ice/ice_rxtx.h
similarity index 100%
rename from drivers/net/ice/ice_rxtx.h
rename to drivers/net/intel/ice/ice_rxtx.h
diff --git a/drivers/net/ice/ice_rxtx_common_avx.h b/drivers/net/intel/ice/ice_rxtx_common_avx.h
similarity index 100%
rename from drivers/net/ice/ice_rxtx_common_avx.h
rename to drivers/net/intel/ice/ice_rxtx_common_avx.h
diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c b/drivers/net/intel/ice/ice_rxtx_vec_avx2.c
similarity index 100%
rename from drivers/net/ice/ice_rxtx_vec_avx2.c
rename to drivers/net/intel/ice/ice_rxtx_vec_avx2.c
diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c b/drivers/net/intel/ice/ice_rxtx_vec_avx512.c
similarity index 100%
rename from drivers/net/ice/ice_rxtx_vec_avx512.c
rename to drivers/net/intel/ice/ice_rxtx_vec_avx512.c
diff --git a/drivers/net/ice/ice_rxtx_vec_common.h b/drivers/net/intel/ice/ice_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/ice/ice_rxtx_vec_common.h
rename to drivers/net/intel/ice/ice_rxtx_vec_common.h
diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c b/drivers/net/intel/ice/ice_rxtx_vec_sse.c
similarity index 100%
rename from drivers/net/ice/ice_rxtx_vec_sse.c
rename to drivers/net/intel/ice/ice_rxtx_vec_sse.c
diff --git a/drivers/net/ice/ice_switch_filter.c b/drivers/net/intel/ice/ice_switch_filter.c
similarity index 100%
rename from drivers/net/ice/ice_switch_filter.c
rename to drivers/net/intel/ice/ice_switch_filter.c
diff --git a/drivers/net/ice/ice_testpmd.c b/drivers/net/intel/ice/ice_testpmd.c
similarity index 100%
rename from drivers/net/ice/ice_testpmd.c
rename to drivers/net/intel/ice/ice_testpmd.c
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/intel/ice/ice_tm.c
similarity index 100%
rename from drivers/net/ice/ice_tm.c
rename to drivers/net/intel/ice/ice_tm.c
diff --git a/drivers/net/ice/meson.build b/drivers/net/intel/ice/meson.build
similarity index 96%
rename from drivers/net/ice/meson.build
rename to drivers/net/intel/ice/meson.build
index 1c9dc0cc6d..beaf21e176 100644
--- a/drivers/net/ice/meson.build
+++ b/drivers/net/intel/ice/meson.build
@@ -19,7 +19,7 @@ sources = files(
 testpmd_sources = files('ice_testpmd.c')
 
 deps += ['hash', 'net', 'common_iavf']
-includes += include_directories('base', '../../common/iavf')
+includes += include_directories('base')
 
 if arch_subdir == 'x86'
     sources += files('ice_rxtx_vec_sse.c')
diff --git a/drivers/net/ice/version.map b/drivers/net/intel/ice/version.map
similarity index 100%
rename from drivers/net/ice/version.map
rename to drivers/net/intel/ice/version.map
diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/intel/idpf/idpf_ethdev.c
similarity index 100%
rename from drivers/net/idpf/idpf_ethdev.c
rename to drivers/net/intel/idpf/idpf_ethdev.c
diff --git a/drivers/net/idpf/idpf_ethdev.h b/drivers/net/intel/idpf/idpf_ethdev.h
similarity index 100%
rename from drivers/net/idpf/idpf_ethdev.h
rename to drivers/net/intel/idpf/idpf_ethdev.h
diff --git a/drivers/net/idpf/idpf_logs.h b/drivers/net/intel/idpf/idpf_logs.h
similarity index 100%
rename from drivers/net/idpf/idpf_logs.h
rename to drivers/net/intel/idpf/idpf_logs.h
diff --git a/drivers/net/idpf/idpf_rxtx.c b/drivers/net/intel/idpf/idpf_rxtx.c
similarity index 100%
rename from drivers/net/idpf/idpf_rxtx.c
rename to drivers/net/intel/idpf/idpf_rxtx.c
diff --git a/drivers/net/idpf/idpf_rxtx.h b/drivers/net/intel/idpf/idpf_rxtx.h
similarity index 100%
rename from drivers/net/idpf/idpf_rxtx.h
rename to drivers/net/intel/idpf/idpf_rxtx.h
diff --git a/drivers/net/idpf/idpf_rxtx_vec_common.h b/drivers/net/intel/idpf/idpf_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/idpf/idpf_rxtx_vec_common.h
rename to drivers/net/intel/idpf/idpf_rxtx_vec_common.h
diff --git a/drivers/net/idpf/meson.build b/drivers/net/intel/idpf/meson.build
similarity index 100%
rename from drivers/net/idpf/meson.build
rename to drivers/net/intel/idpf/meson.build
diff --git a/drivers/net/igc/base/README b/drivers/net/intel/igc/base/README
similarity index 100%
rename from drivers/net/igc/base/README
rename to drivers/net/intel/igc/base/README
diff --git a/drivers/net/igc/base/igc_82571.h b/drivers/net/intel/igc/base/igc_82571.h
similarity index 100%
rename from drivers/net/igc/base/igc_82571.h
rename to drivers/net/intel/igc/base/igc_82571.h
diff --git a/drivers/net/igc/base/igc_82575.h b/drivers/net/intel/igc/base/igc_82575.h
similarity index 100%
rename from drivers/net/igc/base/igc_82575.h
rename to drivers/net/intel/igc/base/igc_82575.h
diff --git a/drivers/net/igc/base/igc_api.c b/drivers/net/intel/igc/base/igc_api.c
similarity index 100%
rename from drivers/net/igc/base/igc_api.c
rename to drivers/net/intel/igc/base/igc_api.c
diff --git a/drivers/net/igc/base/igc_api.h b/drivers/net/intel/igc/base/igc_api.h
similarity index 100%
rename from drivers/net/igc/base/igc_api.h
rename to drivers/net/intel/igc/base/igc_api.h
diff --git a/drivers/net/igc/base/igc_base.c b/drivers/net/intel/igc/base/igc_base.c
similarity index 100%
rename from drivers/net/igc/base/igc_base.c
rename to drivers/net/intel/igc/base/igc_base.c
diff --git a/drivers/net/igc/base/igc_base.h b/drivers/net/intel/igc/base/igc_base.h
similarity index 100%
rename from drivers/net/igc/base/igc_base.h
rename to drivers/net/intel/igc/base/igc_base.h
diff --git a/drivers/net/igc/base/igc_defines.h b/drivers/net/intel/igc/base/igc_defines.h
similarity index 100%
rename from drivers/net/igc/base/igc_defines.h
rename to drivers/net/intel/igc/base/igc_defines.h
diff --git a/drivers/net/igc/base/igc_hw.h b/drivers/net/intel/igc/base/igc_hw.h
similarity index 100%
rename from drivers/net/igc/base/igc_hw.h
rename to drivers/net/intel/igc/base/igc_hw.h
diff --git a/drivers/net/igc/base/igc_i225.c b/drivers/net/intel/igc/base/igc_i225.c
similarity index 100%
rename from drivers/net/igc/base/igc_i225.c
rename to drivers/net/intel/igc/base/igc_i225.c
diff --git a/drivers/net/igc/base/igc_i225.h b/drivers/net/intel/igc/base/igc_i225.h
similarity index 100%
rename from drivers/net/igc/base/igc_i225.h
rename to drivers/net/intel/igc/base/igc_i225.h
diff --git a/drivers/net/igc/base/igc_ich8lan.h b/drivers/net/intel/igc/base/igc_ich8lan.h
similarity index 100%
rename from drivers/net/igc/base/igc_ich8lan.h
rename to drivers/net/intel/igc/base/igc_ich8lan.h
diff --git a/drivers/net/igc/base/igc_mac.c b/drivers/net/intel/igc/base/igc_mac.c
similarity index 100%
rename from drivers/net/igc/base/igc_mac.c
rename to drivers/net/intel/igc/base/igc_mac.c
diff --git a/drivers/net/igc/base/igc_mac.h b/drivers/net/intel/igc/base/igc_mac.h
similarity index 100%
rename from drivers/net/igc/base/igc_mac.h
rename to drivers/net/intel/igc/base/igc_mac.h
diff --git a/drivers/net/igc/base/igc_manage.c b/drivers/net/intel/igc/base/igc_manage.c
similarity index 100%
rename from drivers/net/igc/base/igc_manage.c
rename to drivers/net/intel/igc/base/igc_manage.c
diff --git a/drivers/net/igc/base/igc_manage.h b/drivers/net/intel/igc/base/igc_manage.h
similarity index 100%
rename from drivers/net/igc/base/igc_manage.h
rename to drivers/net/intel/igc/base/igc_manage.h
diff --git a/drivers/net/igc/base/igc_nvm.c b/drivers/net/intel/igc/base/igc_nvm.c
similarity index 100%
rename from drivers/net/igc/base/igc_nvm.c
rename to drivers/net/intel/igc/base/igc_nvm.c
diff --git a/drivers/net/igc/base/igc_nvm.h b/drivers/net/intel/igc/base/igc_nvm.h
similarity index 100%
rename from drivers/net/igc/base/igc_nvm.h
rename to drivers/net/intel/igc/base/igc_nvm.h
diff --git a/drivers/net/igc/base/igc_osdep.c b/drivers/net/intel/igc/base/igc_osdep.c
similarity index 100%
rename from drivers/net/igc/base/igc_osdep.c
rename to drivers/net/intel/igc/base/igc_osdep.c
diff --git a/drivers/net/igc/base/igc_osdep.h b/drivers/net/intel/igc/base/igc_osdep.h
similarity index 100%
rename from drivers/net/igc/base/igc_osdep.h
rename to drivers/net/intel/igc/base/igc_osdep.h
diff --git a/drivers/net/igc/base/igc_phy.c b/drivers/net/intel/igc/base/igc_phy.c
similarity index 100%
rename from drivers/net/igc/base/igc_phy.c
rename to drivers/net/intel/igc/base/igc_phy.c
diff --git a/drivers/net/igc/base/igc_phy.h b/drivers/net/intel/igc/base/igc_phy.h
similarity index 100%
rename from drivers/net/igc/base/igc_phy.h
rename to drivers/net/intel/igc/base/igc_phy.h
diff --git a/drivers/net/igc/base/igc_regs.h b/drivers/net/intel/igc/base/igc_regs.h
similarity index 100%
rename from drivers/net/igc/base/igc_regs.h
rename to drivers/net/intel/igc/base/igc_regs.h
diff --git a/drivers/net/igc/base/meson.build b/drivers/net/intel/igc/base/meson.build
similarity index 100%
rename from drivers/net/igc/base/meson.build
rename to drivers/net/intel/igc/base/meson.build
diff --git a/drivers/net/igc/igc_ethdev.c b/drivers/net/intel/igc/igc_ethdev.c
similarity index 100%
rename from drivers/net/igc/igc_ethdev.c
rename to drivers/net/intel/igc/igc_ethdev.c
diff --git a/drivers/net/igc/igc_ethdev.h b/drivers/net/intel/igc/igc_ethdev.h
similarity index 100%
rename from drivers/net/igc/igc_ethdev.h
rename to drivers/net/intel/igc/igc_ethdev.h
diff --git a/drivers/net/igc/igc_filter.c b/drivers/net/intel/igc/igc_filter.c
similarity index 100%
rename from drivers/net/igc/igc_filter.c
rename to drivers/net/intel/igc/igc_filter.c
diff --git a/drivers/net/igc/igc_filter.h b/drivers/net/intel/igc/igc_filter.h
similarity index 100%
rename from drivers/net/igc/igc_filter.h
rename to drivers/net/intel/igc/igc_filter.h
diff --git a/drivers/net/igc/igc_flow.c b/drivers/net/intel/igc/igc_flow.c
similarity index 100%
rename from drivers/net/igc/igc_flow.c
rename to drivers/net/intel/igc/igc_flow.c
diff --git a/drivers/net/igc/igc_flow.h b/drivers/net/intel/igc/igc_flow.h
similarity index 100%
rename from drivers/net/igc/igc_flow.h
rename to drivers/net/intel/igc/igc_flow.h
diff --git a/drivers/net/igc/igc_logs.c b/drivers/net/intel/igc/igc_logs.c
similarity index 100%
rename from drivers/net/igc/igc_logs.c
rename to drivers/net/intel/igc/igc_logs.c
diff --git a/drivers/net/igc/igc_logs.h b/drivers/net/intel/igc/igc_logs.h
similarity index 100%
rename from drivers/net/igc/igc_logs.h
rename to drivers/net/intel/igc/igc_logs.h
diff --git a/drivers/net/igc/igc_txrx.c b/drivers/net/intel/igc/igc_txrx.c
similarity index 100%
rename from drivers/net/igc/igc_txrx.c
rename to drivers/net/intel/igc/igc_txrx.c
diff --git a/drivers/net/igc/igc_txrx.h b/drivers/net/intel/igc/igc_txrx.h
similarity index 100%
rename from drivers/net/igc/igc_txrx.h
rename to drivers/net/intel/igc/igc_txrx.h
diff --git a/drivers/net/igc/meson.build b/drivers/net/intel/igc/meson.build
similarity index 100%
rename from drivers/net/igc/meson.build
rename to drivers/net/intel/igc/meson.build
diff --git a/drivers/net/ipn3ke/ipn3ke_ethdev.c b/drivers/net/intel/ipn3ke/ipn3ke_ethdev.c
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_ethdev.c
rename to drivers/net/intel/ipn3ke/ipn3ke_ethdev.c
diff --git a/drivers/net/ipn3ke/ipn3ke_ethdev.h b/drivers/net/intel/ipn3ke/ipn3ke_ethdev.h
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_ethdev.h
rename to drivers/net/intel/ipn3ke/ipn3ke_ethdev.h
diff --git a/drivers/net/ipn3ke/ipn3ke_flow.c b/drivers/net/intel/ipn3ke/ipn3ke_flow.c
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_flow.c
rename to drivers/net/intel/ipn3ke/ipn3ke_flow.c
diff --git a/drivers/net/ipn3ke/ipn3ke_flow.h b/drivers/net/intel/ipn3ke/ipn3ke_flow.h
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_flow.h
rename to drivers/net/intel/ipn3ke/ipn3ke_flow.h
diff --git a/drivers/net/ipn3ke/ipn3ke_logs.h b/drivers/net/intel/ipn3ke/ipn3ke_logs.h
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_logs.h
rename to drivers/net/intel/ipn3ke/ipn3ke_logs.h
diff --git a/drivers/net/ipn3ke/ipn3ke_rawdev_api.h b/drivers/net/intel/ipn3ke/ipn3ke_rawdev_api.h
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_rawdev_api.h
rename to drivers/net/intel/ipn3ke/ipn3ke_rawdev_api.h
diff --git a/drivers/net/ipn3ke/ipn3ke_representor.c b/drivers/net/intel/ipn3ke/ipn3ke_representor.c
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_representor.c
rename to drivers/net/intel/ipn3ke/ipn3ke_representor.c
diff --git a/drivers/net/ipn3ke/ipn3ke_tm.c b/drivers/net/intel/ipn3ke/ipn3ke_tm.c
similarity index 100%
rename from drivers/net/ipn3ke/ipn3ke_tm.c
rename to drivers/net/intel/ipn3ke/ipn3ke_tm.c
diff --git a/drivers/net/ipn3ke/meson.build b/drivers/net/intel/ipn3ke/meson.build
similarity index 91%
rename from drivers/net/ipn3ke/meson.build
rename to drivers/net/intel/ipn3ke/meson.build
index 464bdbd8b6..23c4d36b4f 100644
--- a/drivers/net/ipn3ke/meson.build
+++ b/drivers/net/intel/ipn3ke/meson.build
@@ -21,7 +21,7 @@ if not has_libfdt
     subdir_done()
 endif
 
-includes += include_directories('../../raw/ifpga')
+includes += include_directories('../../../raw/ifpga')
 
 sources += files(
         'ipn3ke_ethdev.c',
diff --git a/drivers/net/ipn3ke/version.map b/drivers/net/intel/ipn3ke/version.map
similarity index 100%
rename from drivers/net/ipn3ke/version.map
rename to drivers/net/intel/ipn3ke/version.map
diff --git a/drivers/net/ixgbe/base/README b/drivers/net/intel/ixgbe/base/README
similarity index 100%
rename from drivers/net/ixgbe/base/README
rename to drivers/net/intel/ixgbe/base/README
diff --git a/drivers/net/ixgbe/base/ixgbe_82598.c b/drivers/net/intel/ixgbe/base/ixgbe_82598.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_82598.c
rename to drivers/net/intel/ixgbe/base/ixgbe_82598.c
diff --git a/drivers/net/ixgbe/base/ixgbe_82598.h b/drivers/net/intel/ixgbe/base/ixgbe_82598.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_82598.h
rename to drivers/net/intel/ixgbe/base/ixgbe_82598.h
diff --git a/drivers/net/ixgbe/base/ixgbe_82599.c b/drivers/net/intel/ixgbe/base/ixgbe_82599.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_82599.c
rename to drivers/net/intel/ixgbe/base/ixgbe_82599.c
diff --git a/drivers/net/ixgbe/base/ixgbe_82599.h b/drivers/net/intel/ixgbe/base/ixgbe_82599.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_82599.h
rename to drivers/net/intel/ixgbe/base/ixgbe_82599.h
diff --git a/drivers/net/ixgbe/base/ixgbe_api.c b/drivers/net/intel/ixgbe/base/ixgbe_api.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_api.c
rename to drivers/net/intel/ixgbe/base/ixgbe_api.c
diff --git a/drivers/net/ixgbe/base/ixgbe_api.h b/drivers/net/intel/ixgbe/base/ixgbe_api.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_api.h
rename to drivers/net/intel/ixgbe/base/ixgbe_api.h
diff --git a/drivers/net/ixgbe/base/ixgbe_common.c b/drivers/net/intel/ixgbe/base/ixgbe_common.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_common.c
rename to drivers/net/intel/ixgbe/base/ixgbe_common.c
diff --git a/drivers/net/ixgbe/base/ixgbe_common.h b/drivers/net/intel/ixgbe/base/ixgbe_common.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_common.h
rename to drivers/net/intel/ixgbe/base/ixgbe_common.h
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb.c b/drivers/net/intel/ixgbe/base/ixgbe_dcb.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb.c
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb.c
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb.h b/drivers/net/intel/ixgbe/base/ixgbe_dcb.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb.h
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb.h
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb_82598.c b/drivers/net/intel/ixgbe/base/ixgbe_dcb_82598.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb_82598.c
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb_82598.c
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb_82598.h b/drivers/net/intel/ixgbe/base/ixgbe_dcb_82598.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb_82598.h
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb_82598.h
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb_82599.c b/drivers/net/intel/ixgbe/base/ixgbe_dcb_82599.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb_82599.c
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb_82599.c
diff --git a/drivers/net/ixgbe/base/ixgbe_dcb_82599.h b/drivers/net/intel/ixgbe/base/ixgbe_dcb_82599.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_dcb_82599.h
rename to drivers/net/intel/ixgbe/base/ixgbe_dcb_82599.h
diff --git a/drivers/net/ixgbe/base/ixgbe_e610.c b/drivers/net/intel/ixgbe/base/ixgbe_e610.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_e610.c
rename to drivers/net/intel/ixgbe/base/ixgbe_e610.c
diff --git a/drivers/net/ixgbe/base/ixgbe_e610.h b/drivers/net/intel/ixgbe/base/ixgbe_e610.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_e610.h
rename to drivers/net/intel/ixgbe/base/ixgbe_e610.h
diff --git a/drivers/net/ixgbe/base/ixgbe_hv_vf.c b/drivers/net/intel/ixgbe/base/ixgbe_hv_vf.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_hv_vf.c
rename to drivers/net/intel/ixgbe/base/ixgbe_hv_vf.c
diff --git a/drivers/net/ixgbe/base/ixgbe_hv_vf.h b/drivers/net/intel/ixgbe/base/ixgbe_hv_vf.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_hv_vf.h
rename to drivers/net/intel/ixgbe/base/ixgbe_hv_vf.h
diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.c b/drivers/net/intel/ixgbe/base/ixgbe_mbx.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_mbx.c
rename to drivers/net/intel/ixgbe/base/ixgbe_mbx.c
diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.h b/drivers/net/intel/ixgbe/base/ixgbe_mbx.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_mbx.h
rename to drivers/net/intel/ixgbe/base/ixgbe_mbx.h
diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.c b/drivers/net/intel/ixgbe/base/ixgbe_osdep.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_osdep.c
rename to drivers/net/intel/ixgbe/base/ixgbe_osdep.c
diff --git a/drivers/net/ixgbe/base/ixgbe_osdep.h b/drivers/net/intel/ixgbe/base/ixgbe_osdep.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_osdep.h
rename to drivers/net/intel/ixgbe/base/ixgbe_osdep.h
diff --git a/drivers/net/ixgbe/base/ixgbe_phy.c b/drivers/net/intel/ixgbe/base/ixgbe_phy.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_phy.c
rename to drivers/net/intel/ixgbe/base/ixgbe_phy.c
diff --git a/drivers/net/ixgbe/base/ixgbe_phy.h b/drivers/net/intel/ixgbe/base/ixgbe_phy.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_phy.h
rename to drivers/net/intel/ixgbe/base/ixgbe_phy.h
diff --git a/drivers/net/ixgbe/base/ixgbe_type.h b/drivers/net/intel/ixgbe/base/ixgbe_type.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_type.h
rename to drivers/net/intel/ixgbe/base/ixgbe_type.h
diff --git a/drivers/net/ixgbe/base/ixgbe_type_e610.h b/drivers/net/intel/ixgbe/base/ixgbe_type_e610.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_type_e610.h
rename to drivers/net/intel/ixgbe/base/ixgbe_type_e610.h
diff --git a/drivers/net/ixgbe/base/ixgbe_vf.c b/drivers/net/intel/ixgbe/base/ixgbe_vf.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_vf.c
rename to drivers/net/intel/ixgbe/base/ixgbe_vf.c
diff --git a/drivers/net/ixgbe/base/ixgbe_vf.h b/drivers/net/intel/ixgbe/base/ixgbe_vf.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_vf.h
rename to drivers/net/intel/ixgbe/base/ixgbe_vf.h
diff --git a/drivers/net/ixgbe/base/ixgbe_x540.c b/drivers/net/intel/ixgbe/base/ixgbe_x540.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_x540.c
rename to drivers/net/intel/ixgbe/base/ixgbe_x540.c
diff --git a/drivers/net/ixgbe/base/ixgbe_x540.h b/drivers/net/intel/ixgbe/base/ixgbe_x540.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_x540.h
rename to drivers/net/intel/ixgbe/base/ixgbe_x540.h
diff --git a/drivers/net/ixgbe/base/ixgbe_x550.c b/drivers/net/intel/ixgbe/base/ixgbe_x550.c
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_x550.c
rename to drivers/net/intel/ixgbe/base/ixgbe_x550.c
diff --git a/drivers/net/ixgbe/base/ixgbe_x550.h b/drivers/net/intel/ixgbe/base/ixgbe_x550.h
similarity index 100%
rename from drivers/net/ixgbe/base/ixgbe_x550.h
rename to drivers/net/intel/ixgbe/base/ixgbe_x550.h
diff --git a/drivers/net/ixgbe/base/meson.build b/drivers/net/intel/ixgbe/base/meson.build
similarity index 100%
rename from drivers/net/ixgbe/base/meson.build
rename to drivers/net/intel/ixgbe/base/meson.build
diff --git a/drivers/net/ixgbe/ixgbe_82599_bypass.c b/drivers/net/intel/ixgbe/ixgbe_82599_bypass.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_82599_bypass.c
rename to drivers/net/intel/ixgbe/ixgbe_82599_bypass.c
diff --git a/drivers/net/ixgbe/ixgbe_bypass.c b/drivers/net/intel/ixgbe/ixgbe_bypass.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_bypass.c
rename to drivers/net/intel/ixgbe/ixgbe_bypass.c
diff --git a/drivers/net/ixgbe/ixgbe_bypass.h b/drivers/net/intel/ixgbe/ixgbe_bypass.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_bypass.h
rename to drivers/net/intel/ixgbe/ixgbe_bypass.h
diff --git a/drivers/net/ixgbe/ixgbe_bypass_api.h b/drivers/net/intel/ixgbe/ixgbe_bypass_api.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_bypass_api.h
rename to drivers/net/intel/ixgbe/ixgbe_bypass_api.h
diff --git a/drivers/net/ixgbe/ixgbe_bypass_defines.h b/drivers/net/intel/ixgbe/ixgbe_bypass_defines.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_bypass_defines.h
rename to drivers/net/intel/ixgbe/ixgbe_bypass_defines.h
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/intel/ixgbe/ixgbe_ethdev.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_ethdev.c
rename to drivers/net/intel/ixgbe/ixgbe_ethdev.c
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/intel/ixgbe/ixgbe_ethdev.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_ethdev.h
rename to drivers/net/intel/ixgbe/ixgbe_ethdev.h
diff --git a/drivers/net/ixgbe/ixgbe_fdir.c b/drivers/net/intel/ixgbe/ixgbe_fdir.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_fdir.c
rename to drivers/net/intel/ixgbe/ixgbe_fdir.c
diff --git a/drivers/net/ixgbe/ixgbe_flow.c b/drivers/net/intel/ixgbe/ixgbe_flow.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_flow.c
rename to drivers/net/intel/ixgbe/ixgbe_flow.c
diff --git a/drivers/net/ixgbe/ixgbe_ipsec.c b/drivers/net/intel/ixgbe/ixgbe_ipsec.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_ipsec.c
rename to drivers/net/intel/ixgbe/ixgbe_ipsec.c
diff --git a/drivers/net/ixgbe/ixgbe_ipsec.h b/drivers/net/intel/ixgbe/ixgbe_ipsec.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_ipsec.h
rename to drivers/net/intel/ixgbe/ixgbe_ipsec.h
diff --git a/drivers/net/ixgbe/ixgbe_logs.h b/drivers/net/intel/ixgbe/ixgbe_logs.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_logs.h
rename to drivers/net/intel/ixgbe/ixgbe_logs.h
diff --git a/drivers/net/ixgbe/ixgbe_pf.c b/drivers/net/intel/ixgbe/ixgbe_pf.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_pf.c
rename to drivers/net/intel/ixgbe/ixgbe_pf.c
diff --git a/drivers/net/ixgbe/ixgbe_recycle_mbufs_vec_common.c b/drivers/net/intel/ixgbe/ixgbe_recycle_mbufs_vec_common.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_recycle_mbufs_vec_common.c
rename to drivers/net/intel/ixgbe/ixgbe_recycle_mbufs_vec_common.c
diff --git a/drivers/net/ixgbe/ixgbe_regs.h b/drivers/net/intel/ixgbe/ixgbe_regs.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_regs.h
rename to drivers/net/intel/ixgbe/ixgbe_regs.h
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/intel/ixgbe/ixgbe_rxtx.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_rxtx.c
rename to drivers/net/intel/ixgbe/ixgbe_rxtx.c
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/intel/ixgbe/ixgbe_rxtx.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_rxtx.h
rename to drivers/net/intel/ixgbe/ixgbe_rxtx.h
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h b/drivers/net/intel/ixgbe/ixgbe_rxtx_vec_common.h
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
rename to drivers/net/intel/ixgbe/ixgbe_rxtx_vec_common.h
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/intel/ixgbe/ixgbe_rxtx_vec_neon.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
rename to drivers/net/intel/ixgbe/ixgbe_rxtx_vec_neon.c
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c b/drivers/net/intel/ixgbe/ixgbe_rxtx_vec_sse.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
rename to drivers/net/intel/ixgbe/ixgbe_rxtx_vec_sse.c
diff --git a/drivers/net/ixgbe/ixgbe_testpmd.c b/drivers/net/intel/ixgbe/ixgbe_testpmd.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_testpmd.c
rename to drivers/net/intel/ixgbe/ixgbe_testpmd.c
diff --git a/drivers/net/ixgbe/ixgbe_tm.c b/drivers/net/intel/ixgbe/ixgbe_tm.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_tm.c
rename to drivers/net/intel/ixgbe/ixgbe_tm.c
diff --git a/drivers/net/ixgbe/ixgbe_vf_representor.c b/drivers/net/intel/ixgbe/ixgbe_vf_representor.c
similarity index 100%
rename from drivers/net/ixgbe/ixgbe_vf_representor.c
rename to drivers/net/intel/ixgbe/ixgbe_vf_representor.c
diff --git a/drivers/net/ixgbe/meson.build b/drivers/net/intel/ixgbe/meson.build
similarity index 100%
rename from drivers/net/ixgbe/meson.build
rename to drivers/net/intel/ixgbe/meson.build
diff --git a/drivers/net/ixgbe/rte_pmd_ixgbe.c b/drivers/net/intel/ixgbe/rte_pmd_ixgbe.c
similarity index 100%
rename from drivers/net/ixgbe/rte_pmd_ixgbe.c
rename to drivers/net/intel/ixgbe/rte_pmd_ixgbe.c
diff --git a/drivers/net/ixgbe/rte_pmd_ixgbe.h b/drivers/net/intel/ixgbe/rte_pmd_ixgbe.h
similarity index 100%
rename from drivers/net/ixgbe/rte_pmd_ixgbe.h
rename to drivers/net/intel/ixgbe/rte_pmd_ixgbe.h
diff --git a/drivers/net/ixgbe/version.map b/drivers/net/intel/ixgbe/version.map
similarity index 100%
rename from drivers/net/ixgbe/version.map
rename to drivers/net/intel/ixgbe/version.map
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index dafd637ba4..02a3f5a0b6 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -13,28 +13,28 @@ drivers = [
         'bnxt',
         'bonding',
         'cnxk',
-        'cpfl',
         'cxgbe',
         'dpaa',
         'dpaa2',
-        'e1000',
         'ena',
         'enetc',
         'enetfec',
         'enic',
         'failsafe',
-        'fm10k',
         'gve',
         'hinic',
         'hns3',
-        'i40e',
-        'iavf',
-        'ice',
-        'idpf',
-        'igc',
+        'intel/cpfl',
+        'intel/e1000',
+        'intel/fm10k',
+        'intel/i40e',
+        'intel/iavf',
+        'intel/ice',
+        'intel/idpf',
+        'intel/igc',
+        'intel/ipn3ke',
+        'intel/ixgbe',
         'ionic',
-        'ipn3ke',
-        'ixgbe',
         'mana',
         'memif',
         'mlx4',
diff --git a/drivers/raw/ifpga/meson.build b/drivers/raw/ifpga/meson.build
index 20dea23206..444799cfb2 100644
--- a/drivers/raw/ifpga/meson.build
+++ b/drivers/raw/ifpga/meson.build
@@ -18,7 +18,5 @@ sources = files('ifpga_rawdev.c', 'rte_pmd_ifpga.c', 'afu_pmd_core.c',
     'afu_pmd_he_hssi.c')
 
 includes += include_directories('base')
-includes += include_directories('../../net/ipn3ke')
-includes += include_directories('../../net/i40e')
 
 headers = files('rte_pmd_ifpga.h')
diff --git a/usertools/dpdk-rss-flows.py b/usertools/dpdk-rss-flows.py
index e5e7185884..b60cd18e7b 100755
--- a/usertools/dpdk-rss-flows.py
+++ b/usertools/dpdk-rss-flows.py
@@ -213,7 +213,7 @@ def reta_size(self, num_queues: int) -> int:
         key=bytes(
             (
                 # fmt: off
-                # rss_intel_key, see drivers/net/ixgbe/ixgbe_rxtx.c
+                # rss_intel_key, see drivers/net/intel/ixgbe/ixgbe_rxtx.c
                 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
                 0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
                 0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
@@ -228,7 +228,7 @@ def reta_size(self, num_queues: int) -> int:
         key=bytes(
             (
                 # fmt: off
-                # rss_key_default, see drivers/net/i40e/i40e_ethdev.c
+                # rss_key_default, see drivers/net/intel/i40e/i40e_ethdev.c
                 # i40e is the only driver that takes 52 bytes keys
                 0x44, 0x39, 0x79, 0x6b, 0xb5, 0x4c, 0x50, 0x23,
                 0xb6, 0x75, 0xea, 0x5b, 0x12, 0x4f, 0x9f, 0x30,
-- 
2.43.0


^ permalink raw reply	[relevance 1%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  2025-01-24  3:52  3%                   ` Naga Harish K, S V
@ 2025-01-24 10:00  3%                     ` Shijith Thotton
  2025-01-29  5:04  0%                       ` Naga Harish K, S V
  0 siblings, 1 reply; 200+ results
From: Shijith Thotton @ 2025-01-24 10:00 UTC (permalink / raw)
  To: Naga Harish K, S V, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Jerin Jacob, Liang Ma, Mccarthy,
	Peter, Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly

>> >> >> >> >>> This RFC introduces a new API,
>> >> >> >> >>> rte_event_eth_rx_adapter_queues_add(),
>> >> >> >> >>> designed to enhance the flexibility of configuring multiple
>> >> >> >> >>> Rx queues in eventdev Rx adapter.
>> >> >> >> >>>
>> >> >> >> >>> The existing rte_event_eth_rx_adapter_queue_add() API
>> >> >> >> >>> supports adding multiple queues by specifying rx_queue_id =
>> >> >> >> >>> -1, but it lacks the ability to
>> >> >> >> >apply
>> >> >> >> >>> specific configurations to each of the added queues.
>> >> >> >> >>>
>> >> >> >> >>
>> >> >> >> >>The application can still use the existing
>> >> >> >> >>rte_event_eth_rx_adapter_queue_add() API in a loop with
>> >> >> >> >>different configurations for different queues.
>> >> >> >> >>
>> >> >> >> >>The proposed API is not enabling new features that cannot be
>> >> >> >> >>achieved with the existing API.
>> >> >> >> >>Adding new APIs without much usefulness causes unnecessary
>> >> >> >> >>complexity/confusion for users.
>> >> >> >> >>
>> >>
>> >> The eth_rx_adapter_queue_add eventdev PMD operation can be updated
>> to
>> >> support burst mode. Internally, both the new and existing APIs can
>> >> utilize this updated operation. This enables applications to use
>> >> either API and achieve
>> >the
>> >> same results while adding a single queue. For adding multiple RX
>> >> queues to
>> >the
>> >> adapter, the new API can be used as it is not supported by the old API.
>> >>
>> >
>> >Not all platforms implement the eventdev PMD operation for
>> >eth_rx_adapter_queue_add, so this does not apply to all platforms.
>> >
>>
>> Yes, but there are hardware PMDs that implement
>eth_rx_adapter_queue_add
>> op, and I am looking for a solution that works for both cases.
>>
>> The idea is to use the new eventdev PMD operation
>> (eth_rx_adapter_queues_add) within the
>> rte_event_eth_rx_adapter_queue_add() API. The parameters of this API can
>> be easily mapped to and supported by the new PMD operation.
>>
>
>This requires a change to the rte_event_eth_rx_adapter_queue_add() stable
>API parameters.
>This is an ABI breakage and may not be possible now.
>It requires changes to many current applications that are using the
>rte_event_eth_rx_adapter_queue_add() stable API.
>

What I meant by mapping was to retain the stable API parameters as they are.
Internally, the API can use the proposed eventdev PMD operation
(eth_rx_adapter_queues_add) without causing an ABI break, as shown below.

int rte_event_eth_rx_adapter_queue_add(uint8_t id, uint16_t eth_dev_id,
                int32_t rx_queue_id,
                const struct rte_event_eth_rx_adapter_queue_conf *conf) {
        if (rx_queue_id == -1)
                return (*dev->dev_ops->eth_rx_adapter_queues_add)(
                        dev, &rte_eth_devices[eth_dev_id], NULL,
                        conf, 0);
        else
                return (*dev->dev_ops->eth_rx_adapter_queues_add)(
                        dev, &rte_eth_devices[eth_dev_id], &rx_queue_id,
                        conf, 1);
}

With above change, old op (eth_rx_adapter_queue_add) can be removed as
both API (stable and proposed) will be using eth_rx_adapter_queues_add.

>> typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
>>     const struct rte_eventdev *dev,
>>     const struct rte_eth_dev *eth_dev,
>>     int32_t rx_queue_id[],
>>     const struct rte_event_eth_rx_adapter_queue_conf queue_conf[],
>>     uint16_t nb_rx_queues);
>>
>> With this, the old PMD op (eth_rx_adapter_queue_add) can be removed.
>>
>> >> >> >> >
>> >> >> >> >The new API was introduced because the existing API does not
>> >> >> >> >support adding multiple queues with specific configurations.
>> >> >> >> >It serves as a burst variant of the existing API, like many
>> >> >> >> >other APIs in
>> >> DPDK.
>> >> >> >> >
>> >> >> >
>> >> >> >The other burst APIs may be there for dataplane functionalities,
>> >> >> >but may not be for the control plane functionalities.
>> >> >> >
>> >> >>
>> >> >> rte_acl_add_rules() is an example of burst API in control path.
>> >> >>
>> >> >
>> >> >I mean, In general, burst APIs are for data-plane functions.
>> >> >This may be one of the rare cases where a burst API is in the control path.
>> >> >
>> >> >> >> >For better clarity, the API can be renamed to
>> >> >> >> >rte_event_eth_rx_adapter_queue_add_burst() if needed.
>> >> >> >> >
>> >> >> >> >In hardware, adding each queue individually incurs significant
>> >> >> >> >overheads, such as mailbox operations. A burst API helps to
>> >> >> >> >amortize this overhead. Since real- world applications often
>> >> >> >> >call the API with specific queue_ids, the burst API can
>> >> >> >> >provide considerable
>> >> benefits.
>> >> >> >> >Testing shows a 75% reduction in time when adding multiple
>> >> >> >> >queues to the RX adapter using the burst API on our platform.
>> >> >> >> >
>> >> >> >
>> >> >> > As batching helps for a particular hardware device, this may not
>> >> >> >be applicable for all platforms/cases.
>> >> >> >	Since queue_add is a control plane operation, latency may not
>be
>> >> >> >a concern.
>> >> >>
>> >> >> In certain use cases, these APIs can be considered semi-fast path.
>> >> >> For
>> >> >instance,
>> >> >> in an application that hotplugs a port on demand, configuring all
>> >> >> available queues simultaneously can significantly reduce latency.
>> >> >>
>> >> >
>> >> >As said earlier, this latency reduction (when trying to add multiple
>> >> >RX queues to the Event Ethernet Rx adapter) may not apply to all
>> >> platforms/cases.
>> >> >This API is not for configuring queues but for adding the queues to
>> >> >the RX adapter.
>> >> >
>> >> >> >How to specify a particular set(specific queue_ids) of rx_queues
>> >> >> >that has a non- zero start index with the new proposed API?
>> >> >>
>> >> >> In the proposed API,
>> >> >> int rte_event_eth_rx_adapter_queues_add(
>> >> >>                         uint8_t id, uint16_t eth_dev_id, int32_t rx_queue_id[],
>> >> >>                         const struct rte_event_eth_rx_adapter_queue_conf conf[],
>> >> >>                         uint16_t nb_rx_queues); rx_queues_id is an
>> >> >> array containing the receive queues ids, which can start from a
>> >> >> non-zero value. The array index is used solely to locate the
>> >> >> corresponding queue_conf. For example, rx_queues_id[i] will use
>conf[i].
>> >> >>
>> >> >
>> >> >Ok
>> >> >
>> >> >> >	Since this is still not possible with the proposed API, the
>> >> >> >existing queue_add API needs to be used with specific queue_ids
>> >> >> >and their configurations.
>> >> >> >
>> >> >> >> >I can modify the old API implementation to act as a wrapper
>> >> >> >> >around the burst API, with number of queues equal to 1. If
>> >> >> >> >concerns remain, we can explore deprecation as an alternative.
>> >> >> >> >
>> >> >> >>
>> >> >> >> Please let me know if you have any suggestions/feedback on what
>> >> >> >> I said above.
>> >> >> >
>> >> >> >Still feel the new proposed API can be avoided as it looks like a
>> >> >> >different combination of existing API instead of adding some new
>> >features.
>> >> >> >
>> >> >> >> If not, I can go ahead and send v1.
>> >> >> >>
>> >> >> >> >>> The proposed API, rte_event_eth_rx_adapter_queues_add,
>> >> >> >> >>> addresses this limitation by:
>> >> >> >> >>>
>> >> >> >> >>> - Enabling users to specify an array of rx_queue_id values
>> alongside
>> >> >> >> >>>   individual configurations for each queue.
>> >> >> >> >>>
>> >> >> >> >>> - Supporting a nb_rx_queues argument to define the number
>> >> >> >> >>> of queues
>> >> >> >to
>> >> >> >> >>>   configure. When set to 0, the API applies a common
>> >> >> >> >>> configuration
>> >> to
>> >> >> >> >>>   all queues, similar to the existing rx_queue_id = -1 behavior.
>> >> >> >> >>>
>> >> >> >> >>> This enhancement allows for more granular control when
>> >> >> >> >>> configuring
>> >> >> >> >multiple
>> >> >> >> >>> Rx queues. Additionally, the API can act as a replacement
>> >> >> >> >>> for the older API, offering both flexibility and improved
>> functionality.
>> >> >> >> >>>
>> >> >> >> >>> Signed-off-by: Shijith Thotton <sthotton@marvell.com>
>> >> >> >> >>> ---
>> >> >> >> >>>  lib/eventdev/eventdev_pmd.h             | 34
>> >> >> >> +++++++++++++++++++++++++
>> >> >> >> >>>  lib/eventdev/rte_event_eth_rx_adapter.h | 34
>> >> >> >> >>> +++++++++++++++++++++++++
>> >> >> >> >>>  2 files changed, 68 insertions(+)
>> >> >> >> >>>
>> >> >> >> >>> diff --git a/lib/eventdev/eventdev_pmd.h
>> >> >> >> >>> b/lib/eventdev/eventdev_pmd.h index
>> 36148f8d86..2e458a9779
>> >> >> >> 100644
>> >> >> >> >>> --- a/lib/eventdev/eventdev_pmd.h
>> >> >> >> >>> +++ b/lib/eventdev/eventdev_pmd.h
>> >> >> >> >>> @@ -25,6 +25,7 @@
>> >> >> >> >>>  #include <rte_mbuf_dyn.h>
>> >> >> >> >>>
>> >> >> >> >>>  #include "event_timer_adapter_pmd.h"
>> >> >> >> >>> +#include "rte_event_eth_rx_adapter.h"
>> >> >> >> >>>  #include "rte_eventdev.h"
>> >> >> >> >>>
>> >> >> >> >>>  #ifdef __cplusplus
>> >> >> >> >>> @@ -708,6 +709,37 @@ typedef int
>> >> >> >> >>> (*eventdev_eth_rx_adapter_queue_add_t)(
>> >> >> >> >>>  		int32_t rx_queue_id,
>> >> >> >> >>>  		const struct rte_event_eth_rx_adapter_queue_conf
>> >> >> >> >>> *queue_conf);
>> >> >> >> >>>
>> >> >> >> >>> +/**
>> >> >> >> >>> + * Add ethernet Rx queues to event device. This callback
>> >> >> >> >>> +is invoked if
>> >> >> >> >>> + * the caps returned from
>> >> >> >> >>> +rte_eventdev_eth_rx_adapter_caps_get(,
>> >> >> >> >>> +eth_port_id)
>> >> >> >> >>> + * has RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT
>set.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param dev
>> >> >> >> >>> + *   Event device pointer
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param eth_dev
>> >> >> >> >>> + *   Ethernet device pointer
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param rx_queue_id
>> >> >> >> >>> + *   Ethernet device receive queue index array
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param queue_conf
>> >> >> >> >>> + *   Additional configuration structure array
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param nb_rx_queues
>> >> >> >> >>> + *   Number of ethernet device receive queues
>> >> >> >> >>> + *
>> >> >> >> >>> + * @return
>> >> >> >> >>> + *   - 0: Success, ethernet receive queues added successfully.
>> >> >> >> >>> + *   - <0: Error code returned by the driver function.
>> >> >> >> >>> + */
>> >> >> >> >>> +typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
>> >> >> >> >>> +		const struct rte_eventdev *dev,
>> >> >> >> >>> +		const struct rte_eth_dev *eth_dev,
>> >> >> >> >>> +		int32_t rx_queue_id[],
>> >> >> >> >>> +		const struct rte_event_eth_rx_adapter_queue_conf
>> >> >> >> >>> queue_conf[],
>> >> >> >> >>> +		uint16_t nb_rx_queues);
>> >> >> >> >>> +
>> >> >> >> >>>  /**
>> >> >> >> >>>   * Delete ethernet Rx queues from event device. This
>> >> >> >> >>> callback is
>> >> >invoked
>> >> >> if
>> >> >> >> >>>   * the caps returned from
>> >> >> >> >>> eventdev_eth_rx_adapter_caps_get(,
>> >> >> >> >eth_port_id)
>> >> >> >> >>> @@ -1578,6 +1610,8 @@ struct eventdev_ops {
>> >> >> >> >>>  	/**< Get ethernet Rx adapter capabilities */
>> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_add_t
>> >> >eth_rx_adapter_queue_add;
>> >> >> >> >>>  	/**< Add Rx queues to ethernet Rx adapter */
>> >> >> >> >>> +	eventdev_eth_rx_adapter_queues_add_t
>> >> >> >> >>> eth_rx_adapter_queues_add;
>> >> >> >> >>> +	/**< Add Rx queues to ethernet Rx adapter */
>> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_del_t
>> >> >eth_rx_adapter_queue_del;
>> >> >> >> >>>  	/**< Delete Rx queues from ethernet Rx adapter */
>> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_conf_get_t
>> >> >> >> >>> eth_rx_adapter_queue_conf_get; diff --git
>> >> >> >> >>> a/lib/eventdev/rte_event_eth_rx_adapter.h
>> >> >> >> >>> b/lib/eventdev/rte_event_eth_rx_adapter.h
>> >> >> >> >>> index 9237e198a7..9a5c560b67 100644
>> >> >> >> >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.h
>> >> >> >> >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.h
>> >> >> >> >>> @@ -553,6 +553,40 @@ int
>> >> >> >> rte_event_eth_rx_adapter_queue_add(uint8_t
>> >> >> >> >>> id,
>> >> >> >> >>>  			int32_t rx_queue_id,
>> >> >> >> >>>  			const struct
>> >> >rte_event_eth_rx_adapter_queue_conf
>> >> >> >> >>> *conf);
>> >> >> >> >>>
>> >> >> >> >>> +/**
>> >> >> >> >>> + * Add multiple receive queues to an event adapter.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param id
>> >> >> >> >>> + *  Adapter identifier.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param eth_dev_id
>> >> >> >> >>> + *  Port identifier of Ethernet device.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param rx_queue_id
>> >> >> >> >>> + *  Array of Ethernet device receive queue indices.
>> >> >> >> >>> + *  If nb_rx_queues is 0, then rx_queue_id is ignored.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param conf
>> >> >> >> >>> + *  Array of additional configuration structures of type
>> >> >> >> >>> + *  *rte_event_eth_rx_adapter_queue_conf*. conf[i] is used
>> >> >> >> >>> +for
>> >> >> >> >>> rx_queue_id[i].
>> >> >> >> >>> + *  If nb_rx_queues is 0, then conf[0] is used for all Rx queues.
>> >> >> >> >>> + *
>> >> >> >> >>> + * @param nb_rx_queues
>> >> >> >> >>> + *  Number of receive queues to add.
>> >> >> >> >>> + *  If nb_rx_queues is 0, then all Rx queues configured
>> >> >> >> >>> +for
>> >> >> >> >>> + *  the device are added with the same configuration in conf[0].
>> >> >> >> >>> + * @see RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ
>> >> >> >> >>> + *
>> >> >> >> >>> + * @return
>> >> >> >> >>> + *  - 0: Success, Receive queues added correctly.
>> >> >> >> >>> + *  - <0: Error code on failure.
>> >> >> >> >>> + */
>> >> >> >> >>> +__rte_experimental
>> >> >> >> >>> +int rte_event_eth_rx_adapter_queues_add(
>> >> >> >> >>> +			uint8_t id, uint16_t eth_dev_id, int32_t
>> >> >> >> >>> rx_queue_id[],
>> >> >> >> >>> +			const struct
>> >> >> rte_event_eth_rx_adapter_queue_conf
>> >> >> >> >>> conf[],
>> >> >> >> >>> +			uint16_t nb_rx_queues);
>> >> >> >> >>> +
>> >> >> >> >>>  /**
>> >> >> >> >>>   * Delete receive queue from an event adapter.
>> >> >> >> >>>   *
>> >> >> >> >>> --
>> >> >> >> >>> 2.25.1


^ permalink raw reply	[relevance 3%]

* RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx queues
  @ 2025-01-24  3:52  3%                   ` Naga Harish K, S V
  2025-01-24 10:00  3%                     ` Shijith Thotton
  0 siblings, 1 reply; 200+ results
From: Naga Harish K, S V @ 2025-01-24  3:52 UTC (permalink / raw)
  To: Shijith Thotton, dev
  Cc: Pavan Nikhilesh Bhagavatula, Pathak, Pravin, Hemant Agrawal,
	Sachin Saxena, Mattias Rönnblom, Jerin Jacob, Liang Ma, Mccarthy,
	Peter, Van Haaren, Harry, Carrillo, Erik G, Gujjar, Abhinandan S,
	Amit Prakash Shukla, Burakov, Anatoly



> -----Original Message-----
> From: Shijith Thotton <sthotton@marvell.com>
> Sent: Wednesday, January 22, 2025 7:13 PM
> To: Naga Harish K, S V <s.v.naga.harish.k@intel.com>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Pathak,
> Pravin <pravin.pathak@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Sachin Saxena <sachin.saxena@nxp.com>;
> Mattias Rönnblom <mattias.ronnblom@ericsson.com>; Jerin Jacob
> <jerinj@marvell.com>; Liang Ma <liangma@liangbit.com>; Mccarthy, Peter
> <peter.mccarthy@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Amit Prakash Shukla
> <amitprakashs@marvell.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>
> Subject: RE: [RFC PATCH] eventdev: adapter API to configure multiple Rx
> queues
> 
> >> >> >> >>> This RFC introduces a new API,
> >> >> >> >>> rte_event_eth_rx_adapter_queues_add(),
> >> >> >> >>> designed to enhance the flexibility of configuring multiple
> >> >> >> >>> Rx queues in eventdev Rx adapter.
> >> >> >> >>>
> >> >> >> >>> The existing rte_event_eth_rx_adapter_queue_add() API
> >> >> >> >>> supports adding multiple queues by specifying rx_queue_id =
> >> >> >> >>> -1, but it lacks the ability to
> >> >> >> >apply
> >> >> >> >>> specific configurations to each of the added queues.
> >> >> >> >>>
> >> >> >> >>
> >> >> >> >>The application can still use the existing
> >> >> >> >>rte_event_eth_rx_adapter_queue_add() API in a loop with
> >> >> >> >>different configurations for different queues.
> >> >> >> >>
> >> >> >> >>The proposed API is not enabling new features that cannot be
> >> >> >> >>achieved with the existing API.
> >> >> >> >>Adding new APIs without much usefulness causes unnecessary
> >> >> >> >>complexity/confusion for users.
> >> >> >> >>
> >>
> >> The eth_rx_adapter_queue_add eventdev PMD operation can be updated
> to
> >> support burst mode. Internally, both the new and existing APIs can
> >> utilize this updated operation. This enables applications to use
> >> either API and achieve
> >the
> >> same results while adding a single queue. For adding multiple RX
> >> queues to
> >the
> >> adapter, the new API can be used as it is not supported by the old API.
> >>
> >
> >Not all platforms implement the eventdev PMD operation for
> >eth_rx_adapter_queue_add, so this does not apply to all platforms.
> >
> 
> Yes, but there are hardware PMDs that implement eth_rx_adapter_queue_add
> op, and I am looking for a solution that works for both cases.
> 
> The idea is to use the new eventdev PMD operation
> (eth_rx_adapter_queues_add) within the
> rte_event_eth_rx_adapter_queue_add() API. The parameters of this API can
> be easily mapped to and supported by the new PMD operation.
> 

This requires a change to the rte_event_eth_rx_adapter_queue_add() stable API parameters.
This is an ABI breakage and may not be possible now.
It requires changes to many current applications that are using the rte_event_eth_rx_adapter_queue_add() stable API.

> typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
>     const struct rte_eventdev *dev,
>     const struct rte_eth_dev *eth_dev,
>     int32_t rx_queue_id[],
>     const struct rte_event_eth_rx_adapter_queue_conf queue_conf[],
>     uint16_t nb_rx_queues);
> 
> With this, the old PMD op (eth_rx_adapter_queue_add) can be removed.
> 
> >> >> >> >
> >> >> >> >The new API was introduced because the existing API does not
> >> >> >> >support adding multiple queues with specific configurations.
> >> >> >> >It serves as a burst variant of the existing API, like many
> >> >> >> >other APIs in
> >> DPDK.
> >> >> >> >
> >> >> >
> >> >> >The other burst APIs may be there for dataplane functionalities,
> >> >> >but may not be for the control plane functionalities.
> >> >> >
> >> >>
> >> >> rte_acl_add_rules() is an example of burst API in control path.
> >> >>
> >> >
> >> >I mean, In general, burst APIs are for data-plane functions.
> >> >This may be one of the rare cases where a burst API is in the control path.
> >> >
> >> >> >> >For better clarity, the API can be renamed to
> >> >> >> >rte_event_eth_rx_adapter_queue_add_burst() if needed.
> >> >> >> >
> >> >> >> >In hardware, adding each queue individually incurs significant
> >> >> >> >overheads, such as mailbox operations. A burst API helps to
> >> >> >> >amortize this overhead. Since real- world applications often
> >> >> >> >call the API with specific queue_ids, the burst API can
> >> >> >> >provide considerable
> >> benefits.
> >> >> >> >Testing shows a 75% reduction in time when adding multiple
> >> >> >> >queues to the RX adapter using the burst API on our platform.
> >> >> >> >
> >> >> >
> >> >> > As batching helps for a particular hardware device, this may not
> >> >> >be applicable for all platforms/cases.
> >> >> >	Since queue_add is a control plane operation, latency may not be
> >> >> >a concern.
> >> >>
> >> >> In certain use cases, these APIs can be considered semi-fast path.
> >> >> For
> >> >instance,
> >> >> in an application that hotplugs a port on demand, configuring all
> >> >> available queues simultaneously can significantly reduce latency.
> >> >>
> >> >
> >> >As said earlier, this latency reduction (when trying to add multiple
> >> >RX queues to the Event Ethernet Rx adapter) may not apply to all
> >> platforms/cases.
> >> >This API is not for configuring queues but for adding the queues to
> >> >the RX adapter.
> >> >
> >> >> >How to specify a particular set(specific queue_ids) of rx_queues
> >> >> >that has a non- zero start index with the new proposed API?
> >> >>
> >> >> In the proposed API,
> >> >> int rte_event_eth_rx_adapter_queues_add(
> >> >>                         uint8_t id, uint16_t eth_dev_id, int32_t rx_queue_id[],
> >> >>                         const struct rte_event_eth_rx_adapter_queue_conf conf[],
> >> >>                         uint16_t nb_rx_queues); rx_queues_id is an
> >> >> array containing the receive queues ids, which can start from a
> >> >> non-zero value. The array index is used solely to locate the
> >> >> corresponding queue_conf. For example, rx_queues_id[i] will use conf[i].
> >> >>
> >> >
> >> >Ok
> >> >
> >> >> >	Since this is still not possible with the proposed API, the
> >> >> >existing queue_add API needs to be used with specific queue_ids
> >> >> >and their configurations.
> >> >> >
> >> >> >> >I can modify the old API implementation to act as a wrapper
> >> >> >> >around the burst API, with number of queues equal to 1. If
> >> >> >> >concerns remain, we can explore deprecation as an alternative.
> >> >> >> >
> >> >> >>
> >> >> >> Please let me know if you have any suggestions/feedback on what
> >> >> >> I said above.
> >> >> >
> >> >> >Still feel the new proposed API can be avoided as it looks like a
> >> >> >different combination of existing API instead of adding some new
> >features.
> >> >> >
> >> >> >> If not, I can go ahead and send v1.
> >> >> >>
> >> >> >> >>> The proposed API, rte_event_eth_rx_adapter_queues_add,
> >> >> >> >>> addresses this limitation by:
> >> >> >> >>>
> >> >> >> >>> - Enabling users to specify an array of rx_queue_id values
> alongside
> >> >> >> >>>   individual configurations for each queue.
> >> >> >> >>>
> >> >> >> >>> - Supporting a nb_rx_queues argument to define the number
> >> >> >> >>> of queues
> >> >> >to
> >> >> >> >>>   configure. When set to 0, the API applies a common
> >> >> >> >>> configuration
> >> to
> >> >> >> >>>   all queues, similar to the existing rx_queue_id = -1 behavior.
> >> >> >> >>>
> >> >> >> >>> This enhancement allows for more granular control when
> >> >> >> >>> configuring
> >> >> >> >multiple
> >> >> >> >>> Rx queues. Additionally, the API can act as a replacement
> >> >> >> >>> for the older API, offering both flexibility and improved
> functionality.
> >> >> >> >>>
> >> >> >> >>> Signed-off-by: Shijith Thotton <sthotton@marvell.com>
> >> >> >> >>> ---
> >> >> >> >>>  lib/eventdev/eventdev_pmd.h             | 34
> >> >> >> +++++++++++++++++++++++++
> >> >> >> >>>  lib/eventdev/rte_event_eth_rx_adapter.h | 34
> >> >> >> >>> +++++++++++++++++++++++++
> >> >> >> >>>  2 files changed, 68 insertions(+)
> >> >> >> >>>
> >> >> >> >>> diff --git a/lib/eventdev/eventdev_pmd.h
> >> >> >> >>> b/lib/eventdev/eventdev_pmd.h index
> 36148f8d86..2e458a9779
> >> >> >> 100644
> >> >> >> >>> --- a/lib/eventdev/eventdev_pmd.h
> >> >> >> >>> +++ b/lib/eventdev/eventdev_pmd.h
> >> >> >> >>> @@ -25,6 +25,7 @@
> >> >> >> >>>  #include <rte_mbuf_dyn.h>
> >> >> >> >>>
> >> >> >> >>>  #include "event_timer_adapter_pmd.h"
> >> >> >> >>> +#include "rte_event_eth_rx_adapter.h"
> >> >> >> >>>  #include "rte_eventdev.h"
> >> >> >> >>>
> >> >> >> >>>  #ifdef __cplusplus
> >> >> >> >>> @@ -708,6 +709,37 @@ typedef int
> >> >> >> >>> (*eventdev_eth_rx_adapter_queue_add_t)(
> >> >> >> >>>  		int32_t rx_queue_id,
> >> >> >> >>>  		const struct rte_event_eth_rx_adapter_queue_conf
> >> >> >> >>> *queue_conf);
> >> >> >> >>>
> >> >> >> >>> +/**
> >> >> >> >>> + * Add ethernet Rx queues to event device. This callback
> >> >> >> >>> +is invoked if
> >> >> >> >>> + * the caps returned from
> >> >> >> >>> +rte_eventdev_eth_rx_adapter_caps_get(,
> >> >> >> >>> +eth_port_id)
> >> >> >> >>> + * has RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT set.
> >> >> >> >>> + *
> >> >> >> >>> + * @param dev
> >> >> >> >>> + *   Event device pointer
> >> >> >> >>> + *
> >> >> >> >>> + * @param eth_dev
> >> >> >> >>> + *   Ethernet device pointer
> >> >> >> >>> + *
> >> >> >> >>> + * @param rx_queue_id
> >> >> >> >>> + *   Ethernet device receive queue index array
> >> >> >> >>> + *
> >> >> >> >>> + * @param queue_conf
> >> >> >> >>> + *   Additional configuration structure array
> >> >> >> >>> + *
> >> >> >> >>> + * @param nb_rx_queues
> >> >> >> >>> + *   Number of ethernet device receive queues
> >> >> >> >>> + *
> >> >> >> >>> + * @return
> >> >> >> >>> + *   - 0: Success, ethernet receive queues added successfully.
> >> >> >> >>> + *   - <0: Error code returned by the driver function.
> >> >> >> >>> + */
> >> >> >> >>> +typedef int (*eventdev_eth_rx_adapter_queues_add_t)(
> >> >> >> >>> +		const struct rte_eventdev *dev,
> >> >> >> >>> +		const struct rte_eth_dev *eth_dev,
> >> >> >> >>> +		int32_t rx_queue_id[],
> >> >> >> >>> +		const struct rte_event_eth_rx_adapter_queue_conf
> >> >> >> >>> queue_conf[],
> >> >> >> >>> +		uint16_t nb_rx_queues);
> >> >> >> >>> +
> >> >> >> >>>  /**
> >> >> >> >>>   * Delete ethernet Rx queues from event device. This
> >> >> >> >>> callback is
> >> >invoked
> >> >> if
> >> >> >> >>>   * the caps returned from
> >> >> >> >>> eventdev_eth_rx_adapter_caps_get(,
> >> >> >> >eth_port_id)
> >> >> >> >>> @@ -1578,6 +1610,8 @@ struct eventdev_ops {
> >> >> >> >>>  	/**< Get ethernet Rx adapter capabilities */
> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_add_t
> >> >eth_rx_adapter_queue_add;
> >> >> >> >>>  	/**< Add Rx queues to ethernet Rx adapter */
> >> >> >> >>> +	eventdev_eth_rx_adapter_queues_add_t
> >> >> >> >>> eth_rx_adapter_queues_add;
> >> >> >> >>> +	/**< Add Rx queues to ethernet Rx adapter */
> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_del_t
> >> >eth_rx_adapter_queue_del;
> >> >> >> >>>  	/**< Delete Rx queues from ethernet Rx adapter */
> >> >> >> >>>  	eventdev_eth_rx_adapter_queue_conf_get_t
> >> >> >> >>> eth_rx_adapter_queue_conf_get; diff --git
> >> >> >> >>> a/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >>> b/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >>> index 9237e198a7..9a5c560b67 100644
> >> >> >> >>> --- a/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >>> +++ b/lib/eventdev/rte_event_eth_rx_adapter.h
> >> >> >> >>> @@ -553,6 +553,40 @@ int
> >> >> >> rte_event_eth_rx_adapter_queue_add(uint8_t
> >> >> >> >>> id,
> >> >> >> >>>  			int32_t rx_queue_id,
> >> >> >> >>>  			const struct
> >> >rte_event_eth_rx_adapter_queue_conf
> >> >> >> >>> *conf);
> >> >> >> >>>
> >> >> >> >>> +/**
> >> >> >> >>> + * Add multiple receive queues to an event adapter.
> >> >> >> >>> + *
> >> >> >> >>> + * @param id
> >> >> >> >>> + *  Adapter identifier.
> >> >> >> >>> + *
> >> >> >> >>> + * @param eth_dev_id
> >> >> >> >>> + *  Port identifier of Ethernet device.
> >> >> >> >>> + *
> >> >> >> >>> + * @param rx_queue_id
> >> >> >> >>> + *  Array of Ethernet device receive queue indices.
> >> >> >> >>> + *  If nb_rx_queues is 0, then rx_queue_id is ignored.
> >> >> >> >>> + *
> >> >> >> >>> + * @param conf
> >> >> >> >>> + *  Array of additional configuration structures of type
> >> >> >> >>> + *  *rte_event_eth_rx_adapter_queue_conf*. conf[i] is used
> >> >> >> >>> +for
> >> >> >> >>> rx_queue_id[i].
> >> >> >> >>> + *  If nb_rx_queues is 0, then conf[0] is used for all Rx queues.
> >> >> >> >>> + *
> >> >> >> >>> + * @param nb_rx_queues
> >> >> >> >>> + *  Number of receive queues to add.
> >> >> >> >>> + *  If nb_rx_queues is 0, then all Rx queues configured
> >> >> >> >>> +for
> >> >> >> >>> + *  the device are added with the same configuration in conf[0].
> >> >> >> >>> + * @see RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ
> >> >> >> >>> + *
> >> >> >> >>> + * @return
> >> >> >> >>> + *  - 0: Success, Receive queues added correctly.
> >> >> >> >>> + *  - <0: Error code on failure.
> >> >> >> >>> + */
> >> >> >> >>> +__rte_experimental
> >> >> >> >>> +int rte_event_eth_rx_adapter_queues_add(
> >> >> >> >>> +			uint8_t id, uint16_t eth_dev_id, int32_t
> >> >> >> >>> rx_queue_id[],
> >> >> >> >>> +			const struct
> >> >> rte_event_eth_rx_adapter_queue_conf
> >> >> >> >>> conf[],
> >> >> >> >>> +			uint16_t nb_rx_queues);
> >> >> >> >>> +
> >> >> >> >>>  /**
> >> >> >> >>>   * Delete receive queue from an event adapter.
> >> >> >> >>>   *
> >> >> >> >>> --
> >> >> >> >>> 2.25.1


^ permalink raw reply	[relevance 3%]

* [PATCH v6] graph: mcore: optimize graph search
  2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
  2024-12-16 14:49  4%           ` David Marchand
@ 2025-01-20 14:36  4%           ` Huichao Cai
  2025-02-06  2:53 11%           ` [PATCH v7 1/1] " Huichao Cai
  2 siblings, 0 replies; 200+ results
From: Huichao Cai @ 2025-01-20 14:36 UTC (permalink / raw)
  To: thomas; +Cc: dev, jerinj, kirankumark, ndabilpuram, yanzhirun_163

Hi, Thomas
I tested this patch locally (see below) and it can suppress this error (the failed GitHub build).
Is the difference in CI results from patchwork due to different versions of abidiff?
My abidiff version:
[root@localhost dpdk.chc1]# abidiff --version
abidiff: 1.6.0
abidiff version 2.6.0 is not currently available in my local environment...

The following is the testing process and results:

==========When not adding the [suppress_type] field==========
[root@localhost dpdk.chc1]# abidiff --suppr ./devtools/libabigail.abignore --no-added-syms --headers-dir1 /tmp/v24.11/build-gcc-shared/usr/local/include --headers-dir2 ./build-gcc-shared/install/usr/local/include /tmp/v24.11/build-gcc-shared/usr/local/lib64/librte_graph.so.25.0 ./build-gcc-shared/install/usr/local/lib64/librte_graph.so.25.1
Functions changes summary: 0 Removed, 1 Changed (5 filtered out), 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 function with some indirect sub-type change:

  [C]'function rte_node_t __rte_node_register(const rte_node_register*)' at node.c:58:1 has some indirect sub-type changes:
    parameter 1 of type 'const rte_node_register*' has sub-type changes:
      in pointed to type 'const rte_node_register':
        in unqualified underlying type 'struct rte_node_register' at rte_graph.h:482:1:
          type size hasn't changed
          1 data member changes (2 filtered):
           type of 'rte_node_fini_t rte_node_register::fini' changed:
             underlying type 'void (const rte_graph*, rte_node*)*' changed:
               in pointed to type 'function type void (const rte_graph*, rte_node*)':
                 parameter 2 of type 'rte_node*' has sub-type changes:
                   in pointed to type 'struct rte_node' at rte_graph_worker_common.h:92:1:
                     type size hasn't changed
                     no data member change (1 filtered);
                     1 data member change:
                      anonymous data member at offset 1536 (in bits) changed from:
                        union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail;} dispatch;}
                      to:
                        union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail; rte_graph* graph;} dispatch;}

==========When adding the [suppress_type] field==========
[root@localhost devtools]# cat libabigail.abignore 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Core suppression rules: DO NOT TOUCH ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

[suppress_function]
        symbol_version = EXPERIMENTAL
[suppress_variable]
        symbol_version = EXPERIMENTAL

[suppress_function]
        symbol_version = INTERNAL
[suppress_variable]
        symbol_version = INTERNAL

; Ignore generated PMD information strings
[suppress_variable]
        name_regexp = _pmd_info$

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Special rules to skip libraries ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; This is not a libabigail rule (see check-abi.sh).
; This is used for driver removal and other special cases like mlx glue libs.
;
; SKIP_LIBRARY=librte_common_mlx5_glue
; SKIP_LIBRARY=librte_net_mlx4_glue

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Experimental APIs exceptions ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Temporary exceptions till next major ABI version ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[suppress_type]
       name = rte_node
       has_size_change = no
       has_data_member_inserted_between =
{offset_of(total_sched_fail), offset_of(xstat_off)}

[root@localhost dpdk.chc1]# abidiff --suppr ./devtools/libabigail.abignore --no-added-syms --headers-dir1 /tmp/v24.11/build-gcc-shared/usr/local/include --headers-dir2 ./build-gcc-shared/install/usr/local/include /tmp/v24.11/build-gcc-shared/usr/local/lib64/librte_graph.so.25.0 ./build-gcc-shared/install/usr/local/lib64/librte_graph.so.25.1
Functions changes summary: 0 Removed, 0 Changed (6 filtered out), 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable




^ permalink raw reply	[relevance 4%]

* [PATCH v8] app/testpmd: add attach and detach port for multiple process
       [not found]     <20220825024425.10534-1-lihuisong@huawei.com>
  @ 2025-01-20  6:42  2% ` Huisong Li
  1 sibling, 0 replies; 200+ results
From: Huisong Li @ 2025-01-20  6:42 UTC (permalink / raw)
  To: thomas, stephen, Aman Singh
  Cc: dev, ferruh.yigit, fengchengwen, liuyonglong, lihuisong

The port information needs to be updated due to attaching and detaching
ports. Currently, this is done in the same thread as removing or probing
the device, which doesn't support attaching and detaching a device in
multiple processes.

If this operation is performed in one process, the other processes can
receive a 'new' or 'destroy' event. So we can move updating the port
information to the event callbacks to support attaching and detaching a
port in both the primary and secondary processes.

Note: the reason for adding an alarm callback on the 'destroy' event is
that the ethdev state is changed from 'ATTACHED' to 'UNUSED' only after
the event callback has finished. But the remove_invalid_ports() function
removes an invalid port only if the ethdev state is 'UNUSED'. Without the
alarm callback, the detached port information could not be removed.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 -v8:
   #1 remove other patches because they have been clarified in another
      patchset[1][2].
   #2 move the configuring and querying the port to start_port() because
      they are not approprate in new event callback.
 -v7: fix conflicts
 -v6: adjust rte_eth_dev_is_used position based on alphabetical order
      in version.map
 -v5: move 'ALLOCATED' state to the back of 'REMOVED' to avoid abi break.
 -v4: fix a misspelling. 
 -v3:
   #1 merge patch 1/6 and patch 2/6 into patch 1/5, and add modification
      for other bus type.
   #2 add a RTE_ETH_DEV_ALLOCATED state in rte_eth_dev_state to resolve
      the problem in patch 2/5.
 -v2: resend due to an unexplained CI failure.

[1] https://patches.dpdk.org/project/dpdk/cover/20250113025521.32703-1-lihuisong@huawei.com/
[2] https://patches.dpdk.org/project/dpdk/cover/20250116114034.9858-1-lihuisong@huawei.com/

---
 app/test-pmd/testpmd.c | 68 +++++++++++++++++++++++++++++++-----------
 1 file changed, 51 insertions(+), 17 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index ac654048df..e47d480205 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2895,6 +2895,9 @@ start_port(portid_t pid)
 		at_least_one_port_exist = true;
 
 		port = &ports[pi];
+		if (port->need_setup)
+			setup_attached_port(pi);
+
 		if (port->port_status == RTE_PORT_STOPPED) {
 			port->port_status = RTE_PORT_HANDLING;
 			all_ports_already_started = false;
@@ -3242,6 +3245,7 @@ remove_invalid_ports(void)
 	remove_invalid_ports_in(ports_ids, &nb_ports);
 	remove_invalid_ports_in(fwd_ports_ids, &nb_fwd_ports);
 	nb_cfg_ports = nb_fwd_ports;
+	printf("Now total ports is %d\n", nb_ports);
 }
 
 static void
@@ -3414,14 +3418,11 @@ attach_port(char *identifier)
 		return;
 	}
 
-	/* first attach mode: event */
+	/* First attach mode: event
+	 * New port flag is updated on RTE_ETH_EVENT_NEW event
+	 */
 	if (setup_on_probe_event) {
-		/* new ports are detected on RTE_ETH_EVENT_NEW event */
-		for (pi = 0; pi < RTE_MAX_ETHPORTS; pi++)
-			if (ports[pi].port_status == RTE_PORT_HANDLING &&
-					ports[pi].need_setup != 0)
-				setup_attached_port(pi);
-		return;
+		goto out;
 	}
 
 	/* second attach mode: iterator */
@@ -3431,6 +3432,9 @@ attach_port(char *identifier)
 			continue; /* port was already attached before */
 		setup_attached_port(pi);
 	}
+out:
+	printf("Port %s is attached.\n", identifier);
+	printf("Done\n");
 }
 
 static void
@@ -3450,14 +3454,8 @@ setup_attached_port(portid_t pi)
 			"Error during enabling promiscuous mode for port %u: %s - ignore\n",
 			pi, rte_strerror(-ret));
 
-	ports_ids[nb_ports++] = pi;
-	fwd_ports_ids[nb_fwd_ports++] = pi;
-	nb_cfg_ports = nb_fwd_ports;
 	ports[pi].need_setup = 0;
 	ports[pi].port_status = RTE_PORT_STOPPED;
-
-	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
-	printf("Done\n");
 }
 
 static void
@@ -3487,10 +3485,8 @@ detach_device(struct rte_device *dev)
 		TESTPMD_LOG(ERR, "Failed to detach device %s\n", rte_dev_name(dev));
 		return;
 	}
-	remove_invalid_ports();
 
 	printf("Device is detached\n");
-	printf("Now total ports is %d\n", nb_ports);
 	printf("Done\n");
 	return;
 }
@@ -3722,7 +3718,25 @@ rmv_port_callback(void *arg)
 		struct rte_device *device = dev_info.device;
 		close_port(port_id);
 		detach_device(device); /* might be already removed or have more ports */
+		remove_invalid_ports();
+	}
+	if (need_to_start)
+		start_packet_forwarding(0);
+}
+
+static void
+remove_invalid_ports_callback(void *arg)
+{
+	portid_t port_id = (intptr_t)arg;
+	int need_to_start = 0;
+
+	if (!test_done && port_is_forwarding(port_id)) {
+		need_to_start = 1;
+		stop_packet_forwarding();
 	}
+
+	remove_invalid_ports();
+
 	if (need_to_start)
 		start_packet_forwarding(0);
 }
@@ -3748,8 +3762,19 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
 
 	switch (type) {
 	case RTE_ETH_EVENT_NEW:
-		ports[port_id].need_setup = 1;
-		ports[port_id].port_status = RTE_PORT_HANDLING;
+		/* The port in ports_id and fwd_ports_ids is always valid
+		 * from index 0 ~ (nb_ports - 1) due to updating their
+		 * position when one port is detached or removed.
+		 */
+		ports_ids[nb_ports++] = port_id;
+		fwd_ports_ids[nb_fwd_ports++] = port_id;
+		nb_cfg_ports = nb_fwd_ports;
+		printf("Port %d is probed. Now total ports is %d\n", port_id, nb_ports);
+
+		if (setup_on_probe_event) {
+			ports[port_id].need_setup = 1;
+			ports[port_id].port_status = RTE_PORT_HANDLING;
+		}
 		break;
 	case RTE_ETH_EVENT_INTR_RMV:
 		if (port_id_is_invalid(port_id, DISABLED_WARN))
@@ -3762,6 +3787,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
 	case RTE_ETH_EVENT_DESTROY:
 		ports[port_id].port_status = RTE_PORT_CLOSED;
 		printf("Port %u is closed\n", port_id);
+		/*
+		 * Defer to remove port id due to the reason that the ethdev
+		 * state is changed from 'ATTACHED' to 'UNUSED' only after the
+		 * event callback finished. Otherwise this port id can not be
+		 * removed.
+		 */
+		if (rte_eal_alarm_set(100000, remove_invalid_ports_callback,
+				      (void *)(intptr_t)port_id))
+			fprintf(stderr, "Could not set up deferred task to remove this port id.\n");
 		break;
 	case RTE_ETH_EVENT_RX_AVAIL_THRESH: {
 		uint16_t rxq_id;
-- 
2.22.0


^ permalink raw reply	[relevance 2%]

* Re: [PATCH v2] ring: add the second version of the RTS interface
  2025-01-08  1:41  3%   ` Huichao Cai
@ 2025-01-14 15:04  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2025-01-14 15:04 UTC (permalink / raw)
  To: Huichao Cai; +Cc: dev, honnappa.nagarahalli, konstantin.v.ananyev

08/01/2025 02:41, Huichao Cai:
> Hi, Thomas
>     This patch adds a field to the ABI structure. I have added the suppress_type
> field in the file libabigail.abignore, but "ci/github-robot: Build" still reported
> an error. Could you please advise on how to fill in the suppress_type field?

You must check locally and see what happens when you add some suppressions.

You will find documentation here:
https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppression-specifications



^ permalink raw reply	[relevance 0%]

* [PATCH v1 2/2] ethdev: fix skip valid port in probing callback
  @ 2025-01-13  2:55  2% ` Huisong Li
  0 siblings, 0 replies; 200+ results
From: Huisong Li @ 2025-01-13  2:55 UTC (permalink / raw)
  To: dev, stephen, thomas, ferruh.yigit, Ajit Khaparde, Somnath Kotur,
	Praveen Shetty, Andrew Boyer, Dariusz Sosnowski,
	Viacheslav Ovsiienko, Bing Zhao, Ori Kam, Suanming Mou,
	Matan Azrad, Chaoyong He, Andrew Rybchenko
  Cc: fengchengwen, liuyonglong, lihuisong

The event callback in an application may use the macro RTE_ETH_FOREACH_DEV
to iterate over all enabled ports to do something (like verifying the port
id validity) when receiving a probing event. If the ethdev state of a port
is not RTE_ETH_DEV_UNUSED, this port is considered a valid port.

However, this state is set to RTE_ETH_DEV_ATTACHED after pushing the
probing event. It means that the probing callback will skip this port. But
this assignment cannot be moved before the probing notification. See
commit be8cd210379a ("ethdev: fix port probing notification")

So this patch has to add a new state, RTE_ETH_DEV_ALLOCATED. Set the ethdev
state to RTE_ETH_DEV_ALLOCATED before pushing the probing event and set it
to RTE_ETH_DEV_ATTACHED once definitely probed. A port is valid if its
device state is 'ALLOCATED' or 'ATTACHED'.

In addition, the new state has to be placed after 'REMOVED' to avoid an ABI
break. Fortunately, this ethdev state is internal and applications cannot
access it directly. So this patch encapsulates an API, rte_eth_dev_is_used,
for ethdev or PMDs to call, eliminating the need to compare against this
state enum value directly.

Fixes: be8cd210379a ("ethdev: fix port probing notification")
Cc: stable@dpdk.org

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 drivers/net/bnxt/bnxt_ethdev.c   |  2 +-
 drivers/net/cpfl/cpfl_ethdev.h   |  2 +-
 drivers/net/ionic/ionic_ethdev.c |  2 +-
 drivers/net/mlx5/mlx5.c          |  2 +-
 drivers/net/nfp/nfp_ethdev.c     |  4 ++--
 lib/ethdev/ethdev_driver.c       | 13 ++++++++++---
 lib/ethdev/ethdev_driver.h       | 12 ++++++++++++
 lib/ethdev/ethdev_pci.h          |  2 +-
 lib/ethdev/rte_class_eth.c       |  2 +-
 lib/ethdev/rte_ethdev.c          |  4 ++--
 lib/ethdev/rte_ethdev.h          |  4 +++-
 lib/ethdev/version.map           |  1 +
 12 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index ef8a928c91..1441194b85 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -6706,7 +6706,7 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev)
 
 	PMD_DRV_LOG_LINE(DEBUG, "Calling Device uninit");
 
-	if (eth_dev->state != RTE_ETH_DEV_UNUSED)
+	if (rte_eth_dev_is_used(eth_dev->state))
 		bnxt_dev_close_op(eth_dev);
 
 	return 0;
diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/cpfl/cpfl_ethdev.h
index 9a38a69194..aad05aafd6 100644
--- a/drivers/net/cpfl/cpfl_ethdev.h
+++ b/drivers/net/cpfl/cpfl_ethdev.h
@@ -328,7 +328,7 @@ cpfl_get_itf_by_port_id(uint16_t port_id)
 	}
 
 	dev = &rte_eth_devices[port_id];
-	if (dev->state == RTE_ETH_DEV_UNUSED) {
+	if (!rte_eth_dev_is_used(dev->state)) {
 		PMD_DRV_LOG(ERR, "eth_dev[%d] is unused.", port_id);
 		return NULL;
 	}
diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index aa22b6a70d..2a4e565c4f 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -1109,7 +1109,7 @@ eth_ionic_dev_uninit(struct rte_eth_dev *eth_dev)
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return 0;
 
-	if (eth_dev->state != RTE_ETH_DEV_UNUSED)
+	if (rte_eth_dev_is_used(eth_dev->state))
 		ionic_dev_close(eth_dev);
 
 	eth_dev->dev_ops = NULL;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 6e4473e2f4..642e762868 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -3376,7 +3376,7 @@ mlx5_eth_find_next(uint16_t port_id, struct rte_device *odev)
 	while (port_id < RTE_MAX_ETHPORTS) {
 		struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 
-		if (dev->state != RTE_ETH_DEV_UNUSED &&
+		if (rte_eth_dev_is_used(dev->state) &&
 		    dev->device &&
 		    (dev->device == odev ||
 		     (dev->device->driver &&
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index df5482f74a..dae4594e56 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -754,11 +754,11 @@ nfp_net_close(struct rte_eth_dev *dev)
 	/*
 	 * In secondary process, a released eth device can be found by its name
 	 * in shared memory.
-	 * If the state of the eth device is RTE_ETH_DEV_UNUSED, it means the
+	 * If the state of the eth device isn't used, it means the
 	 * eth device has been released.
 	 */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-		if (dev->state == RTE_ETH_DEV_UNUSED)
+		if (!rte_eth_dev_is_used(dev->state))
 			return 0;
 
 		nfp_pf_secondary_uninit(hw_priv);
diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index 9afef06431..5537c2f7af 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -55,8 +55,8 @@ eth_dev_find_free_port(void)
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		/* Using shared name field to find a free port. */
 		if (eth_dev_shared_data->data[i].name[0] == '\0') {
-			RTE_ASSERT(rte_eth_devices[i].state ==
-				   RTE_ETH_DEV_UNUSED);
+			RTE_ASSERT(!rte_eth_dev_is_used(
+					rte_eth_devices[i].state));
 			return i;
 		}
 	}
@@ -221,11 +221,18 @@ rte_eth_dev_probing_finish(struct rte_eth_dev *dev)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
 
+	dev->state = RTE_ETH_DEV_ALLOCATED;
 	rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_NEW, NULL);
 
 	dev->state = RTE_ETH_DEV_ATTACHED;
 }
 
+bool rte_eth_dev_is_used(uint16_t dev_state)
+{
+	return dev_state == RTE_ETH_DEV_ALLOCATED ||
+		dev_state == RTE_ETH_DEV_ATTACHED;
+}
+
 int
 rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 {
@@ -243,7 +250,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 	if (ret != 0)
 		return ret;
 
-	if (eth_dev->state != RTE_ETH_DEV_UNUSED)
+	if (rte_eth_dev_is_used(eth_dev->state))
 		rte_eth_dev_callback_process(eth_dev,
 				RTE_ETH_EVENT_DESTROY, NULL);
 
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 1fd4562b40..dc496daf05 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1754,6 +1754,18 @@ int rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 __rte_internal
 void rte_eth_dev_probing_finish(struct rte_eth_dev *dev);
 
+/**
+ * Check if a Ethernet device state is used or not
+ *
+ * @param dev_state
+ *   The state of the Ethernet device
+ * @return
+ *   - true if the state of the Ethernet device is allocated or attached
+ *   - false if this state is neither allocated nor attached
+ */
+__rte_internal
+bool rte_eth_dev_is_used(uint16_t dev_state);
+
 /**
  * Create memzone for HW rings.
  * malloc can't be used as the physical address is needed.
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 2229ffa252..1e62f30d8d 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -179,7 +179,7 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
 	 * eth device has been released.
 	 */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
-	    eth_dev->state == RTE_ETH_DEV_UNUSED)
+	    !rte_eth_dev_is_used(eth_dev->state))
 		return 0;
 
 	if (dev_uninit) {
diff --git a/lib/ethdev/rte_class_eth.c b/lib/ethdev/rte_class_eth.c
index a8d01e2595..f343c4b6eb 100644
--- a/lib/ethdev/rte_class_eth.c
+++ b/lib/ethdev/rte_class_eth.c
@@ -120,7 +120,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
 	const struct rte_kvargs *kvlist = arg->kvlist;
 	unsigned int pair;
 
-	if (edev->state == RTE_ETH_DEV_UNUSED)
+	if (!rte_eth_dev_is_used(edev->state))
 		return -1;
 	if (arg->device != NULL && arg->device != edev->device)
 		return -1;
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 6413c54e3b..3d7a3c39d3 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -349,7 +349,7 @@ uint16_t
 rte_eth_find_next(uint16_t port_id)
 {
 	while (port_id < RTE_MAX_ETHPORTS &&
-			rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
+	       !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
 		port_id++;
 
 	if (port_id >= RTE_MAX_ETHPORTS)
@@ -408,7 +408,7 @@ rte_eth_dev_is_valid_port(uint16_t port_id)
 	int is_valid;
 
 	if (port_id >= RTE_MAX_ETHPORTS ||
-	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
+	    !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
 		is_valid = 0;
 	else
 		is_valid = 1;
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 1f71cad244..f9a72b9883 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2091,10 +2091,12 @@ typedef uint16_t (*rte_tx_callback_fn)(uint16_t port_id, uint16_t queue,
 enum rte_eth_dev_state {
 	/** Device is unused before being probed. */
 	RTE_ETH_DEV_UNUSED = 0,
-	/** Device is attached when allocated in probing. */
+	/** Device is attached when definitely probed. */
 	RTE_ETH_DEV_ATTACHED,
 	/** Device is in removed state when plug-out is detected. */
 	RTE_ETH_DEV_REMOVED,
+	/** Device is allocated and is set before reporting new event. */
+	RTE_ETH_DEV_ALLOCATED,
 };
 
 struct rte_eth_dev_sriov {
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 12f48c70a0..45b982e98d 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -351,6 +351,7 @@ INTERNAL {
 	rte_eth_dev_get_by_name;
 	rte_eth_dev_is_rx_hairpin_queue;
 	rte_eth_dev_is_tx_hairpin_queue;
+	rte_eth_dev_is_used;
 	rte_eth_dev_probing_finish;
 	rte_eth_dev_release_port;
 	rte_eth_dev_internal_reset;
-- 
2.22.0


^ permalink raw reply	[relevance 2%]

* Re: [PATCH RESEND v7 2/5] ethdev: fix skip valid port in probing callback
  2025-01-10 17:54  3%         ` Stephen Hemminger
@ 2025-01-13  2:32  0%           ` lihuisong (C)
  0 siblings, 0 replies; 200+ results
From: lihuisong (C) @ 2025-01-13  2:32 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, fengchengwen, liuyonglong, andrew.rybchenko, Somnath Kotur,
	Ajit Khaparde, Dariusz Sosnowski, Suanming Mou, Matan Azrad,
	Ori Kam, Viacheslav Ovsiienko, ferruh.yigit, thomas


On 2025/1/11 1:54, Stephen Hemminger wrote:
> On Fri, 10 Jan 2025 11:21:26 +0800
> "lihuisong (C)" <lihuisong@huawei.com> wrote:
>
>> Hi Stephen,
>>
>> Can you take a look at my reply below and reconsider this patch?
>>
>> /Huisong
>>
On 2024/12/10 9:50, lihuisong (C) wrote:
>>> Hi Ferruh, Stephen and Thomas,
>>>
>>> Can you take a look at this patch? After all, it is an issue in the
>>> ethdev layer.
>>> This is also the outcome of what we discussed with Thomas and Ferruh before.
>>> Please go back to this thread. If we don't need this patch, please let
>>> me know. I will drop it from my upstreaming list.
>>>
>>> /Huisong
>>>
>>>
>>> On 2024/9/29 13:52, Huisong Li wrote:
>>>> The event callback in application may use the macro
>>>> RTE_ETH_FOREACH_DEV to
>>>> iterate over all enabled ports to do something(like, verifying the
>>>> port id
>>>> validity) when receive a probing event. If the ethdev state of a port is
>>>> not RTE_ETH_DEV_UNUSED, this port will be considered as a valid port.
>>>>
>>>> However, this state is set to RTE_ETH_DEV_ATTACHED after pushing probing
>>>> event. It means that probing callback will skip this port. But this
>>>> assignment can not move to front of probing notification. See
>>>> commit be8cd210379a ("ethdev: fix port probing notification")
>>>>
>>>> So this patch has to add a new state, RTE_ETH_DEV_ALLOCATED. Set the
>>>> ethdev
>>>> state to RTE_ETH_DEV_ALLOCATED before pushing probing event and set
>>>> it to
>>>> RTE_ETH_DEV_ATTACHED after definitely probed. And this port is valid
>>>> if its
>>>> device state is 'ALLOCATED' or 'ATTACHED'.
>>>>
>>>> In addition, the new state has to be placed behind 'REMOVED' to avoid
>>>> ABI
>>>> break. Fortunately, this ethdev state is internal and applications
>>>> can not
>>>> access it directly. So this patch encapsulates an API,
>>>> rte_eth_dev_is_used,
>>>> for ethdev or PMD to call and eliminate concerns about using this state
>>>> enum value comparison.
>>>>
>>>> Fixes: be8cd210379a ("ethdev: fix port probing notification")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>>>> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
>>>> ---
>>>>    drivers/net/bnxt/bnxt_ethdev.c |  3 ++-
>>>>    drivers/net/mlx5/mlx5.c        |  2 +-
>>>>    lib/ethdev/ethdev_driver.c     | 13 ++++++++++---
>>>>    lib/ethdev/ethdev_driver.h     | 12 ++++++++++++
>>>>    lib/ethdev/ethdev_pci.h        |  2 +-
>>>>    lib/ethdev/rte_class_eth.c     |  2 +-
>>>>    lib/ethdev/rte_ethdev.c        |  4 ++--
>>>>    lib/ethdev/rte_ethdev.h        |  4 +++-
>>>>    lib/ethdev/version.map         |  1 +
>>>>    9 files changed, 33 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c
>>>> b/drivers/net/bnxt/bnxt_ethdev.c
>>>> index c6ad764813..7401dcd8b5 100644
>>>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>>>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>>>> @@ -6612,7 +6612,8 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev)
>>>>          PMD_DRV_LOG(DEBUG, "Calling Device uninit\n");
>>>>    -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
>>>> +
>>>> +    if (rte_eth_dev_is_used(eth_dev->state))
>>>>            bnxt_dev_close_op(eth_dev);
>>>>          return 0;
>>>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
>>>> index 8d266b0e64..0df49e1f69 100644
>>>> --- a/drivers/net/mlx5/mlx5.c
>>>> +++ b/drivers/net/mlx5/mlx5.c
>>>> @@ -3371,7 +3371,7 @@ mlx5_eth_find_next(uint16_t port_id, struct
>>>> rte_device *odev)
>>>>        while (port_id < RTE_MAX_ETHPORTS) {
>>>>            struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>>>    -        if (dev->state != RTE_ETH_DEV_UNUSED &&
>>>> +        if (rte_eth_dev_is_used(dev->state) &&
>>>>                dev->device &&
>>>>                (dev->device == odev ||
>>>>                 (dev->device->driver &&
>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>> index c335a25a82..a87dbb00ff 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -55,8 +55,8 @@ eth_dev_find_free_port(void)
>>>>        for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
>>>>            /* Using shared name field to find a free port. */
>>>>            if (eth_dev_shared_data->data[i].name[0] == '\0') {
>>>> -            RTE_ASSERT(rte_eth_devices[i].state ==
>>>> -                   RTE_ETH_DEV_UNUSED);
>>>> +            RTE_ASSERT(!rte_eth_dev_is_used(
>>>> +                    rte_eth_devices[i].state));
>>>>                return i;
>>>>            }
>>>>        }
>>>> @@ -221,11 +221,18 @@ rte_eth_dev_probing_finish(struct rte_eth_dev
>>>> *dev)
>>>>        if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>>>>            eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id,
>>>> dev);
>>>>    +    dev->state = RTE_ETH_DEV_ALLOCATED;
>>>>        rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_NEW, NULL);
>>>>          dev->state = RTE_ETH_DEV_ATTACHED;
>>>>    }
>>>>    +bool rte_eth_dev_is_used(uint16_t dev_state)
>>>> +{
>>>> +    return dev_state == RTE_ETH_DEV_ALLOCATED ||
>>>> +        dev_state == RTE_ETH_DEV_ATTACHED;
>>>> +}
>>>> +
>>>>    int
>>>>    rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>>>>    {
>>>> @@ -243,7 +250,7 @@ rte_eth_dev_release_port(struct rte_eth_dev
>>>> *eth_dev)
>>>>        if (ret != 0)
>>>>            return ret;
>>>>    -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
>>>> +    if (rte_eth_dev_is_used(eth_dev->state))
>>>>            rte_eth_dev_callback_process(eth_dev,
>>>>                    RTE_ETH_EVENT_DESTROY, NULL);
>>>>    diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>> index abed4784aa..aa35b65848 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1704,6 +1704,18 @@ int rte_eth_dev_callback_process(struct
>>>> rte_eth_dev *dev,
>>>>    __rte_internal
>>>>    void rte_eth_dev_probing_finish(struct rte_eth_dev *dev);
>>>>    +/**
>>>> + * Check if a Ethernet device state is used or not
>>>> + *
>>>> + * @param dev_state
>>>> + *   The state of the Ethernet device
>>>> + * @return
>>>> + *   - true if the state of the Ethernet device is allocated or
>>>> attached
>>>> + *   - false if this state is neither allocated nor attached
>>>> + */
>>>> +__rte_internal
>>>> +bool rte_eth_dev_is_used(uint16_t dev_state);
>>>> +
>>>>    /**
>>>>     * Create memzone for HW rings.
>>>>     * malloc can't be used as the physical address is needed.
>>>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>>>> index ec4f731270..05dec6716b 100644
>>>> --- a/lib/ethdev/ethdev_pci.h
>>>> +++ b/lib/ethdev/ethdev_pci.h
>>>> @@ -179,7 +179,7 @@ rte_eth_dev_pci_generic_remove(struct
>>>> rte_pci_device *pci_dev,
>>>>         * eth device has been released.
>>>>         */
>>>>        if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
>>>> -        eth_dev->state == RTE_ETH_DEV_UNUSED)
>>>> +        !rte_eth_dev_is_used(eth_dev->state))
>>>>            return 0;
>>>>          if (dev_uninit) {
>>>> diff --git a/lib/ethdev/rte_class_eth.c b/lib/ethdev/rte_class_eth.c
>>>> index b52f1dd9f2..81e70670d9 100644
>>>> --- a/lib/ethdev/rte_class_eth.c
>>>> +++ b/lib/ethdev/rte_class_eth.c
>>>> @@ -118,7 +118,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
>>>>        const struct rte_kvargs *kvlist = arg->kvlist;
>>>>        unsigned int pair;
>>>>    -    if (edev->state == RTE_ETH_DEV_UNUSED)
>>>> +    if (!rte_eth_dev_is_used(edev->state))
>>>>            return -1;
>>>>        if (arg->device != NULL && arg->device != edev->device)
>>>>            return -1;
>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>>> index a1f7efa913..4dc66abb7b 100644
>>>> --- a/lib/ethdev/rte_ethdev.c
>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>> @@ -349,7 +349,7 @@ uint16_t
>>>>    rte_eth_find_next(uint16_t port_id)
>>>>    {
>>>>        while (port_id < RTE_MAX_ETHPORTS &&
>>>> -            rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
>>>> + !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>>>>            port_id++;
>>>>          if (port_id >= RTE_MAX_ETHPORTS)
>>>> @@ -408,7 +408,7 @@ rte_eth_dev_is_valid_port(uint16_t port_id)
>>>>        int is_valid;
>>>>          if (port_id >= RTE_MAX_ETHPORTS ||
>>>> -        (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
>>>> +        !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>>>>            is_valid = 0;
>>>>        else
>>>>            is_valid = 1;
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>> index a9f92006da..9cc37e8cde 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -2083,10 +2083,12 @@ typedef uint16_t
>>>> (*rte_tx_callback_fn)(uint16_t port_id, uint16_t queue,
>>>>    enum rte_eth_dev_state {
>>>>        /** Device is unused before being probed. */
>>>>        RTE_ETH_DEV_UNUSED = 0,
>>>> -    /** Device is attached when allocated in probing. */
>>>> +    /** Device is attached when definitely probed. */
>>>>        RTE_ETH_DEV_ATTACHED,
>>>>        /** Device is in removed state when plug-out is detected. */
>>>>        RTE_ETH_DEV_REMOVED,
>>>> +    /** Device is allocated and is set before reporting new event. */
>>>> +    RTE_ETH_DEV_ALLOCATED,
>>>>    };
>>>>      struct rte_eth_dev_sriov {
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>> index f63dc32aa2..6ecf1ab89d 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -349,6 +349,7 @@ INTERNAL {
>>>>        rte_eth_dev_get_by_name;
>>>>        rte_eth_dev_is_rx_hairpin_queue;
>>>>        rte_eth_dev_is_tx_hairpin_queue;
>>>> +    rte_eth_dev_is_used;
>>>>        rte_eth_dev_probing_finish;
>>>>        rte_eth_dev_release_port;
>>>>        rte_eth_dev_internal_reset;
>>> .
> Please resubmit for 25.03 release.
> But it looks like an API/ABI change since rte_eth_dev_state is visible
> to applications.
>
> A more detailed bug report would also help
> .

OK, many thanks for your reply.

I will resubmit this patch and send it out separately.

The series that adds attach and detach port support in testpmd for
multiple processes will be updated later.


^ permalink raw reply	[relevance 0%]

* [PATCH v11 27/30] net: replace packed attributes
  @ 2025-01-10 22:16  1%   ` Andre Muezerie
  0 siblings, 0 replies; 200+ results
From: Andre Muezerie @ 2025-01-10 22:16 UTC (permalink / raw)
  To: roretzla
  Cc: aman.deep.singh, anatoly.burakov, bruce.richardson, byron.marohn,
	conor.walsh, cristian.dumitrescu, david.hunt, dev, dsosnowski,
	gakhil, jerinj, jingjing.wu, kirill.rybalchenko,
	konstantin.v.ananyev, matan, mb, orika, radu.nicolau,
	ruifeng.wang, sameh.gobriel, sivaprasad.tummala, skori, stephen,
	suanmingm, vattunuru, viacheslavo, vladimir.medvedkin,
	yipeng1.wang, Andre Muezerie

MSVC struct packing is not compatible with GCC. Replace the macro
__rte_packed with __rte_packed_begin, which pushes the existing pack
value and sets packing to 1 byte, and __rte_packed_end, which restores
the pack value in effect prior to the push.

The macro __rte_packed_end deliberately triggers an MSVC compiler
warning if no packing has been pushed, allowing easy identification of
locations where __rte_packed_begin is missing.

This change affects the storage size of a variable of enum
rte_ipv6_mc_scope (at least with gcc). It should be OK from an ABI
point of view though: there is one (inline) helper using this type, and
nothing else in DPDK takes an IPv6 multicast scope as input.

Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
---
 lib/net/rte_arp.h      |  8 ++++----
 lib/net/rte_dtls.h     |  4 ++--
 lib/net/rte_esp.h      |  8 ++++----
 lib/net/rte_geneve.h   |  4 ++--
 lib/net/rte_gre.h      | 16 ++++++++--------
 lib/net/rte_gtp.h      | 20 ++++++++++----------
 lib/net/rte_ib.h       |  4 ++--
 lib/net/rte_icmp.h     | 12 ++++++------
 lib/net/rte_ip4.h      |  4 ++--
 lib/net/rte_ip6.h      | 14 +++++++-------
 lib/net/rte_l2tpv2.h   | 16 ++++++++--------
 lib/net/rte_macsec.h   |  8 ++++----
 lib/net/rte_mpls.h     |  4 ++--
 lib/net/rte_pdcp_hdr.h | 16 ++++++++--------
 lib/net/rte_ppp.h      |  4 ++--
 lib/net/rte_sctp.h     |  4 ++--
 lib/net/rte_tcp.h      |  4 ++--
 lib/net/rte_tls.h      |  4 ++--
 lib/net/rte_udp.h      |  4 ++--
 lib/net/rte_vxlan.h    | 28 ++++++++++++++--------------
 20 files changed, 93 insertions(+), 93 deletions(-)

diff --git a/lib/net/rte_arp.h b/lib/net/rte_arp.h
index 668cea1704..e885a71292 100644
--- a/lib/net/rte_arp.h
+++ b/lib/net/rte_arp.h
@@ -21,17 +21,17 @@ extern "C" {
 /**
  * ARP header IPv4 payload.
  */
-struct __rte_aligned(2) rte_arp_ipv4 {
+struct __rte_aligned(2) __rte_packed_begin rte_arp_ipv4 {
 	struct rte_ether_addr arp_sha;  /**< sender hardware address */
 	rte_be32_t            arp_sip;  /**< sender IP address */
 	struct rte_ether_addr arp_tha;  /**< target hardware address */
 	rte_be32_t            arp_tip;  /**< target IP address */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ARP header.
  */
-struct __rte_aligned(2) rte_arp_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_arp_hdr {
 	rte_be16_t arp_hardware; /**< format of hardware address */
 #define RTE_ARP_HRD_ETHER     1  /**< ARP Ethernet address format */
 
@@ -47,7 +47,7 @@ struct __rte_aligned(2) rte_arp_hdr {
 #define	RTE_ARP_OP_INVREPLY   9  /**< response identifying peer */
 
 	struct rte_arp_ipv4 arp_data;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Make a RARP packet based on MAC addr.
diff --git a/lib/net/rte_dtls.h b/lib/net/rte_dtls.h
index 246cd8a72d..1dd95ce899 100644
--- a/lib/net/rte_dtls.h
+++ b/lib/net/rte_dtls.h
@@ -30,7 +30,7 @@
  * DTLS Header
  */
 __extension__
-struct rte_dtls_hdr {
+struct __rte_packed_begin rte_dtls_hdr {
 	/** Content type of DTLS packet. Defined as RTE_DTLS_TYPE_*. */
 	uint8_t type;
 	/** DTLS Version defined as RTE_DTLS_VERSION*. */
@@ -48,6 +48,6 @@ struct rte_dtls_hdr {
 #endif
 	/** The length (in bytes) of the following DTLS packet. */
 	rte_be16_t length;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_DTLS_H */
diff --git a/lib/net/rte_esp.h b/lib/net/rte_esp.h
index 745a9847fe..2a0002f4d9 100644
--- a/lib/net/rte_esp.h
+++ b/lib/net/rte_esp.h
@@ -16,17 +16,17 @@
 /**
  * ESP Header
  */
-struct rte_esp_hdr {
+struct __rte_packed_begin rte_esp_hdr {
 	rte_be32_t spi;  /**< Security Parameters Index */
 	rte_be32_t seq;  /**< packet sequence number */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ESP Trailer
  */
-struct rte_esp_tail {
+struct __rte_packed_begin rte_esp_tail {
 	uint8_t pad_len;     /**< number of pad bytes (0-255) */
 	uint8_t next_proto;  /**< IPv4 or IPv6 or next layer header */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_ESP_H_ */
diff --git a/lib/net/rte_geneve.h b/lib/net/rte_geneve.h
index eb2c85f1e9..f962c587ee 100644
--- a/lib/net/rte_geneve.h
+++ b/lib/net/rte_geneve.h
@@ -34,7 +34,7 @@
  * More-bits (optional) variable length options.
  */
 __extension__
-struct rte_geneve_hdr {
+struct __rte_packed_begin rte_geneve_hdr {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t ver:2;		/**< Version. */
 	uint8_t opt_len:6;	/**< Options length. */
@@ -52,7 +52,7 @@ struct rte_geneve_hdr {
 	uint8_t vni[3];		/**< Virtual network identifier. */
 	uint8_t reserved2;	/**< Reserved. */
 	uint32_t opts[];	/**< Variable length options. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* GENEVE ETH next protocol types */
 #define RTE_GENEVE_TYPE_ETH	0x6558 /**< Ethernet Protocol. */
diff --git a/lib/net/rte_gre.h b/lib/net/rte_gre.h
index 1483e1b42d..768c4ce7b5 100644
--- a/lib/net/rte_gre.h
+++ b/lib/net/rte_gre.h
@@ -23,7 +23,7 @@
  * GRE Header
  */
 __extension__
-struct rte_gre_hdr {
+struct __rte_packed_begin rte_gre_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint16_t res2:4; /**< Reserved */
 	uint16_t s:1;    /**< Sequence Number Present bit */
@@ -42,28 +42,28 @@ struct rte_gre_hdr {
 	uint16_t ver:3;  /**< Version Number */
 #endif
 	rte_be16_t proto;  /**< Protocol Type */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field checksum in GRE header
  */
-struct rte_gre_hdr_opt_checksum_rsvd {
+struct __rte_packed_begin rte_gre_hdr_opt_checksum_rsvd {
 	rte_be16_t checksum;
 	rte_be16_t reserved1;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field key in GRE header
  */
-struct rte_gre_hdr_opt_key {
+struct __rte_packed_begin rte_gre_hdr_opt_key {
 	rte_be32_t key;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field sequence in GRE header
  */
-struct rte_gre_hdr_opt_sequence {
+struct __rte_packed_begin rte_gre_hdr_opt_sequence {
 	rte_be32_t sequence;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_GRE_H_ */
diff --git a/lib/net/rte_gtp.h b/lib/net/rte_gtp.h
index ab06e23a6e..0332d35c16 100644
--- a/lib/net/rte_gtp.h
+++ b/lib/net/rte_gtp.h
@@ -24,7 +24,7 @@
  * No optional fields and next extension header.
  */
 __extension__
-struct rte_gtp_hdr {
+struct __rte_packed_begin rte_gtp_hdr {
 	union {
 		uint8_t gtp_hdr_info; /**< GTP header info */
 		struct {
@@ -48,21 +48,21 @@ struct rte_gtp_hdr {
 	uint8_t msg_type;     /**< GTP message type */
 	rte_be16_t plen;      /**< Total payload length */
 	rte_be32_t teid;      /**< Tunnel endpoint ID */
-} __rte_packed;
+} __rte_packed_end;
 
 /* Optional word of GTP header, present if any of E, S, PN is set. */
-struct rte_gtp_hdr_ext_word {
+struct __rte_packed_begin rte_gtp_hdr_ext_word {
 	rte_be16_t sqn;	      /**< Sequence Number. */
 	uint8_t npdu;	      /**< N-PDU number. */
 	uint8_t next_ext;     /**< Next Extension Header Type. */
-}  __rte_packed;
+}  __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * defined based on RFC 38415-g30.
  */
 __extension__
-struct rte_gtp_psc_generic_hdr {
+struct __rte_packed_begin rte_gtp_psc_generic_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -78,14 +78,14 @@ struct rte_gtp_psc_generic_hdr {
 	uint8_t spare:2;	/**< type specific spare bits */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * type0 defined based on RFC 38415-g30
  */
 __extension__
-struct rte_gtp_psc_type0_hdr {
+struct __rte_packed_begin rte_gtp_psc_type0_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -105,14 +105,14 @@ struct rte_gtp_psc_type0_hdr {
 	uint8_t ppp:1;		/**< Paging policy presence */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * type1 defined based on RFC 38415-g30
  */
 __extension__
-struct rte_gtp_psc_type1_hdr {
+struct __rte_packed_begin rte_gtp_psc_type1_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -134,7 +134,7 @@ struct rte_gtp_psc_type1_hdr {
 	uint8_t n_delay_ind:1;	/**< N3/N9 delay result presence */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /** GTP header length */
 #define RTE_ETHER_GTP_HLEN \
diff --git a/lib/net/rte_ib.h b/lib/net/rte_ib.h
index a551f3753f..f1b455cea0 100644
--- a/lib/net/rte_ib.h
+++ b/lib/net/rte_ib.h
@@ -22,7 +22,7 @@
  * IB Specification Vol 1-Release-1.4.
  */
 __extension__
-struct rte_ib_bth {
+struct __rte_packed_begin rte_ib_bth {
 	uint8_t	opcode;		/**< Opcode. */
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t	tver:4;		/**< Transport Header Version. */
@@ -54,7 +54,7 @@ struct rte_ib_bth {
 	uint8_t	rsvd1:7;	/**< Reserved. */
 #endif
 	uint8_t	psn[3];		/**< Packet Sequence Number */
-} __rte_packed;
+} __rte_packed_end;
 
 /** RoCEv2 default port. */
 #define RTE_ROCEV2_DEFAULT_PORT 4791
diff --git a/lib/net/rte_icmp.h b/lib/net/rte_icmp.h
index e69d68ab6e..cca73b3733 100644
--- a/lib/net/rte_icmp.h
+++ b/lib/net/rte_icmp.h
@@ -21,33 +21,33 @@
 /**
  * ICMP base header
  */
-struct rte_icmp_base_hdr {
+struct __rte_packed_begin rte_icmp_base_hdr {
 	uint8_t type;
 	uint8_t code;
 	rte_be16_t checksum;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ICMP echo header
  */
-struct rte_icmp_echo_hdr {
+struct __rte_packed_begin rte_icmp_echo_hdr {
 	struct rte_icmp_base_hdr base;
 	rte_be16_t identifier;
 	rte_be16_t sequence;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ICMP Header
  *
  * @see rte_icmp_echo_hdr which is similar.
  */
-struct rte_icmp_hdr {
+struct __rte_packed_begin rte_icmp_hdr {
 	uint8_t  icmp_type;     /* ICMP packet type. */
 	uint8_t  icmp_code;     /* ICMP packet code. */
 	rte_be16_t icmp_cksum;  /* ICMP packet checksum. */
 	rte_be16_t icmp_ident;  /* ICMP packet identifier. */
 	rte_be16_t icmp_seq_nb; /* ICMP packet sequence number. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* ICMP packet types */
 #define RTE_ICMP_TYPE_ECHO_REPLY 0
diff --git a/lib/net/rte_ip4.h b/lib/net/rte_ip4.h
index f9b8333332..d4b38c513c 100644
--- a/lib/net/rte_ip4.h
+++ b/lib/net/rte_ip4.h
@@ -39,7 +39,7 @@ extern "C" {
 /**
  * IPv4 Header
  */
-struct __rte_aligned(2) rte_ipv4_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv4_hdr {
 	__extension__
 	union {
 		uint8_t version_ihl;    /**< version and header length */
@@ -62,7 +62,7 @@ struct __rte_aligned(2) rte_ipv4_hdr {
 	rte_be16_t hdr_checksum;	/**< header checksum */
 	rte_be32_t src_addr;		/**< source address */
 	rte_be32_t dst_addr;		/**< destination address */
-} __rte_packed;
+} __rte_packed_end;
 
 /** Create IPv4 address */
 #define RTE_IPV4(a, b, c, d) ((uint32_t)(((a) & 0xff) << 24) | \
diff --git a/lib/net/rte_ip6.h b/lib/net/rte_ip6.h
index 992ab5ee1f..92558a124a 100644
--- a/lib/net/rte_ip6.h
+++ b/lib/net/rte_ip6.h
@@ -358,7 +358,7 @@ enum rte_ipv6_mc_scope {
 	RTE_IPV6_MC_SCOPE_ORGLOCAL = 0x08,
 	/** Global multicast scope. */
 	RTE_IPV6_MC_SCOPE_GLOBAL = 0x0e,
-} __rte_packed;
+};
 
 /**
  * Extract the IPv6 multicast scope value as defined in RFC 4291, section 2.7.
@@ -461,7 +461,7 @@ rte_ether_mcast_from_ipv6(struct rte_ether_addr *mac, const struct rte_ipv6_addr
 /**
  * IPv6 Header
  */
-struct __rte_aligned(2) rte_ipv6_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_hdr {
 	union {
 		rte_be32_t vtc_flow;        /**< IP version, traffic class & flow label. */
 		__extension__
@@ -484,7 +484,7 @@ struct __rte_aligned(2) rte_ipv6_hdr {
 	uint8_t  hop_limits;	/**< Hop limits. */
 	struct rte_ipv6_addr src_addr;	/**< IP address of source host. */
 	struct rte_ipv6_addr dst_addr;	/**< IP address of destination host(s). */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Check that the IPv6 header version field is valid according to RFC 8200 section 3.
@@ -508,7 +508,7 @@ static inline int rte_ipv6_check_version(const struct rte_ipv6_hdr *ip)
 /**
  * IPv6 Routing Extension Header
  */
-struct __rte_aligned(2) rte_ipv6_routing_ext {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_routing_ext {
 	uint8_t next_hdr;			/**< Protocol, next header. */
 	uint8_t hdr_len;			/**< Header length. */
 	uint8_t type;				/**< Extension header type. */
@@ -523,7 +523,7 @@ struct __rte_aligned(2) rte_ipv6_routing_ext {
 		};
 	};
 	/* Next are 128-bit IPv6 address fields to describe segments. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* IPv6 vtc_flow: IPv / TC / flow_label */
 #define RTE_IPV6_HDR_FL_SHIFT 0
@@ -752,12 +752,12 @@ rte_ipv6_udptcp_cksum_mbuf_verify(const struct rte_mbuf *m,
 #define RTE_IPV6_SET_FRAG_DATA(fo, mf)	\
 	(((fo) & RTE_IPV6_EHDR_FO_MASK) | ((mf) & RTE_IPV6_EHDR_MF_MASK))
 
-struct __rte_aligned(2) rte_ipv6_fragment_ext {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_fragment_ext {
 	uint8_t next_header;	/**< Next header type */
 	uint8_t reserved;	/**< Reserved */
 	rte_be16_t frag_data;	/**< All fragmentation data */
 	rte_be32_t id;		/**< Packet ID */
-} __rte_packed;
+} __rte_packed_end;
 
 /* IPv6 fragment extension header size */
 #define RTE_IPV6_FRAG_HDR_SIZE	sizeof(struct rte_ipv6_fragment_ext)
diff --git a/lib/net/rte_l2tpv2.h b/lib/net/rte_l2tpv2.h
index ac16657856..728dc01506 100644
--- a/lib/net/rte_l2tpv2.h
+++ b/lib/net/rte_l2tpv2.h
@@ -125,7 +125,7 @@ struct rte_l2tpv2_common_hdr {
  * L2TPv2 message Header contains all options(length, ns, nr,
  * offset size, offset padding).
  */
-struct rte_l2tpv2_msg_with_all_options {
+struct __rte_packed_begin rte_l2tpv2_msg_with_all_options {
 	rte_be16_t length;		/**< length(16) */
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
@@ -133,20 +133,20 @@ struct rte_l2tpv2_msg_with_all_options {
 	rte_be16_t nr;			/**< Nr(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains all options except length(ns, nr,
  * offset size, offset padding).
  */
-struct rte_l2tpv2_msg_without_length {
+struct __rte_packed_begin rte_l2tpv2_msg_without_length {
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
 	rte_be16_t ns;			/**< Ns(16) */
 	rte_be16_t nr;			/**< Nr(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains all options except ns_nr(length,
@@ -176,12 +176,12 @@ struct rte_l2tpv2_msg_without_offset {
 /**
  * L2TPv2 message Header contains options offset size and offset padding.
  */
-struct rte_l2tpv2_msg_with_offset {
+struct __rte_packed_begin rte_l2tpv2_msg_with_offset {
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains options ns and nr.
@@ -213,7 +213,7 @@ struct rte_l2tpv2_msg_without_all_options {
 /**
  * L2TPv2 Combined Message Header Format: Common Header + Options
  */
-struct rte_l2tpv2_combined_msg_hdr {
+struct __rte_packed_begin rte_l2tpv2_combined_msg_hdr {
 	struct rte_l2tpv2_common_hdr common; /**< common header */
 	union {
 		/** header with all options */
@@ -233,6 +233,6 @@ struct rte_l2tpv2_combined_msg_hdr {
 		/** header without all options */
 		struct rte_l2tpv2_msg_without_all_options type7;
 	};
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* _RTE_L2TPV2_H_ */
diff --git a/lib/net/rte_macsec.h b/lib/net/rte_macsec.h
index beeeb8effe..c694c37b4b 100644
--- a/lib/net/rte_macsec.h
+++ b/lib/net/rte_macsec.h
@@ -25,7 +25,7 @@
  * MACsec Header (SecTAG)
  */
 __extension__
-struct rte_macsec_hdr {
+struct __rte_packed_begin rte_macsec_hdr {
 	/**
 	 * Tag control information and Association number of secure channel.
 	 * Various bits of TCI and AN are masked using RTE_MACSEC_TCI_* and RTE_MACSEC_AN_MASK.
@@ -39,7 +39,7 @@ struct rte_macsec_hdr {
 	uint8_t short_length:6; /**< Short Length. */
 #endif
 	rte_be32_t packet_number; /**< Packet number to support replay protection. */
-} __rte_packed;
+} __rte_packed_end;
 
 /** SCI length in MACsec header if present. */
 #define RTE_MACSEC_SCI_LEN 8
@@ -48,8 +48,8 @@ struct rte_macsec_hdr {
  * MACsec SCI header (8 bytes) after the MACsec header
  * which is present if SC bit is set in tci_an.
  */
-struct rte_macsec_sci_hdr {
+struct __rte_packed_begin rte_macsec_sci_hdr {
 	uint8_t sci[RTE_MACSEC_SCI_LEN]; /**< Optional secure channel ID. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_MACSEC_H */
diff --git a/lib/net/rte_mpls.h b/lib/net/rte_mpls.h
index 35a356efd3..53614a0b88 100644
--- a/lib/net/rte_mpls.h
+++ b/lib/net/rte_mpls.h
@@ -18,7 +18,7 @@
  * MPLS header.
  */
 __extension__
-struct rte_mpls_hdr {
+struct __rte_packed_begin rte_mpls_hdr {
 	rte_be16_t tag_msb; /**< Label(msb). */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t tag_lsb:4;  /**< Label(lsb). */
@@ -30,6 +30,6 @@ struct rte_mpls_hdr {
 	uint8_t tag_lsb:4;  /**< label(lsb) */
 #endif
 	uint8_t  ttl;       /**< Time to live. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_MPLS_H_ */
diff --git a/lib/net/rte_pdcp_hdr.h b/lib/net/rte_pdcp_hdr.h
index c22b66bf93..2e8da1e1d3 100644
--- a/lib/net/rte_pdcp_hdr.h
+++ b/lib/net/rte_pdcp_hdr.h
@@ -56,7 +56,7 @@ enum rte_pdcp_pdu_type {
  * 6.2.2.1 Data PDU for SRBs
  */
 __extension__
-struct rte_pdcp_cp_data_pdu_sn_12_hdr {
+struct __rte_packed_begin rte_pdcp_cp_data_pdu_sn_12_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 	uint8_t r : 4;		/**< Reserved */
@@ -65,13 +65,13 @@ struct rte_pdcp_cp_data_pdu_sn_12_hdr {
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 #endif
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.2.2 Data PDU for DRBs and MRBs with 12 bits PDCP SN
  */
 __extension__
-struct rte_pdcp_up_data_pdu_sn_12_hdr {
+struct __rte_packed_begin rte_pdcp_up_data_pdu_sn_12_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 	uint8_t r : 3;		/**< Reserved */
@@ -82,13 +82,13 @@ struct rte_pdcp_up_data_pdu_sn_12_hdr {
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 #endif
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.2.3 Data PDU for DRBs and MRBs with 18 bits PDCP SN
  */
 __extension__
-struct rte_pdcp_up_data_pdu_sn_18_hdr {
+struct __rte_packed_begin rte_pdcp_up_data_pdu_sn_18_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_17_16 : 2;	/**< Sequence number bits 16-17 */
 	uint8_t r : 5;		/**< Reserved */
@@ -100,13 +100,13 @@ struct rte_pdcp_up_data_pdu_sn_18_hdr {
 #endif
 	uint8_t sn_15_8;	/**< Sequence number bits 8-15 */
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.3.1 Control PDU for PDCP status report
  */
 __extension__
-struct rte_pdcp_up_ctrl_pdu_hdr {
+struct __rte_packed_begin rte_pdcp_up_ctrl_pdu_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t r : 4;		/**< Reserved */
 	uint8_t pdu_type : 3;	/**< Control PDU type */
@@ -134,6 +134,6 @@ struct rte_pdcp_up_ctrl_pdu_hdr {
 	 * in the Bitmap is 1.
 	 */
 	uint8_t bitmap[];
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_PDCP_HDR_H */
diff --git a/lib/net/rte_ppp.h b/lib/net/rte_ppp.h
index 63c72a9392..02bfb03c03 100644
--- a/lib/net/rte_ppp.h
+++ b/lib/net/rte_ppp.h
@@ -17,10 +17,10 @@
 /**
  * PPP Header
  */
-struct rte_ppp_hdr {
+struct __rte_packed_begin rte_ppp_hdr {
 	uint8_t addr; /**< PPP address(8) */
 	uint8_t ctrl; /**< PPP control(8) */
 	rte_be16_t proto_id; /**< PPP protocol identifier(16) */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* _RTE_PPP_H_ */
diff --git a/lib/net/rte_sctp.h b/lib/net/rte_sctp.h
index e757c57db3..73051b94fd 100644
--- a/lib/net/rte_sctp.h
+++ b/lib/net/rte_sctp.h
@@ -21,11 +21,11 @@
 /**
  * SCTP Header
  */
-struct rte_sctp_hdr {
+struct __rte_packed_begin rte_sctp_hdr {
 	rte_be16_t src_port; /**< Source port. */
 	rte_be16_t dst_port; /**< Destin port. */
 	rte_be32_t tag;      /**< Validation tag. */
 	rte_be32_t cksum;    /**< Checksum. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_SCTP_H_ */
diff --git a/lib/net/rte_tcp.h b/lib/net/rte_tcp.h
index 1bcacbf038..fb0eb308f5 100644
--- a/lib/net/rte_tcp.h
+++ b/lib/net/rte_tcp.h
@@ -21,7 +21,7 @@
 /**
  * TCP Header
  */
-struct rte_tcp_hdr {
+struct __rte_packed_begin rte_tcp_hdr {
 	rte_be16_t src_port; /**< TCP source port. */
 	rte_be16_t dst_port; /**< TCP destination port. */
 	rte_be32_t sent_seq; /**< TX data sequence number. */
@@ -31,7 +31,7 @@ struct rte_tcp_hdr {
 	rte_be16_t rx_win;   /**< RX flow control window. */
 	rte_be16_t cksum;    /**< TCP checksum. */
 	rte_be16_t tcp_urp;  /**< TCP urgent pointer, if any. */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * TCP Flags
diff --git a/lib/net/rte_tls.h b/lib/net/rte_tls.h
index 595567e3e9..f27db3acb1 100644
--- a/lib/net/rte_tls.h
+++ b/lib/net/rte_tls.h
@@ -28,13 +28,13 @@
  * TLS Header
  */
 __extension__
-struct rte_tls_hdr {
+struct __rte_packed_begin rte_tls_hdr {
 	/** Content type of TLS packet. Defined as RTE_TLS_TYPE_*. */
 	uint8_t type;
 	/** TLS Version defined as RTE_TLS_VERSION*. */
 	rte_be16_t version;
 	/** The length (in bytes) of the following TLS packet. */
 	rte_be16_t length;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_TLS_H */
diff --git a/lib/net/rte_udp.h b/lib/net/rte_udp.h
index c01dad9c9b..94f5304e6d 100644
--- a/lib/net/rte_udp.h
+++ b/lib/net/rte_udp.h
@@ -21,11 +21,11 @@
 /**
  * UDP Header
  */
-struct rte_udp_hdr {
+struct __rte_packed_begin rte_udp_hdr {
 	rte_be16_t src_port;    /**< UDP source port. */
 	rte_be16_t dst_port;    /**< UDP destination port. */
 	rte_be16_t dgram_len;   /**< UDP datagram length */
 	rte_be16_t dgram_cksum; /**< UDP datagram checksum */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_UDP_H_ */
diff --git a/lib/net/rte_vxlan.h b/lib/net/rte_vxlan.h
index bd1c89835e..f59829b182 100644
--- a/lib/net/rte_vxlan.h
+++ b/lib/net/rte_vxlan.h
@@ -27,13 +27,13 @@
  * Reserved fields (24 bits and 8 bits)
  */
 __extension__ /* no named member in struct */
-struct rte_vxlan_hdr {
+struct __rte_packed_begin rte_vxlan_hdr {
 	union {
 		rte_be32_t vx_flags; /**< flags (8 bits) + extensions (24 bits). */
-		struct {
+		struct __rte_packed_begin {
 			union {
 				uint8_t flags; /**< Default is I bit, others are extensions. */
-				struct {
+				struct __rte_packed_begin {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 					uint8_t flag_g:1,     /**< GBP bit. */
 						flag_rsvd:1,  /*   Reserved. */
@@ -51,11 +51,11 @@ struct rte_vxlan_hdr {
 						flag_rsvd:1,
 						flag_g:1;
 #endif
-				} __rte_packed;
+				} __rte_packed_end;
 			}; /* end of 1st byte */
 			union {
 				uint8_t rsvd0[3]; /* Reserved for extensions. */
-				struct {
+				struct __rte_packed_begin {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 					uint8_t rsvd0_gbp1:1, /*   Reserved. */
 						flag_d:1,     /**< GBP Don't Learn bit. */
@@ -71,7 +71,7 @@ struct rte_vxlan_hdr {
 #endif
 					union {
 						uint16_t policy_id; /**< GBP Identifier. */
-						struct {
+						struct __rte_packed_begin {
 							uint8_t rsvd0_gpe; /* Reserved. */
 							uint8_t proto; /**< GPE Next protocol. */
 								/* 0x01 : IPv4
@@ -79,23 +79,23 @@ struct rte_vxlan_hdr {
 								 * 0x03 : Ethernet
 								 * 0x04 : Network Service Header
 								 */
-						} __rte_packed;
+						} __rte_packed_end;
 					};
-				} __rte_packed;
+				} __rte_packed_end;
 			};
-		} __rte_packed;
+		} __rte_packed_end;
 	}; /* end of 1st 32-bit word */
 	union {
 		rte_be32_t vx_vni; /**< VNI (24 bits) + reserved (8 bits). */
-		struct {
+		struct __rte_packed_begin {
 			uint8_t    vni[3];   /**< VXLAN Identifier. */
 			union {
 				uint8_t    rsvd1;        /**< Reserved. */
 				uint8_t    last_rsvd;    /**< Reserved. */
 			};
-		} __rte_packed;
+		} __rte_packed_end;
 	}; /* end of 2nd 32-bit word */
-} __rte_packed;
+} __rte_packed_end;
 
 /** VXLAN tunnel header length. */
 #define RTE_ETHER_VXLAN_HLEN \
@@ -111,7 +111,7 @@ struct rte_vxlan_hdr {
  * Identifier and Reserved fields (16 bits and 8 bits).
  */
 __extension__ /* no named member in struct */
-struct rte_vxlan_gpe_hdr {
+struct __rte_packed_begin rte_vxlan_gpe_hdr {
 	union {
 		struct {
 			uint8_t vx_flags;    /**< flag (8). */
@@ -127,7 +127,7 @@ struct rte_vxlan_gpe_hdr {
 			uint8_t rsvd1;    /**< Reserved. */
 		};
 	};
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * @deprecated
-- 
2.47.0.vfs.0.3


^ permalink raw reply	[relevance 1%]

* Re: [PATCH RESEND v7 2/5] ethdev: fix skip valid port in probing callback
  2025-01-10  3:21  0%       ` lihuisong (C)
@ 2025-01-10 17:54  3%         ` Stephen Hemminger
  2025-01-13  2:32  0%           ` lihuisong (C)
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2025-01-10 17:54 UTC (permalink / raw)
  To: lihuisong (C)
  Cc: dev, fengchengwen, liuyonglong, andrew.rybchenko, Somnath Kotur,
	Ajit Khaparde, Dariusz Sosnowski, Suanming Mou, Matan Azrad,
	Ori Kam, Viacheslav Ovsiienko, ferruh.yigit, thomas

On Fri, 10 Jan 2025 11:21:26 +0800
"lihuisong (C)" <lihuisong@huawei.com> wrote:

> Hi Stephen,
> 
> Can you take a look at my reply below and reconsider this patch?
> 
> /Huisong
> 
> On 2024/12/10 9:50, lihuisong (C) wrote:
> > Hi Ferruh, Stephen and Thomas,
> >
> > Can you take a look at this patch? After all, it is an issue in the
> > ethdev layer. This is also the result of what we discussed with
> > Thomas and Ferruh before. Please go back to this thread. If we don't
> > need this patch, please let me know and I will drop it from my
> > upstreaming list.
> >
> > /Huisong
> >
> >
> > On 2024/9/29 13:52, Huisong Li wrote:
> >> The event callback in an application may use the macro
> >> RTE_ETH_FOREACH_DEV to iterate over all enabled ports and do
> >> something (like verifying port id validity) when receiving a
> >> probing event. If the ethdev state of a port is not
> >> RTE_ETH_DEV_UNUSED, this port will be considered a valid port.
> >>
> >> However, this state is set to RTE_ETH_DEV_ATTACHED only after the
> >> probing event has been pushed, which means the probing callback will
> >> skip this port. But this assignment cannot be moved before the
> >> probing notification. See
> >> commit be8cd210379a ("ethdev: fix port probing notification")
> >>
> >> So this patch has to add a new state, RTE_ETH_DEV_ALLOCATED. The
> >> ethdev state is set to RTE_ETH_DEV_ALLOCATED before pushing the
> >> probing event and to RTE_ETH_DEV_ATTACHED once the device is
> >> definitely probed. A port is valid if its device state is
> >> 'ALLOCATED' or 'ATTACHED'.
> >>
> >> In addition, the new state has to be placed after 'REMOVED' to
> >> avoid an ABI break. Fortunately, this ethdev state is internal and
> >> applications cannot access it directly. So this patch encapsulates
> >> an API, rte_eth_dev_is_used, for the ethdev layer or PMDs to call,
> >> eliminating concerns about comparing this state enum value directly.
> >>
> >> Fixes: be8cd210379a ("ethdev: fix port probing notification")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> >> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
> >> ---
> >>   drivers/net/bnxt/bnxt_ethdev.c |  3 ++-
> >>   drivers/net/mlx5/mlx5.c        |  2 +-
> >>   lib/ethdev/ethdev_driver.c     | 13 ++++++++++---
> >>   lib/ethdev/ethdev_driver.h     | 12 ++++++++++++
> >>   lib/ethdev/ethdev_pci.h        |  2 +-
> >>   lib/ethdev/rte_class_eth.c     |  2 +-
> >>   lib/ethdev/rte_ethdev.c        |  4 ++--
> >>   lib/ethdev/rte_ethdev.h        |  4 +++-
> >>   lib/ethdev/version.map         |  1 +
> >>   9 files changed, 33 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/drivers/net/bnxt/bnxt_ethdev.c 
> >> b/drivers/net/bnxt/bnxt_ethdev.c
> >> index c6ad764813..7401dcd8b5 100644
> >> --- a/drivers/net/bnxt/bnxt_ethdev.c
> >> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> >> @@ -6612,7 +6612,8 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev)
> >>         PMD_DRV_LOG(DEBUG, "Calling Device uninit\n");
> >>   -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
> >> +
> >> +    if (rte_eth_dev_is_used(eth_dev->state))
> >>           bnxt_dev_close_op(eth_dev);
> >>         return 0;
> >> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> >> index 8d266b0e64..0df49e1f69 100644
> >> --- a/drivers/net/mlx5/mlx5.c
> >> +++ b/drivers/net/mlx5/mlx5.c
> >> @@ -3371,7 +3371,7 @@ mlx5_eth_find_next(uint16_t port_id, struct 
> >> rte_device *odev)
> >>       while (port_id < RTE_MAX_ETHPORTS) {
> >>           struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >>   -        if (dev->state != RTE_ETH_DEV_UNUSED &&
> >> +        if (rte_eth_dev_is_used(dev->state) &&
> >>               dev->device &&
> >>               (dev->device == odev ||
> >>                (dev->device->driver &&
> >> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> >> index c335a25a82..a87dbb00ff 100644
> >> --- a/lib/ethdev/ethdev_driver.c
> >> +++ b/lib/ethdev/ethdev_driver.c
> >> @@ -55,8 +55,8 @@ eth_dev_find_free_port(void)
> >>       for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> >>           /* Using shared name field to find a free port. */
> >>           if (eth_dev_shared_data->data[i].name[0] == '\0') {
> >> -            RTE_ASSERT(rte_eth_devices[i].state ==
> >> -                   RTE_ETH_DEV_UNUSED);
> >> +            RTE_ASSERT(!rte_eth_dev_is_used(
> >> +                    rte_eth_devices[i].state));
> >>               return i;
> >>           }
> >>       }
> >> @@ -221,11 +221,18 @@ rte_eth_dev_probing_finish(struct rte_eth_dev 
> >> *dev)
> >>       if (rte_eal_process_type() == RTE_PROC_SECONDARY)
> >>           eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, 
> >> dev);
> >>   +    dev->state = RTE_ETH_DEV_ALLOCATED;
> >>       rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_NEW, NULL);
> >>         dev->state = RTE_ETH_DEV_ATTACHED;
> >>   }
> >>   +bool rte_eth_dev_is_used(uint16_t dev_state)
> >> +{
> >> +    return dev_state == RTE_ETH_DEV_ALLOCATED ||
> >> +        dev_state == RTE_ETH_DEV_ATTACHED;
> >> +}
> >> +
> >>   int
> >>   rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
> >>   {
> >> @@ -243,7 +250,7 @@ rte_eth_dev_release_port(struct rte_eth_dev 
> >> *eth_dev)
> >>       if (ret != 0)
> >>           return ret;
> >>   -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
> >> +    if (rte_eth_dev_is_used(eth_dev->state))
> >>           rte_eth_dev_callback_process(eth_dev,
> >>                   RTE_ETH_EVENT_DESTROY, NULL);
> >>   diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >> index abed4784aa..aa35b65848 100644
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -1704,6 +1704,18 @@ int rte_eth_dev_callback_process(struct 
> >> rte_eth_dev *dev,
> >>   __rte_internal
> >>   void rte_eth_dev_probing_finish(struct rte_eth_dev *dev);
> >>   +/**
> >> + * Check if a Ethernet device state is used or not
> >> + *
> >> + * @param dev_state
> >> + *   The state of the Ethernet device
> >> + * @return
> >> + *   - true if the state of the Ethernet device is allocated or 
> >> attached
> >> + *   - false if this state is neither allocated nor attached
> >> + */
> >> +__rte_internal
> >> +bool rte_eth_dev_is_used(uint16_t dev_state);
> >> +
> >>   /**
> >>    * Create memzone for HW rings.
> >>    * malloc can't be used as the physical address is needed.
> >> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> >> index ec4f731270..05dec6716b 100644
> >> --- a/lib/ethdev/ethdev_pci.h
> >> +++ b/lib/ethdev/ethdev_pci.h
> >> @@ -179,7 +179,7 @@ rte_eth_dev_pci_generic_remove(struct 
> >> rte_pci_device *pci_dev,
> >>        * eth device has been released.
> >>        */
> >>       if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> >> -        eth_dev->state == RTE_ETH_DEV_UNUSED)
> >> +        !rte_eth_dev_is_used(eth_dev->state))
> >>           return 0;
> >>         if (dev_uninit) {
> >> diff --git a/lib/ethdev/rte_class_eth.c b/lib/ethdev/rte_class_eth.c
> >> index b52f1dd9f2..81e70670d9 100644
> >> --- a/lib/ethdev/rte_class_eth.c
> >> +++ b/lib/ethdev/rte_class_eth.c
> >> @@ -118,7 +118,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
> >>       const struct rte_kvargs *kvlist = arg->kvlist;
> >>       unsigned int pair;
> >>   -    if (edev->state == RTE_ETH_DEV_UNUSED)
> >> +    if (!rte_eth_dev_is_used(edev->state))
> >>           return -1;
> >>       if (arg->device != NULL && arg->device != edev->device)
> >>           return -1;
> >> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> >> index a1f7efa913..4dc66abb7b 100644
> >> --- a/lib/ethdev/rte_ethdev.c
> >> +++ b/lib/ethdev/rte_ethdev.c
> >> @@ -349,7 +349,7 @@ uint16_t
> >>   rte_eth_find_next(uint16_t port_id)
> >>   {
> >>       while (port_id < RTE_MAX_ETHPORTS &&
> >> -            rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
> >> + !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
> >>           port_id++;
> >>         if (port_id >= RTE_MAX_ETHPORTS)
> >> @@ -408,7 +408,7 @@ rte_eth_dev_is_valid_port(uint16_t port_id)
> >>       int is_valid;
> >>         if (port_id >= RTE_MAX_ETHPORTS ||
> >> -        (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
> >> +        !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
> >>           is_valid = 0;
> >>       else
> >>           is_valid = 1;
> >> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >> index a9f92006da..9cc37e8cde 100644
> >> --- a/lib/ethdev/rte_ethdev.h
> >> +++ b/lib/ethdev/rte_ethdev.h
> >> @@ -2083,10 +2083,12 @@ typedef uint16_t 
> >> (*rte_tx_callback_fn)(uint16_t port_id, uint16_t queue,
> >>   enum rte_eth_dev_state {
> >>       /** Device is unused before being probed. */
> >>       RTE_ETH_DEV_UNUSED = 0,
> >> -    /** Device is attached when allocated in probing. */
> >> +    /** Device is attached when definitely probed. */
> >>       RTE_ETH_DEV_ATTACHED,
> >>       /** Device is in removed state when plug-out is detected. */
> >>       RTE_ETH_DEV_REMOVED,
> >> +    /** Device is allocated and is set before reporting new event. */
> >> +    RTE_ETH_DEV_ALLOCATED,
> >>   };
> >>     struct rte_eth_dev_sriov {
> >> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >> index f63dc32aa2..6ecf1ab89d 100644
> >> --- a/lib/ethdev/version.map
> >> +++ b/lib/ethdev/version.map
> >> @@ -349,6 +349,7 @@ INTERNAL {
> >>       rte_eth_dev_get_by_name;
> >>       rte_eth_dev_is_rx_hairpin_queue;
> >>       rte_eth_dev_is_tx_hairpin_queue;
> >> +    rte_eth_dev_is_used;
> >>       rte_eth_dev_probing_finish;
> >>       rte_eth_dev_release_port;
> >>       rte_eth_dev_internal_reset;  
> > .  

Please resubmit for 25.03 release.
But it looks like an API/ABI change since rte_eth_dev_state is visible
to applications.

A more detailed bug report would also help
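The ABI concern can be illustrated with a minimal sketch. The enum names below are local stand-ins, not the real rte_eth_dev_state definitions: the point is that appending the new state after the last existing enumerator leaves every previously published value unchanged, which is why the patch places ALLOCATED after REMOVED.

```c
#include <assert.h>

/* Old enum layout, as applications were compiled against it. */
enum old_state { OLD_UNUSED = 0, OLD_ATTACHED, OLD_REMOVED };

/*
 * New enum layout: appending ALLOCATED after REMOVED keeps all
 * previously published enumerator values stable, so binaries built
 * against the old header still interpret the first three states
 * correctly.
 */
enum new_state { NEW_UNUSED = 0, NEW_ATTACHED, NEW_REMOVED, NEW_ALLOCATED };
```

Inserting ALLOCATED between ATTACHED and REMOVED instead would have shifted REMOVED from 2 to 3 and silently broken old binaries.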

^ permalink raw reply	[relevance 3%]

* Re: [PATCH RESEND v7 2/5] ethdev: fix skip valid port in probing callback
  2024-12-10  1:50  0%     ` lihuisong (C)
@ 2025-01-10  3:21  0%       ` lihuisong (C)
  2025-01-10 17:54  3%         ` Stephen Hemminger
  0 siblings, 1 reply; 200+ results
From: lihuisong (C) @ 2025-01-10  3:21 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, fengchengwen, liuyonglong, andrew.rybchenko, Somnath Kotur,
	Ajit Khaparde, Dariusz Sosnowski, Suanming Mou, Matan Azrad,
	Ori Kam, Viacheslav Ovsiienko, ferruh.yigit, thomas

Hi Stephen,

Can you take a look at my reply below and reconsider this patch?

/Huisong

On 2024/12/10 9:50, lihuisong (C) wrote:
> Hi Ferruh, Stephen and Thomas,
>
> Can you take a look at this patch? After all, it is an issue in the
> ethdev layer.
> This is also the outcome we discussed with Thomas and Ferruh before.
> Please go back to this thread. If we don't need this patch, please let 
> me know. I will drop it from my upstreaming list.
>
> /Huisong
>
>
On 2024/9/29 13:52, Huisong Li wrote:
>> The event callback in an application may use the macro
>> RTE_ETH_FOREACH_DEV to iterate over all enabled ports to do something
>> (like verifying the port id validity) when receiving a probing event.
>> If the ethdev state of a port is not RTE_ETH_DEV_UNUSED, this port
>> will be considered a valid port.
>>
>> However, this state is set to RTE_ETH_DEV_ATTACHED after pushing the
>> probing event. It means that the probing callback will skip this port.
>> But this assignment cannot be moved before the probing notification. See
>> commit be8cd210379a ("ethdev: fix port probing notification")
>>
>> So this patch has to add a new state, RTE_ETH_DEV_ALLOCATED. Set the
>> ethdev state to RTE_ETH_DEV_ALLOCATED before pushing the probing event
>> and set it to RTE_ETH_DEV_ATTACHED once definitely probed. This port
>> is valid if its device state is 'ALLOCATED' or 'ATTACHED'.
>>
>> In addition, the new state has to be placed after 'REMOVED' to avoid an
>> ABI break. Fortunately, this ethdev state is internal and applications
>> cannot access it directly. So this patch encapsulates an API,
>> rte_eth_dev_is_used, for ethdev or PMDs to call, eliminating concerns
>> about comparing this state enum value directly.
>>
>> Fixes: be8cd210379a ("ethdev: fix port probing notification")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
>> ---
>>   drivers/net/bnxt/bnxt_ethdev.c |  3 ++-
>>   drivers/net/mlx5/mlx5.c        |  2 +-
>>   lib/ethdev/ethdev_driver.c     | 13 ++++++++++---
>>   lib/ethdev/ethdev_driver.h     | 12 ++++++++++++
>>   lib/ethdev/ethdev_pci.h        |  2 +-
>>   lib/ethdev/rte_class_eth.c     |  2 +-
>>   lib/ethdev/rte_ethdev.c        |  4 ++--
>>   lib/ethdev/rte_ethdev.h        |  4 +++-
>>   lib/ethdev/version.map         |  1 +
>>   9 files changed, 33 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/bnxt/bnxt_ethdev.c 
>> b/drivers/net/bnxt/bnxt_ethdev.c
>> index c6ad764813..7401dcd8b5 100644
>> --- a/drivers/net/bnxt/bnxt_ethdev.c
>> +++ b/drivers/net/bnxt/bnxt_ethdev.c
>> @@ -6612,7 +6612,8 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev)
>>         PMD_DRV_LOG(DEBUG, "Calling Device uninit\n");
>>   -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
>> +
>> +    if (rte_eth_dev_is_used(eth_dev->state))
>>           bnxt_dev_close_op(eth_dev);
>>         return 0;
>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
>> index 8d266b0e64..0df49e1f69 100644
>> --- a/drivers/net/mlx5/mlx5.c
>> +++ b/drivers/net/mlx5/mlx5.c
>> @@ -3371,7 +3371,7 @@ mlx5_eth_find_next(uint16_t port_id, struct 
>> rte_device *odev)
>>       while (port_id < RTE_MAX_ETHPORTS) {
>>           struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>   -        if (dev->state != RTE_ETH_DEV_UNUSED &&
>> +        if (rte_eth_dev_is_used(dev->state) &&
>>               dev->device &&
>>               (dev->device == odev ||
>>                (dev->device->driver &&
>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>> index c335a25a82..a87dbb00ff 100644
>> --- a/lib/ethdev/ethdev_driver.c
>> +++ b/lib/ethdev/ethdev_driver.c
>> @@ -55,8 +55,8 @@ eth_dev_find_free_port(void)
>>       for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
>>           /* Using shared name field to find a free port. */
>>           if (eth_dev_shared_data->data[i].name[0] == '\0') {
>> -            RTE_ASSERT(rte_eth_devices[i].state ==
>> -                   RTE_ETH_DEV_UNUSED);
>> +            RTE_ASSERT(!rte_eth_dev_is_used(
>> +                    rte_eth_devices[i].state));
>>               return i;
>>           }
>>       }
>> @@ -221,11 +221,18 @@ rte_eth_dev_probing_finish(struct rte_eth_dev 
>> *dev)
>>       if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>>           eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, 
>> dev);
>>   +    dev->state = RTE_ETH_DEV_ALLOCATED;
>>       rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_NEW, NULL);
>>         dev->state = RTE_ETH_DEV_ATTACHED;
>>   }
>>   +bool rte_eth_dev_is_used(uint16_t dev_state)
>> +{
>> +    return dev_state == RTE_ETH_DEV_ALLOCATED ||
>> +        dev_state == RTE_ETH_DEV_ATTACHED;
>> +}
>> +
>>   int
>>   rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>>   {
>> @@ -243,7 +250,7 @@ rte_eth_dev_release_port(struct rte_eth_dev 
>> *eth_dev)
>>       if (ret != 0)
>>           return ret;
>>   -    if (eth_dev->state != RTE_ETH_DEV_UNUSED)
>> +    if (rte_eth_dev_is_used(eth_dev->state))
>>           rte_eth_dev_callback_process(eth_dev,
>>                   RTE_ETH_EVENT_DESTROY, NULL);
>>   diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>> index abed4784aa..aa35b65848 100644
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -1704,6 +1704,18 @@ int rte_eth_dev_callback_process(struct 
>> rte_eth_dev *dev,
>>   __rte_internal
>>   void rte_eth_dev_probing_finish(struct rte_eth_dev *dev);
>>   +/**
>> + * Check if a Ethernet device state is used or not
>> + *
>> + * @param dev_state
>> + *   The state of the Ethernet device
>> + * @return
>> + *   - true if the state of the Ethernet device is allocated or 
>> attached
>> + *   - false if this state is neither allocated nor attached
>> + */
>> +__rte_internal
>> +bool rte_eth_dev_is_used(uint16_t dev_state);
>> +
>>   /**
>>    * Create memzone for HW rings.
>>    * malloc can't be used as the physical address is needed.
>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>> index ec4f731270..05dec6716b 100644
>> --- a/lib/ethdev/ethdev_pci.h
>> +++ b/lib/ethdev/ethdev_pci.h
>> @@ -179,7 +179,7 @@ rte_eth_dev_pci_generic_remove(struct 
>> rte_pci_device *pci_dev,
>>        * eth device has been released.
>>        */
>>       if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
>> -        eth_dev->state == RTE_ETH_DEV_UNUSED)
>> +        !rte_eth_dev_is_used(eth_dev->state))
>>           return 0;
>>         if (dev_uninit) {
>> diff --git a/lib/ethdev/rte_class_eth.c b/lib/ethdev/rte_class_eth.c
>> index b52f1dd9f2..81e70670d9 100644
>> --- a/lib/ethdev/rte_class_eth.c
>> +++ b/lib/ethdev/rte_class_eth.c
>> @@ -118,7 +118,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
>>       const struct rte_kvargs *kvlist = arg->kvlist;
>>       unsigned int pair;
>>   -    if (edev->state == RTE_ETH_DEV_UNUSED)
>> +    if (!rte_eth_dev_is_used(edev->state))
>>           return -1;
>>       if (arg->device != NULL && arg->device != edev->device)
>>           return -1;
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>> index a1f7efa913..4dc66abb7b 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -349,7 +349,7 @@ uint16_t
>>   rte_eth_find_next(uint16_t port_id)
>>   {
>>       while (port_id < RTE_MAX_ETHPORTS &&
>> -            rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
>> + !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>>           port_id++;
>>         if (port_id >= RTE_MAX_ETHPORTS)
>> @@ -408,7 +408,7 @@ rte_eth_dev_is_valid_port(uint16_t port_id)
>>       int is_valid;
>>         if (port_id >= RTE_MAX_ETHPORTS ||
>> -        (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
>> +        !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>>           is_valid = 0;
>>       else
>>           is_valid = 1;
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index a9f92006da..9cc37e8cde 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -2083,10 +2083,12 @@ typedef uint16_t 
>> (*rte_tx_callback_fn)(uint16_t port_id, uint16_t queue,
>>   enum rte_eth_dev_state {
>>       /** Device is unused before being probed. */
>>       RTE_ETH_DEV_UNUSED = 0,
>> -    /** Device is attached when allocated in probing. */
>> +    /** Device is attached when definitely probed. */
>>       RTE_ETH_DEV_ATTACHED,
>>       /** Device is in removed state when plug-out is detected. */
>>       RTE_ETH_DEV_REMOVED,
>> +    /** Device is allocated and is set before reporting new event. */
>> +    RTE_ETH_DEV_ALLOCATED,
>>   };
>>     struct rte_eth_dev_sriov {
>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>> index f63dc32aa2..6ecf1ab89d 100644
>> --- a/lib/ethdev/version.map
>> +++ b/lib/ethdev/version.map
>> @@ -349,6 +349,7 @@ INTERNAL {
>>       rte_eth_dev_get_by_name;
>>       rte_eth_dev_is_rx_hairpin_queue;
>>       rte_eth_dev_is_tx_hairpin_queue;
>> +    rte_eth_dev_is_used;
>>       rte_eth_dev_probing_finish;
>>       rte_eth_dev_release_port;
>>       rte_eth_dev_internal_reset;
> .
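The behavior the patch introduces can be sketched in a few lines of standalone C. This is a local model of the proposed predicate, not the real DPDK symbols: a port counts as "used" both while probing (the new ALLOCATED state, set before the NEW event is reported) and once fully probed (ATTACHED), so an event callback checking port validity no longer skips a port that is still in probing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local model of the ethdev states after the patch (not the DPDK header). */
enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED, DEV_REMOVED, DEV_ALLOCATED };

/*
 * Mirrors the proposed rte_eth_dev_is_used(): before the patch, code
 * compared the state against UNUSED only, which also skipped a port in
 * the transient probing phase; the predicate treats both ALLOCATED and
 * ATTACHED as valid.
 */
bool dev_is_used(uint16_t state)
{
	return state == DEV_ALLOCATED || state == DEV_ATTACHED;
}
```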

^ permalink raw reply	[relevance 0%]

* RE: [PATCH 1/2] lib/ipsec: compile ipsec on Windows
  @ 2025-01-09 15:31  3%   ` Konstantin Ananyev
  0 siblings, 0 replies; 200+ results
From: Konstantin Ananyev @ 2025-01-09 15:31 UTC (permalink / raw)
  To: Andre Muezerie, Konstantin Ananyev, Vladimir Medvedkin; +Cc: dev



> Removed VLA for compatibility with MSVC (which does not support VLAs).
> Used alloca where no fixed constant length could be used instead.
> 
> Implementation for rte_ipsec_pkt_crypto_group and
> rte_ipsec_ses_from_crypto was moved to new file
> lib\ipsec\ipsec_group.c because these functions get exported in a
> shared library (lib\ipsec\version.map).
> 
> Implementation for rte_ipsec_pkt_crypto_prepare and
> rte_ipsec_pkt_process was moved to new file lib\ipsec\ipsec.c because
> these functions get exported in a shared library
> (lib\ipsec\version.map).

Hmm... I am not sure I understood the rationale.
To me, making inline functions non-inline first of all means ABI/API breakage,
plus it will most likely make things slower.
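The VLA-to-alloca substitution the quoted commit describes can be sketched as follows. This is a minimal standalone example, not the actual ipsec code: alloca() provides the same automatic (stack) storage duration as a VLA while remaining acceptable to MSVC (which spells it _alloca in <malloc.h>).

```c
#include <alloca.h>	/* on Windows: <malloc.h> and _alloca() */
#include <stddef.h>
#include <string.h>

size_t sum_bytes(size_t num)
{
	/*
	 * VLA version, rejected by MSVC:
	 *     unsigned char buf[num];
	 * alloca version below: same stack lifetime, no heap allocation.
	 */
	unsigned char *buf = alloca(num);
	size_t i, total = 0;

	memset(buf, 1, num);
	for (i = 0; i < num; i++)
		total += buf[i];
	return total;	/* storage released on return, like a VLA */
}
```

Note that alloca has no failure reporting and must not be called in a loop, which is one reason reviewers often prefer a fixed-size upper bound when one exists.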


^ permalink raw reply	[relevance 3%]

* Re: [PATCH v8 27/29] lib/net: replace packed attributes
  2025-01-08 12:01  3%     ` David Marchand
@ 2025-01-09  2:49  0%       ` Andre Muezerie
  0 siblings, 0 replies; 200+ results
From: Andre Muezerie @ 2025-01-09  2:49 UTC (permalink / raw)
  To: David Marchand; +Cc: roretzla, dev, Thomas Monjalon, Robin Jarry

On Wed, Jan 08, 2025 at 01:01:23PM +0100, David Marchand wrote:
> On Tue, Dec 31, 2024 at 7:40 PM Andre Muezerie
> <andremue@linux.microsoft.com> wrote:
> > diff --git a/lib/net/rte_ip6.h b/lib/net/rte_ip6.h
> > index 992ab5ee1f..92558a124a 100644
> > --- a/lib/net/rte_ip6.h
> > +++ b/lib/net/rte_ip6.h
> > @@ -358,7 +358,7 @@ enum rte_ipv6_mc_scope {
> >         RTE_IPV6_MC_SCOPE_ORGLOCAL = 0x08,
> >         /** Global multicast scope. */
> >         RTE_IPV6_MC_SCOPE_GLOBAL = 0x0e,
> > -} __rte_packed;
> > +};
> >
> >  /**
> >   * Extract the IPv6 multicast scope value as defined in RFC 4291, section 2.7.
> 
> Cc: Robin for info.
> 
> This change affects the storage size of a variable of this type (at
> least with gcc).
> I think it is ok from an ABI pov: there is one (inline) helper using
> this type, and nothing else in DPDK takes a IPv6 multicast scope as
> input.
> 
> However, it deserves a mention in the commitlog (maybe a separate
> commit to highlight it?).
> 
> 
> -- 
> David Marchand

Makes sense. I added a note about that to the commit message for that patch in the v10 series.

Thanks,
Andre Muezerie
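David's observation about storage size can be demonstrated with a small standalone sketch (local enums, not the rte_ipv6_mc_scope definition): with gcc, the packed attribute shrinks an enum to the smallest integer type that fits its enumerators, so dropping it widens variables of that type back to int.

```c
#include <assert.h>

/*
 * With gcc/clang, __attribute__((packed)) on an enum selects the
 * smallest integer type that can represent all enumerators (1 byte
 * here); without the attribute the enum has the size of int.
 */
enum packed_scope { PACKED_GLOBAL = 0x0e } __attribute__((packed));
enum plain_scope { PLAIN_GLOBAL = 0x0e };
```

This is why the change is flagged as affecting ABI in principle, even though nothing in DPDK passes the scope type across the library boundary.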

^ permalink raw reply	[relevance 0%]

* [PATCH v10 27/30] lib/net: replace packed attributes
  @ 2025-01-09  2:46  1%   ` Andre Muezerie
  0 siblings, 0 replies; 200+ results
From: Andre Muezerie @ 2025-01-09  2:46 UTC (permalink / raw)
  To: roretzla
  Cc: aman.deep.singh, anatoly.burakov, bruce.richardson, byron.marohn,
	conor.walsh, cristian.dumitrescu, david.hunt, dev, dsosnowski,
	gakhil, jerinj, jingjing.wu, kirill.rybalchenko,
	konstantin.v.ananyev, matan, mb, orika, radu.nicolau,
	ruifeng.wang, sameh.gobriel, sivaprasad.tummala, skori, stephen,
	suanmingm, vattunuru, viacheslavo, vladimir.medvedkin,
	yipeng1.wang, Andre Muezerie

MSVC struct packing is not compatible with GCC. Replace macro
__rte_packed with __rte_packed_begin, which pushes the existing pack
value and sets packing to 1 byte, and macro __rte_packed_end, which
restores the pack value prior to the push.

Macro __rte_packed_end is deliberately utilized to trigger an
MSVC compiler warning if no existing packing has been pushed, allowing
easy identification of locations where __rte_packed_begin is
missing.

This change affects the storage size of a variable of enum
rte_ipv6_mc_scope (at least with gcc). It should be OK from an ABI POV
though: there is one (inline) helper using this type, and nothing else
in DPDK takes an IPv6 multicast scope as input.
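The push/restore mechanics described above can be sketched with illustrative macros. These are stand-ins, not the actual __rte_packed_begin/__rte_packed_end definitions (the real DPDK macros are placed inside the struct declaration and differ per compiler); the sketch only shows why a pragma-based pair works on both MSVC and GCC, unlike __attribute__((packed)) which MSVC ignores.

```c
#include <stdint.h>

/*
 * Both MSVC and GCC/Clang honor pragma pack with push/pop semantics,
 * so a begin/end macro pair can bracket a declaration portably.
 */
#ifdef _MSC_VER
#define PACKED_BEGIN __pragma(pack(push, 1))
#define PACKED_END   __pragma(pack(pop))
#else
#define PACKED_BEGIN _Pragma("pack(push, 1)")
#define PACKED_END   _Pragma("pack(pop)")
#endif

PACKED_BEGIN
struct hdr {
	uint8_t type;
	uint32_t length;	/* no 3 bytes of padding inserted before this */
};
PACKED_END
```

With 1-byte packing the struct occupies 5 bytes instead of the 8 that natural alignment would give; the pop restores whatever packing was in effect before.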

Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
---
 lib/net/rte_arp.h      |  8 ++++----
 lib/net/rte_dtls.h     |  4 ++--
 lib/net/rte_esp.h      |  8 ++++----
 lib/net/rte_geneve.h   |  4 ++--
 lib/net/rte_gre.h      | 16 ++++++++--------
 lib/net/rte_gtp.h      | 20 ++++++++++----------
 lib/net/rte_ib.h       |  4 ++--
 lib/net/rte_icmp.h     | 12 ++++++------
 lib/net/rte_ip4.h      |  4 ++--
 lib/net/rte_ip6.h      | 14 +++++++-------
 lib/net/rte_l2tpv2.h   | 16 ++++++++--------
 lib/net/rte_macsec.h   |  8 ++++----
 lib/net/rte_mpls.h     |  4 ++--
 lib/net/rte_pdcp_hdr.h | 16 ++++++++--------
 lib/net/rte_ppp.h      |  4 ++--
 lib/net/rte_sctp.h     |  4 ++--
 lib/net/rte_tcp.h      |  4 ++--
 lib/net/rte_tls.h      |  4 ++--
 lib/net/rte_udp.h      |  4 ++--
 lib/net/rte_vxlan.h    | 28 ++++++++++++++--------------
 20 files changed, 93 insertions(+), 93 deletions(-)

diff --git a/lib/net/rte_arp.h b/lib/net/rte_arp.h
index 668cea1704..e885a71292 100644
--- a/lib/net/rte_arp.h
+++ b/lib/net/rte_arp.h
@@ -21,17 +21,17 @@ extern "C" {
 /**
  * ARP header IPv4 payload.
  */
-struct __rte_aligned(2) rte_arp_ipv4 {
+struct __rte_aligned(2) __rte_packed_begin rte_arp_ipv4 {
 	struct rte_ether_addr arp_sha;  /**< sender hardware address */
 	rte_be32_t            arp_sip;  /**< sender IP address */
 	struct rte_ether_addr arp_tha;  /**< target hardware address */
 	rte_be32_t            arp_tip;  /**< target IP address */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ARP header.
  */
-struct __rte_aligned(2) rte_arp_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_arp_hdr {
 	rte_be16_t arp_hardware; /**< format of hardware address */
 #define RTE_ARP_HRD_ETHER     1  /**< ARP Ethernet address format */
 
@@ -47,7 +47,7 @@ struct __rte_aligned(2) rte_arp_hdr {
 #define	RTE_ARP_OP_INVREPLY   9  /**< response identifying peer */
 
 	struct rte_arp_ipv4 arp_data;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Make a RARP packet based on MAC addr.
diff --git a/lib/net/rte_dtls.h b/lib/net/rte_dtls.h
index 246cd8a72d..1dd95ce899 100644
--- a/lib/net/rte_dtls.h
+++ b/lib/net/rte_dtls.h
@@ -30,7 +30,7 @@
  * DTLS Header
  */
 __extension__
-struct rte_dtls_hdr {
+struct __rte_packed_begin rte_dtls_hdr {
 	/** Content type of DTLS packet. Defined as RTE_DTLS_TYPE_*. */
 	uint8_t type;
 	/** DTLS Version defined as RTE_DTLS_VERSION*. */
@@ -48,6 +48,6 @@ struct rte_dtls_hdr {
 #endif
 	/** The length (in bytes) of the following DTLS packet. */
 	rte_be16_t length;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_DTLS_H */
diff --git a/lib/net/rte_esp.h b/lib/net/rte_esp.h
index 745a9847fe..2a0002f4d9 100644
--- a/lib/net/rte_esp.h
+++ b/lib/net/rte_esp.h
@@ -16,17 +16,17 @@
 /**
  * ESP Header
  */
-struct rte_esp_hdr {
+struct __rte_packed_begin rte_esp_hdr {
 	rte_be32_t spi;  /**< Security Parameters Index */
 	rte_be32_t seq;  /**< packet sequence number */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ESP Trailer
  */
-struct rte_esp_tail {
+struct __rte_packed_begin rte_esp_tail {
 	uint8_t pad_len;     /**< number of pad bytes (0-255) */
 	uint8_t next_proto;  /**< IPv4 or IPv6 or next layer header */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_ESP_H_ */
diff --git a/lib/net/rte_geneve.h b/lib/net/rte_geneve.h
index eb2c85f1e9..f962c587ee 100644
--- a/lib/net/rte_geneve.h
+++ b/lib/net/rte_geneve.h
@@ -34,7 +34,7 @@
  * More-bits (optional) variable length options.
  */
 __extension__
-struct rte_geneve_hdr {
+struct __rte_packed_begin rte_geneve_hdr {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t ver:2;		/**< Version. */
 	uint8_t opt_len:6;	/**< Options length. */
@@ -52,7 +52,7 @@ struct rte_geneve_hdr {
 	uint8_t vni[3];		/**< Virtual network identifier. */
 	uint8_t reserved2;	/**< Reserved. */
 	uint32_t opts[];	/**< Variable length options. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* GENEVE ETH next protocol types */
 #define RTE_GENEVE_TYPE_ETH	0x6558 /**< Ethernet Protocol. */
diff --git a/lib/net/rte_gre.h b/lib/net/rte_gre.h
index 1483e1b42d..768c4ce7b5 100644
--- a/lib/net/rte_gre.h
+++ b/lib/net/rte_gre.h
@@ -23,7 +23,7 @@
  * GRE Header
  */
 __extension__
-struct rte_gre_hdr {
+struct __rte_packed_begin rte_gre_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint16_t res2:4; /**< Reserved */
 	uint16_t s:1;    /**< Sequence Number Present bit */
@@ -42,28 +42,28 @@ struct rte_gre_hdr {
 	uint16_t ver:3;  /**< Version Number */
 #endif
 	rte_be16_t proto;  /**< Protocol Type */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field checksum in GRE header
  */
-struct rte_gre_hdr_opt_checksum_rsvd {
+struct __rte_packed_begin rte_gre_hdr_opt_checksum_rsvd {
 	rte_be16_t checksum;
 	rte_be16_t reserved1;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field key in GRE header
  */
-struct rte_gre_hdr_opt_key {
+struct __rte_packed_begin rte_gre_hdr_opt_key {
 	rte_be32_t key;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional field sequence in GRE header
  */
-struct rte_gre_hdr_opt_sequence {
+struct __rte_packed_begin rte_gre_hdr_opt_sequence {
 	rte_be32_t sequence;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_GRE_H_ */
diff --git a/lib/net/rte_gtp.h b/lib/net/rte_gtp.h
index ab06e23a6e..0332d35c16 100644
--- a/lib/net/rte_gtp.h
+++ b/lib/net/rte_gtp.h
@@ -24,7 +24,7 @@
  * No optional fields and next extension header.
  */
 __extension__
-struct rte_gtp_hdr {
+struct __rte_packed_begin rte_gtp_hdr {
 	union {
 		uint8_t gtp_hdr_info; /**< GTP header info */
 		struct {
@@ -48,21 +48,21 @@ struct rte_gtp_hdr {
 	uint8_t msg_type;     /**< GTP message type */
 	rte_be16_t plen;      /**< Total payload length */
 	rte_be32_t teid;      /**< Tunnel endpoint ID */
-} __rte_packed;
+} __rte_packed_end;
 
 /* Optional word of GTP header, present if any of E, S, PN is set. */
-struct rte_gtp_hdr_ext_word {
+struct __rte_packed_begin rte_gtp_hdr_ext_word {
 	rte_be16_t sqn;	      /**< Sequence Number. */
 	uint8_t npdu;	      /**< N-PDU number. */
 	uint8_t next_ext;     /**< Next Extension Header Type. */
-}  __rte_packed;
+}  __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * defined based on RFC 38415-g30.
  */
 __extension__
-struct rte_gtp_psc_generic_hdr {
+struct __rte_packed_begin rte_gtp_psc_generic_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -78,14 +78,14 @@ struct rte_gtp_psc_generic_hdr {
 	uint8_t spare:2;	/**< type specific spare bits */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * type0 defined based on RFC 38415-g30
  */
 __extension__
-struct rte_gtp_psc_type0_hdr {
+struct __rte_packed_begin rte_gtp_psc_type0_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -105,14 +105,14 @@ struct rte_gtp_psc_type0_hdr {
 	uint8_t ppp:1;		/**< Paging policy presence */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Optional extension for GTP with next_ext set to 0x85
  * type1 defined based on RFC 38415-g30
  */
 __extension__
-struct rte_gtp_psc_type1_hdr {
+struct __rte_packed_begin rte_gtp_psc_type1_hdr {
 	uint8_t ext_hdr_len;	/**< PDU ext hdr len in multiples of 4 bytes */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t type:4;		/**< PDU type */
@@ -134,7 +134,7 @@ struct rte_gtp_psc_type1_hdr {
 	uint8_t n_delay_ind:1;	/**< N3/N9 delay result presence */
 #endif
 	uint8_t data[0];	/**< variable length data fields */
-} __rte_packed;
+} __rte_packed_end;
 
 /** GTP header length */
 #define RTE_ETHER_GTP_HLEN \
diff --git a/lib/net/rte_ib.h b/lib/net/rte_ib.h
index a551f3753f..f1b455cea0 100644
--- a/lib/net/rte_ib.h
+++ b/lib/net/rte_ib.h
@@ -22,7 +22,7 @@
  * IB Specification Vol 1-Release-1.4.
  */
 __extension__
-struct rte_ib_bth {
+struct __rte_packed_begin rte_ib_bth {
 	uint8_t	opcode;		/**< Opcode. */
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t	tver:4;		/**< Transport Header Version. */
@@ -54,7 +54,7 @@ struct rte_ib_bth {
 	uint8_t	rsvd1:7;	/**< Reserved. */
 #endif
 	uint8_t	psn[3];		/**< Packet Sequence Number */
-} __rte_packed;
+} __rte_packed_end;
 
 /** RoCEv2 default port. */
 #define RTE_ROCEV2_DEFAULT_PORT 4791
diff --git a/lib/net/rte_icmp.h b/lib/net/rte_icmp.h
index e69d68ab6e..cca73b3733 100644
--- a/lib/net/rte_icmp.h
+++ b/lib/net/rte_icmp.h
@@ -21,33 +21,33 @@
 /**
  * ICMP base header
  */
-struct rte_icmp_base_hdr {
+struct __rte_packed_begin rte_icmp_base_hdr {
 	uint8_t type;
 	uint8_t code;
 	rte_be16_t checksum;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ICMP echo header
  */
-struct rte_icmp_echo_hdr {
+struct __rte_packed_begin rte_icmp_echo_hdr {
 	struct rte_icmp_base_hdr base;
 	rte_be16_t identifier;
 	rte_be16_t sequence;
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * ICMP Header
  *
  * @see rte_icmp_echo_hdr which is similar.
  */
-struct rte_icmp_hdr {
+struct __rte_packed_begin rte_icmp_hdr {
 	uint8_t  icmp_type;     /* ICMP packet type. */
 	uint8_t  icmp_code;     /* ICMP packet code. */
 	rte_be16_t icmp_cksum;  /* ICMP packet checksum. */
 	rte_be16_t icmp_ident;  /* ICMP packet identifier. */
 	rte_be16_t icmp_seq_nb; /* ICMP packet sequence number. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* ICMP packet types */
 #define RTE_ICMP_TYPE_ECHO_REPLY 0
diff --git a/lib/net/rte_ip4.h b/lib/net/rte_ip4.h
index f9b8333332..d4b38c513c 100644
--- a/lib/net/rte_ip4.h
+++ b/lib/net/rte_ip4.h
@@ -39,7 +39,7 @@ extern "C" {
 /**
  * IPv4 Header
  */
-struct __rte_aligned(2) rte_ipv4_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv4_hdr {
 	__extension__
 	union {
 		uint8_t version_ihl;    /**< version and header length */
@@ -62,7 +62,7 @@ struct __rte_aligned(2) rte_ipv4_hdr {
 	rte_be16_t hdr_checksum;	/**< header checksum */
 	rte_be32_t src_addr;		/**< source address */
 	rte_be32_t dst_addr;		/**< destination address */
-} __rte_packed;
+} __rte_packed_end;
 
 /** Create IPv4 address */
 #define RTE_IPV4(a, b, c, d) ((uint32_t)(((a) & 0xff) << 24) | \
diff --git a/lib/net/rte_ip6.h b/lib/net/rte_ip6.h
index 992ab5ee1f..92558a124a 100644
--- a/lib/net/rte_ip6.h
+++ b/lib/net/rte_ip6.h
@@ -358,7 +358,7 @@ enum rte_ipv6_mc_scope {
 	RTE_IPV6_MC_SCOPE_ORGLOCAL = 0x08,
 	/** Global multicast scope. */
 	RTE_IPV6_MC_SCOPE_GLOBAL = 0x0e,
-} __rte_packed;
+};
 
 /**
  * Extract the IPv6 multicast scope value as defined in RFC 4291, section 2.7.
@@ -461,7 +461,7 @@ rte_ether_mcast_from_ipv6(struct rte_ether_addr *mac, const struct rte_ipv6_addr
 /**
  * IPv6 Header
  */
-struct __rte_aligned(2) rte_ipv6_hdr {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_hdr {
 	union {
 		rte_be32_t vtc_flow;        /**< IP version, traffic class & flow label. */
 		__extension__
@@ -484,7 +484,7 @@ struct __rte_aligned(2) rte_ipv6_hdr {
 	uint8_t  hop_limits;	/**< Hop limits. */
 	struct rte_ipv6_addr src_addr;	/**< IP address of source host. */
 	struct rte_ipv6_addr dst_addr;	/**< IP address of destination host(s). */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * Check that the IPv6 header version field is valid according to RFC 8200 section 3.
@@ -508,7 +508,7 @@ static inline int rte_ipv6_check_version(const struct rte_ipv6_hdr *ip)
 /**
  * IPv6 Routing Extension Header
  */
-struct __rte_aligned(2) rte_ipv6_routing_ext {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_routing_ext {
 	uint8_t next_hdr;			/**< Protocol, next header. */
 	uint8_t hdr_len;			/**< Header length. */
 	uint8_t type;				/**< Extension header type. */
@@ -523,7 +523,7 @@ struct __rte_aligned(2) rte_ipv6_routing_ext {
 		};
 	};
 	/* Next are 128-bit IPv6 address fields to describe segments. */
-} __rte_packed;
+} __rte_packed_end;
 
 /* IPv6 vtc_flow: IPv / TC / flow_label */
 #define RTE_IPV6_HDR_FL_SHIFT 0
@@ -752,12 +752,12 @@ rte_ipv6_udptcp_cksum_mbuf_verify(const struct rte_mbuf *m,
 #define RTE_IPV6_SET_FRAG_DATA(fo, mf)	\
 	(((fo) & RTE_IPV6_EHDR_FO_MASK) | ((mf) & RTE_IPV6_EHDR_MF_MASK))
 
-struct __rte_aligned(2) rte_ipv6_fragment_ext {
+struct __rte_aligned(2) __rte_packed_begin rte_ipv6_fragment_ext {
 	uint8_t next_header;	/**< Next header type */
 	uint8_t reserved;	/**< Reserved */
 	rte_be16_t frag_data;	/**< All fragmentation data */
 	rte_be32_t id;		/**< Packet ID */
-} __rte_packed;
+} __rte_packed_end;
 
 /* IPv6 fragment extension header size */
 #define RTE_IPV6_FRAG_HDR_SIZE	sizeof(struct rte_ipv6_fragment_ext)
diff --git a/lib/net/rte_l2tpv2.h b/lib/net/rte_l2tpv2.h
index ac16657856..728dc01506 100644
--- a/lib/net/rte_l2tpv2.h
+++ b/lib/net/rte_l2tpv2.h
@@ -125,7 +125,7 @@ struct rte_l2tpv2_common_hdr {
  * L2TPv2 message Header contains all options(length, ns, nr,
  * offset size, offset padding).
  */
-struct rte_l2tpv2_msg_with_all_options {
+struct __rte_packed_begin rte_l2tpv2_msg_with_all_options {
 	rte_be16_t length;		/**< length(16) */
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
@@ -133,20 +133,20 @@ struct rte_l2tpv2_msg_with_all_options {
 	rte_be16_t nr;			/**< Nr(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains all options except length(ns, nr,
  * offset size, offset padding).
  */
-struct rte_l2tpv2_msg_without_length {
+struct __rte_packed_begin rte_l2tpv2_msg_without_length {
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
 	rte_be16_t ns;			/**< Ns(16) */
 	rte_be16_t nr;			/**< Nr(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains all options except ns_nr(length,
@@ -176,12 +176,12 @@ struct rte_l2tpv2_msg_without_offset {
 /**
  * L2TPv2 message Header contains options offset size and offset padding.
  */
-struct rte_l2tpv2_msg_with_offset {
+struct __rte_packed_begin rte_l2tpv2_msg_with_offset {
 	rte_be16_t tunnel_id;		/**< tunnel ID(16) */
 	rte_be16_t session_id;		/**< session ID(16) */
 	rte_be16_t offset_size;		/**< offset size(16) */
 	uint8_t   *offset_padding;	/**< offset padding(variable length) */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * L2TPv2 message Header contains options ns and nr.
@@ -213,7 +213,7 @@ struct rte_l2tpv2_msg_without_all_options {
 /**
  * L2TPv2 Combined Message Header Format: Common Header + Options
  */
-struct rte_l2tpv2_combined_msg_hdr {
+struct __rte_packed_begin rte_l2tpv2_combined_msg_hdr {
 	struct rte_l2tpv2_common_hdr common; /**< common header */
 	union {
 		/** header with all options */
@@ -233,6 +233,6 @@ struct rte_l2tpv2_combined_msg_hdr {
 		/** header without all options */
 		struct rte_l2tpv2_msg_without_all_options type7;
 	};
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* _RTE_L2TPV2_H_ */
diff --git a/lib/net/rte_macsec.h b/lib/net/rte_macsec.h
index beeeb8effe..c694c37b4b 100644
--- a/lib/net/rte_macsec.h
+++ b/lib/net/rte_macsec.h
@@ -25,7 +25,7 @@
  * MACsec Header (SecTAG)
  */
 __extension__
-struct rte_macsec_hdr {
+struct __rte_packed_begin rte_macsec_hdr {
 	/**
 	 * Tag control information and Association number of secure channel.
 	 * Various bits of TCI and AN are masked using RTE_MACSEC_TCI_* and RTE_MACSEC_AN_MASK.
@@ -39,7 +39,7 @@ struct rte_macsec_hdr {
 	uint8_t short_length:6; /**< Short Length. */
 #endif
 	rte_be32_t packet_number; /**< Packet number to support replay protection. */
-} __rte_packed;
+} __rte_packed_end;
 
 /** SCI length in MACsec header if present. */
 #define RTE_MACSEC_SCI_LEN 8
@@ -48,8 +48,8 @@ struct rte_macsec_hdr {
  * MACsec SCI header (8 bytes) after the MACsec header
  * which is present if SC bit is set in tci_an.
  */
-struct rte_macsec_sci_hdr {
+struct __rte_packed_begin rte_macsec_sci_hdr {
 	uint8_t sci[RTE_MACSEC_SCI_LEN]; /**< Optional secure channel ID. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_MACSEC_H */
diff --git a/lib/net/rte_mpls.h b/lib/net/rte_mpls.h
index 35a356efd3..53614a0b88 100644
--- a/lib/net/rte_mpls.h
+++ b/lib/net/rte_mpls.h
@@ -18,7 +18,7 @@
  * MPLS header.
  */
 __extension__
-struct rte_mpls_hdr {
+struct __rte_packed_begin rte_mpls_hdr {
 	rte_be16_t tag_msb; /**< Label(msb). */
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 	uint8_t tag_lsb:4;  /**< Label(lsb). */
@@ -30,6 +30,6 @@ struct rte_mpls_hdr {
 	uint8_t tag_lsb:4;  /**< label(lsb) */
 #endif
 	uint8_t  ttl;       /**< Time to live. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_MPLS_H_ */
diff --git a/lib/net/rte_pdcp_hdr.h b/lib/net/rte_pdcp_hdr.h
index c22b66bf93..2e8da1e1d3 100644
--- a/lib/net/rte_pdcp_hdr.h
+++ b/lib/net/rte_pdcp_hdr.h
@@ -56,7 +56,7 @@ enum rte_pdcp_pdu_type {
  * 6.2.2.1 Data PDU for SRBs
  */
 __extension__
-struct rte_pdcp_cp_data_pdu_sn_12_hdr {
+struct __rte_packed_begin rte_pdcp_cp_data_pdu_sn_12_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 	uint8_t r : 4;		/**< Reserved */
@@ -65,13 +65,13 @@ struct rte_pdcp_cp_data_pdu_sn_12_hdr {
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 #endif
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.2.2 Data PDU for DRBs and MRBs with 12 bits PDCP SN
  */
 __extension__
-struct rte_pdcp_up_data_pdu_sn_12_hdr {
+struct __rte_packed_begin rte_pdcp_up_data_pdu_sn_12_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 	uint8_t r : 3;		/**< Reserved */
@@ -82,13 +82,13 @@ struct rte_pdcp_up_data_pdu_sn_12_hdr {
 	uint8_t sn_11_8 : 4;	/**< Sequence number bits 8-11 */
 #endif
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.2.3 Data PDU for DRBs and MRBs with 18 bits PDCP SN
  */
 __extension__
-struct rte_pdcp_up_data_pdu_sn_18_hdr {
+struct __rte_packed_begin rte_pdcp_up_data_pdu_sn_18_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t sn_17_16 : 2;	/**< Sequence number bits 16-17 */
 	uint8_t r : 5;		/**< Reserved */
@@ -100,13 +100,13 @@ struct rte_pdcp_up_data_pdu_sn_18_hdr {
 #endif
 	uint8_t sn_15_8;	/**< Sequence number bits 8-15 */
 	uint8_t sn_7_0;		/**< Sequence number bits 0-7 */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * 6.2.3.1 Control PDU for PDCP status report
  */
 __extension__
-struct rte_pdcp_up_ctrl_pdu_hdr {
+struct __rte_packed_begin rte_pdcp_up_ctrl_pdu_hdr {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 	uint8_t r : 4;		/**< Reserved */
 	uint8_t pdu_type : 3;	/**< Control PDU type */
@@ -134,6 +134,6 @@ struct rte_pdcp_up_ctrl_pdu_hdr {
 	 * in the Bitmap is 1.
 	 */
 	uint8_t bitmap[];
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_PDCP_HDR_H */
diff --git a/lib/net/rte_ppp.h b/lib/net/rte_ppp.h
index 63c72a9392..02bfb03c03 100644
--- a/lib/net/rte_ppp.h
+++ b/lib/net/rte_ppp.h
@@ -17,10 +17,10 @@
 /**
  * PPP Header
  */
-struct rte_ppp_hdr {
+struct __rte_packed_begin rte_ppp_hdr {
 	uint8_t addr; /**< PPP address(8) */
 	uint8_t ctrl; /**< PPP control(8) */
 	rte_be16_t proto_id; /**< PPP protocol identifier(16) */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* _RTE_PPP_H_ */
diff --git a/lib/net/rte_sctp.h b/lib/net/rte_sctp.h
index e757c57db3..73051b94fd 100644
--- a/lib/net/rte_sctp.h
+++ b/lib/net/rte_sctp.h
@@ -21,11 +21,11 @@
 /**
  * SCTP Header
  */
-struct rte_sctp_hdr {
+struct __rte_packed_begin rte_sctp_hdr {
 	rte_be16_t src_port; /**< Source port. */
 	rte_be16_t dst_port; /**< Destin port. */
 	rte_be32_t tag;      /**< Validation tag. */
 	rte_be32_t cksum;    /**< Checksum. */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_SCTP_H_ */
diff --git a/lib/net/rte_tcp.h b/lib/net/rte_tcp.h
index 1bcacbf038..fb0eb308f5 100644
--- a/lib/net/rte_tcp.h
+++ b/lib/net/rte_tcp.h
@@ -21,7 +21,7 @@
 /**
  * TCP Header
  */
-struct rte_tcp_hdr {
+struct __rte_packed_begin rte_tcp_hdr {
 	rte_be16_t src_port; /**< TCP source port. */
 	rte_be16_t dst_port; /**< TCP destination port. */
 	rte_be32_t sent_seq; /**< TX data sequence number. */
@@ -31,7 +31,7 @@ struct rte_tcp_hdr {
 	rte_be16_t rx_win;   /**< RX flow control window. */
 	rte_be16_t cksum;    /**< TCP checksum. */
 	rte_be16_t tcp_urp;  /**< TCP urgent pointer, if any. */
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * TCP Flags
diff --git a/lib/net/rte_tls.h b/lib/net/rte_tls.h
index 595567e3e9..f27db3acb1 100644
--- a/lib/net/rte_tls.h
+++ b/lib/net/rte_tls.h
@@ -28,13 +28,13 @@
  * TLS Header
  */
 __extension__
-struct rte_tls_hdr {
+struct __rte_packed_begin rte_tls_hdr {
 	/** Content type of TLS packet. Defined as RTE_TLS_TYPE_*. */
 	uint8_t type;
 	/** TLS Version defined as RTE_TLS_VERSION*. */
 	rte_be16_t version;
 	/** The length (in bytes) of the following TLS packet. */
 	rte_be16_t length;
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_TLS_H */
diff --git a/lib/net/rte_udp.h b/lib/net/rte_udp.h
index c01dad9c9b..94f5304e6d 100644
--- a/lib/net/rte_udp.h
+++ b/lib/net/rte_udp.h
@@ -21,11 +21,11 @@
 /**
  * UDP Header
  */
-struct rte_udp_hdr {
+struct __rte_packed_begin rte_udp_hdr {
 	rte_be16_t src_port;    /**< UDP source port. */
 	rte_be16_t dst_port;    /**< UDP destination port. */
 	rte_be16_t dgram_len;   /**< UDP datagram length */
 	rte_be16_t dgram_cksum; /**< UDP datagram checksum */
-} __rte_packed;
+} __rte_packed_end;
 
 #endif /* RTE_UDP_H_ */
diff --git a/lib/net/rte_vxlan.h b/lib/net/rte_vxlan.h
index bd1c89835e..f59829b182 100644
--- a/lib/net/rte_vxlan.h
+++ b/lib/net/rte_vxlan.h
@@ -27,13 +27,13 @@
  * Reserved fields (24 bits and 8 bits)
  */
 __extension__ /* no named member in struct */
-struct rte_vxlan_hdr {
+struct __rte_packed_begin rte_vxlan_hdr {
 	union {
 		rte_be32_t vx_flags; /**< flags (8 bits) + extensions (24 bits). */
-		struct {
+		struct __rte_packed_begin {
 			union {
 				uint8_t flags; /**< Default is I bit, others are extensions. */
-				struct {
+				struct __rte_packed_begin {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 					uint8_t flag_g:1,     /**< GBP bit. */
 						flag_rsvd:1,  /*   Reserved. */
@@ -51,11 +51,11 @@ struct rte_vxlan_hdr {
 						flag_rsvd:1,
 						flag_g:1;
 #endif
-				} __rte_packed;
+				} __rte_packed_end;
 			}; /* end of 1st byte */
 			union {
 				uint8_t rsvd0[3]; /* Reserved for extensions. */
-				struct {
+				struct __rte_packed_begin {
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
 					uint8_t rsvd0_gbp1:1, /*   Reserved. */
 						flag_d:1,     /**< GBP Don't Learn bit. */
@@ -71,7 +71,7 @@ struct rte_vxlan_hdr {
 #endif
 					union {
 						uint16_t policy_id; /**< GBP Identifier. */
-						struct {
+						struct __rte_packed_begin {
 							uint8_t rsvd0_gpe; /* Reserved. */
 							uint8_t proto; /**< GPE Next protocol. */
 								/* 0x01 : IPv4
@@ -79,23 +79,23 @@ struct rte_vxlan_hdr {
 								 * 0x03 : Ethernet
 								 * 0x04 : Network Service Header
 								 */
-						} __rte_packed;
+						} __rte_packed_end;
 					};
-				} __rte_packed;
+				} __rte_packed_end;
 			};
-		} __rte_packed;
+		} __rte_packed_end;
 	}; /* end of 1st 32-bit word */
 	union {
 		rte_be32_t vx_vni; /**< VNI (24 bits) + reserved (8 bits). */
-		struct {
+		struct __rte_packed_begin {
 			uint8_t    vni[3];   /**< VXLAN Identifier. */
 			union {
 				uint8_t    rsvd1;        /**< Reserved. */
 				uint8_t    last_rsvd;    /**< Reserved. */
 			};
-		} __rte_packed;
+		} __rte_packed_end;
 	}; /* end of 2nd 32-bit word */
-} __rte_packed;
+} __rte_packed_end;
 
 /** VXLAN tunnel header length. */
 #define RTE_ETHER_VXLAN_HLEN \
@@ -111,7 +111,7 @@ struct rte_vxlan_hdr {
  * Identifier and Reserved fields (16 bits and 8 bits).
  */
 __extension__ /* no named member in struct */
-struct rte_vxlan_gpe_hdr {
+struct __rte_packed_begin rte_vxlan_gpe_hdr {
 	union {
 		struct {
 			uint8_t vx_flags;    /**< flag (8). */
@@ -127,7 +127,7 @@ struct rte_vxlan_gpe_hdr {
 			uint8_t rsvd1;    /**< Reserved. */
 		};
 	};
-} __rte_packed;
+} __rte_packed_end;
 
 /**
  * @deprecated
-- 
2.47.0.vfs.0.3


^ permalink raw reply	[relevance 1%]

* Re: [PATCH v8 27/29] lib/net: replace packed attributes
  @ 2025-01-08 12:01  3%     ` David Marchand
  2025-01-09  2:49  0%       ` Andre Muezerie
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2025-01-08 12:01 UTC (permalink / raw)
  To: Andre Muezerie; +Cc: roretzla, dev, Thomas Monjalon, Robin Jarry

On Tue, Dec 31, 2024 at 7:40 PM Andre Muezerie
<andremue@linux.microsoft.com> wrote:
> diff --git a/lib/net/rte_ip6.h b/lib/net/rte_ip6.h
> index 992ab5ee1f..92558a124a 100644
> --- a/lib/net/rte_ip6.h
> +++ b/lib/net/rte_ip6.h
> @@ -358,7 +358,7 @@ enum rte_ipv6_mc_scope {
>         RTE_IPV6_MC_SCOPE_ORGLOCAL = 0x08,
>         /** Global multicast scope. */
>         RTE_IPV6_MC_SCOPE_GLOBAL = 0x0e,
> -} __rte_packed;
> +};
>
>  /**
>   * Extract the IPv6 multicast scope value as defined in RFC 4291, section 2.7.

Cc: Robin for info.

This change affects the storage size of a variable of this type (at
least with gcc).
I think it is ok from an ABI pov: there is one (inline) helper using
this type, and nothing else in DPDK takes an IPv6 multicast scope as
input.

However, it deserves a mention in the commitlog (maybe a separate
commit to highlight it?).


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* [PATCH v2] ring: add the second version of the RTS interface
  2025-01-05 15:13  5% ` [PATCH v2] " Huichao Cai
@ 2025-01-08  1:41  3%   ` Huichao Cai
  2025-01-14 15:04  0%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Huichao Cai @ 2025-01-08  1:41 UTC (permalink / raw)
  To: thomas; +Cc: dev, honnappa.nagarahalli, konstantin.v.ananyev

Hi, Thomas
    This patch adds a field to the ABI structure. I have added the suppress_type
field in the file libabigail.abignore, but "ci/github-robot: Build" still reported
an error. Could you please advise on how to fill in the suppress_type field?
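
For reference, a libabigail suppression for a struct change typically looks like the sketch below; the struct name and offset used here are hypothetical placeholders, not the actual contents of this patch:

```
;; Hypothetical example: ignore a field inserted at the end of an internal struct.
[suppress_type]
	name = rte_ring_rts_headtail
	has_data_member_inserted_between = {offset_after(last_member), end}
```

The `name` property must match the changed type, and `has_data_member_inserted_between` bounds where new members are tolerated; if the CI still flags the change, the reported diff usually names the exact type to suppress.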


^ permalink raw reply	[relevance 3%]

* [v2 3/4] crypto/virtio: add vhost backend to virtio_user
  @ 2025-01-07 18:44  1% ` Gowrishankar Muthukrishnan
  2025-02-06 13:14  0%   ` Maxime Coquelin
  0 siblings, 1 reply; 200+ results
From: Gowrishankar Muthukrishnan @ 2025-01-07 18:44 UTC (permalink / raw)
  To: dev, Akhil Goyal, Maxime Coquelin, Chenbo Xia, Fan Zhang, Jay Zhou
  Cc: jerinj, anoobj, David Marchand, Gowrishankar Muthukrishnan

Add vhost backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 312 +++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 776 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  88 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 587 +++++++++++++
 9 files changed, 1815 insertions(+), 28 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index 8181c8296f..e5bce54cca 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -16,3 +16,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev', 'common_virtio']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index d3db4f898e..c9f20cb338 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on probe() function
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index b4bdd9800b..95a1e09dca 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -74,4 +74,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..41696c4095
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,312 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "virtio_user/vhost.h"
+#include "virtio_user/vhost_logs.h"
+
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) if fail to set up;
+ *   - (>=0) if successful.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	/* TODO: This is a workaround until there is a cleaner approach to find the crypto device status */
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings; i++) {
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.get_features = vhost_vdpa_get_features,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..ac53ca78d4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,776 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "virtio_user/vhost.h"
+#include "virtio_user/vhost_logs.h"
+#include "virtio_logs.h"
+
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* May use invalid flag, but some backend uses kickfd and
+	 * callfd as criteria to judge if dev is alive. so finally we
+	 * use real event_fd.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per virtqueue MSGs, make sure VHOST_SET_VRING_CALL come
+	 * firstly because vhost depends on this msg to allocate virtqueue
+	 * pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr = RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per-virtqueue messages, make sure VHOST_USER_SET_VRING_KICK
+	 * comes last, because vhost depends on this message to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper supports from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	/* TODO: free resource here or caller to check */
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max queue pairs from the device so that
+		 * the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl == NULL || msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (!dev->started)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_ops_vdpa;
+		dev->ops->setup = virtio_crypto_ops_vdpa.setup;
+		dev->ops->get_features = virtio_crypto_ops_vdpa.get_features;
+		dev->ops->cvq_enable = virtio_crypto_ops_vdpa.cvq_enable;
+		dev->ops->enable_qp = virtio_crypto_ops_vdpa.enable_qp;
+		dev->ops->update_link_state = virtio_crypto_ops_vdpa.update_link_state;
+		dev->ops->map_notification_area = virtio_crypto_ops_vdpa.map_notification_area;
+		dev->ops->unmap_notification_area = virtio_crypto_ops_vdpa.unmap_notification_area;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = false;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set up backend", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+static inline void *
+virtio_user_iova2virt(struct virtio_user_dev *dev __rte_unused, rte_iova_t iova)
+{
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return (void *)(uintptr_t)iova;
+	else
+		return rte_mem_iova2virt(iova);
+}
+
+static inline int
+desc_is_avail(struct vring_packed_desc *desc, bool wrap_counter)
+{
+	uint16_t flags = rte_atomic_load_explicit(&desc->flags, rte_memory_order_acquire);
+
+	return wrap_counter == !!(flags & VRING_PACKED_DESC_F_AVAIL) &&
+		wrap_counter != !!(flags & VRING_PACKED_DESC_F_USED);
+}
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated Device Status(0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..ef648fd14b
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	union {
+		struct virtio_crypto_hw hw;
+		uint8_t dummy[256];
+	};
+
+	void		*backend_data;
+	uint16_t	**notify_area;
+	char		path[PATH_MAX];
+	bool		hw_cvq;
+	uint16_t	max_queue_pairs;
+	uint64_t	device_features; /* supported features by device */
+	bool		*qp_enabled;
+
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* the negotiated features with driver,
+				   * and will be sync with device
+				   */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	struct virtqueue	*scvq;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..606639b872
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,587 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_user/vhost_logs.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should be only done in probe time.
+		 * So we skip any more request here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* unmask feature bits defined in vhost user protocol */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* rxq interrupts and config interrupt are separated in virtio-user,
+	 * here we only report config change.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function returns the queue size, i.e. the number of descriptors, of a
+ * specified queue. It differs from VHOST_USER_GET_QUEUE_NUM, which returns the
+ * max number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, each queue has same queue size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	/* For legacy devices, writing 0 to the VIRTIO_PCI_QUEUE_PFN port makes
+	 * QEMU stop the corresponding ioeventfds and reset the status of
+	 * the device.
+	 * For modern devices, setting the queue desc, avail and used addresses
+	 * in the PCI bar to 0 triggers no further behavior in QEMU.
+	 *
+	 * Here we only care about what information to deliver to vhost-user
+	 * or vhost-kernel, so we just close the ioeventfd for now.
+	 */
+
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+#define VIRTIO_USER_ARG_SERVER_MODE    "server"
+	VIRTIO_USER_ARG_SERVER_MODE,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps its default value; it should be replaced
+	 * only if the 'value' arg is parsed successfully.
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -EINVAL; /* error paths jump to "end" before ret is set */
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "error when parsing param");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_driver_id);
+RTE_PMD_REGISTER_ALIAS(crypto_virtio_user, crypto_virtio);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
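
For reference, the errno-based integer parsing used by get_integer_arg() above can be distilled into a standalone sketch (illustrative names only, independent of the DPDK kvargs API). strtoull() leaves errno at 0 on success, so the caller's default value survives any failed parse:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse 'value' into *out, leaving *out untouched unless the
 * conversion succeeds; returns 0 on success, -errno on failure. */
static int
parse_u64(const char *value, uint64_t *out)
{
	uint64_t integer;

	if (value == NULL || out == NULL)
		return -EINVAL;
	errno = 0;
	integer = strtoull(value, NULL, 0);
	if (errno == 0)
		*out = integer;
	return -errno;
}
```

Note that, like the patch's helper, this accepts any base strtoull() recognises with base 0 (so "0x10" parses as 16), and an out-of-range input fails with -ERANGE while the caller's default is preserved.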
-- 
2.25.1


^ permalink raw reply	[relevance 1%]

* RE: [PATCH v16 1/4] lib: add generic support for reading PMU events
  2024-12-06 18:15  3%       ` Konstantin Ananyev
@ 2025-01-07  7:45  0%         ` Tomasz Duszynski
  0 siblings, 0 replies; 200+ results
From: Tomasz Duszynski @ 2025-01-07  7:45 UTC (permalink / raw)
  To: Konstantin Ananyev, Thomas Monjalon
  Cc: Ruifeng.Wang, bruce.richardson, david.marchand, dev, Jerin Jacob,
	konstantin.v.ananyev, mattias.ronnblom, mb, roretzla, stephen,
	zhoumin

>> Add support for programming PMU counters and reading their values in
>> runtime bypassing kernel completely.
>>
>> This is especially useful in cases where CPU cores are isolated i.e
>> run dedicated tasks. In such cases one cannot use standard perf
>> utility without sacrificing latency and performance.
>>
>> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>> ---
>
>Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>
>As future possible enhancements - I think it would be useful to make control-
>path API MT safe, plus probably try to hide some of the exposed internal
>structures (rte_pmu_event_group, etc.) inside .c (to minimize surface for
>possible ABI breakage).
>

Thanks. Yes, sure, this series is not a one-time addition; it will be improved over time.

>> --
>> 2.34.1


^ permalink raw reply	[relevance 0%]

* [PATCH v2] ring: add the second version of the RTS interface
  2025-01-05  9:57  5% [PATCH] ring: add the second version of the RTS interface Huichao Cai
@ 2025-01-05 15:13  5% ` Huichao Cai
  2025-01-08  1:41  3%   ` Huichao Cai
  0 siblings, 1 reply; 200+ results
From: Huichao Cai @ 2025-01-05 15:13 UTC (permalink / raw)
  To: honnappa.nagarahalli, konstantin.v.ananyev, thomas; +Cc: dev

In the RTS ring, the enqueue/dequeue tail can only be updated once the
last outstanding enqueue/dequeue completes, which reduces concurrency.
Add a V2 version of the RTS interface in which the tail update is not
tied to the last enqueue/dequeue, so the tail can be advanced in a
timely manner and concurrency increases.

Add some corresponding test cases.
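
As a simplified, self-contained sketch of the idea (illustrative names
only, not the actual lib/ring implementation): each finished operation
records its completion in a per-slot cache, and the shared tail then
advances over every consecutive completed slot, instead of waiting for
the last in-flight operation as in the original RTS mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy single-threaded model of the v2 tail update: done[] plays the
 * role of the per-slot rts_cache, tail is the shared tail position. */
#define SLOTS 8

static bool done[SLOTS];
static uint32_t tail;

/* Called when the enqueue/dequeue that claimed 'slot' completes. */
static void
mark_done(uint32_t slot)
{
	done[slot] = true;
	/* Advance the tail past every consecutive completed slot;
	 * in the original RTS mode only the last in-flight thread
	 * could move it. */
	while (tail < SLOTS && done[tail])
		tail++;
}
```

Completing slots 1 and 2 before slot 0 leaves the tail at 0; once slot 0
completes, the tail jumps straight to 3.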

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 app/test/meson.build                   |   1 +
 app/test/test_ring.c                   |  26 +++
 app/test/test_ring_rts_v2_stress.c     |  32 ++++
 app/test/test_ring_stress.c            |   3 +
 app/test/test_ring_stress.h            |   1 +
 devtools/libabigail.abignore           |   6 +
 doc/guides/rel_notes/release_25_03.rst |   2 +
 lib/ring/rte_ring.c                    |  54 ++++++-
 lib/ring/rte_ring.h                    |  12 ++
 lib/ring/rte_ring_core.h               |   9 ++
 lib/ring/rte_ring_elem.h               |  18 +++
 lib/ring/rte_ring_rts.h                | 216 ++++++++++++++++++++++++-
 lib/ring/rte_ring_rts_elem_pvt.h       | 168 +++++++++++++++++++
 13 files changed, 538 insertions(+), 10 deletions(-)
 create mode 100644 app/test/test_ring_rts_v2_stress.c

diff --git a/app/test/meson.build b/app/test/meson.build
index d5cb6a7f7a..e3d8cef3fa 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -166,6 +166,7 @@ source_file_deps = {
     'test_ring_mt_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_perf.c': ['ptr_compress'],
     'test_ring_rts_stress.c': ['ptr_compress'],
+    'test_ring_rts_v2_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_stress.c': ['ptr_compress'],
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index ba1fec1de3..094f14b859 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -284,6 +284,19 @@ static const struct {
 			.felem = rte_ring_dequeue_bulk_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_bulk,
+			.felem = rte_ring_enqueue_bulk_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_bulk,
+			.felem = rte_ring_dequeue_bulk_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
@@ -349,6 +362,19 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_burst,
+			.felem = rte_ring_enqueue_burst_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_burst,
+			.felem = rte_ring_dequeue_burst_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
diff --git a/app/test/test_ring_rts_v2_stress.c b/app/test/test_ring_rts_v2_stress.c
new file mode 100644
index 0000000000..6079366a7d
--- /dev/null
+++ b/app/test/test_ring_rts_v2_stress.c
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "test_ring_stress_impl.h"
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk(r, obj, n, avail);
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk(r, obj, n, free);
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ);
+}
+
+const struct test test_ring_rts_v2_stress = {
+	.name = "MT_RTS_V2",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index 1af45e0fc8..94085acd5e 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -43,6 +43,9 @@ test_ring_stress(void)
 	n += test_ring_rts_stress.nb_case;
 	k += run_test(&test_ring_rts_stress);
 
+	n += test_ring_rts_v2_stress.nb_case;
+	k += run_test(&test_ring_rts_v2_stress);
+
 	n += test_ring_hts_stress.nb_case;
 	k += run_test(&test_ring_hts_stress);
 
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index 416d68c9a0..505957f6fb 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -34,6 +34,7 @@ struct test {
 
 extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
+extern const struct test test_ring_rts_v2_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
 extern const struct test test_ring_mt_peek_stress_zc;
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..d4dd99a99e 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,9 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+       type_kind = struct
+       name = rte_ring_rts_cache
+[suppress_type]
+       name = rte_ring_rts_headtail
+       has_data_member_inserted_between = {offset_of(head), end}
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 426dfcd982..f73bc9e397 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -102,6 +102,8 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* ring: Added ``rte_ring_rts_cache`` structure and ``rts_cache`` field to the
+  ``rte_ring_rts_headtail`` structure.
 
 Known Issues
 ------------
diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c
index aebb6d6728..ada1ae88fa 100644
--- a/lib/ring/rte_ring.c
+++ b/lib/ring/rte_ring.c
@@ -43,7 +43,8 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
 /* mask of all valid flag values to ring_create() */
 #define RING_F_MASK (RING_F_SP_ENQ | RING_F_SC_DEQ | RING_F_EXACT_SZ | \
 		     RING_F_MP_RTS_ENQ | RING_F_MC_RTS_DEQ |	       \
-		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ)
+		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ |	       \
+		     RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ)
 
 /* true if x is a power of 2 */
 #define POWEROF2(x) ((((x)-1) & (x)) == 0)
@@ -106,6 +107,7 @@ reset_headtail(void *p)
 		ht->tail = 0;
 		break;
 	case RTE_RING_SYNC_MT_RTS:
+	case RTE_RING_SYNC_MT_RTS_V2:
 		ht_rts->head.raw = 0;
 		ht_rts->tail.raw = 0;
 		break;
@@ -135,9 +137,11 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	enum rte_ring_sync_type *cons_st)
 {
 	static const uint32_t prod_st_flags =
-		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ);
+		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ |
+		RING_F_MP_RTS_V2_ENQ);
 	static const uint32_t cons_st_flags =
-		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ);
+		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ |
+		RING_F_MC_RTS_V2_DEQ);
 
 	switch (flags & prod_st_flags) {
 	case 0:
@@ -152,6 +156,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MP_HTS_ENQ:
 		*prod_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MP_RTS_V2_ENQ:
+		*prod_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -169,6 +176,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MC_HTS_DEQ:
 		*cons_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MC_RTS_V2_DEQ:
+		*cons_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -239,6 +249,28 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
 	if (flags & RING_F_MC_RTS_DEQ)
 		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
 
+	/* set default values for head-tail distance and allocate memory to cache */
+	if (flags & RING_F_MP_RTS_V2_ENQ) {
+		rte_ring_set_prod_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_prod.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_PROD_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_prod.rts_cache == NULL) {
+			RING_LOG(ERR, "Cannot reserve memory for rts prod cache");
+			return -ENOMEM;
+		}
+	}
+	if (flags & RING_F_MC_RTS_V2_DEQ) {
+		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_cons.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_CONS_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_cons.rts_cache == NULL) {
+			if (flags & RING_F_MP_RTS_V2_ENQ)
+				rte_free(r->rts_prod.rts_cache);
+			RING_LOG(ERR, "Cannot reserve memory for rts cons cache");
+			return -ENOMEM;
+		}
+	}
+
 	return 0;
 }
 
@@ -293,9 +325,14 @@ rte_ring_create_elem(const char *name, unsigned int esize, unsigned int count,
 					 mz_flags, alignof(typeof(*r)));
 	if (mz != NULL) {
 		r = mz->addr;
-		/* no need to check return value here, we already checked the
-		 * arguments above */
-		rte_ring_init(r, name, requested_count, flags);
+
+		if (rte_ring_init(r, name, requested_count, flags)) {
+			rte_free(te);
+			if (rte_memzone_free(mz) != 0)
+				RING_LOG(ERR, "Cannot free memory for ring");
+			rte_mcfg_tailq_write_unlock();
+			return NULL;
+		}
 
 		te->data = (void *) r;
 		r->memzone = mz;
@@ -358,6 +395,11 @@ rte_ring_free(struct rte_ring *r)
 
 	rte_mcfg_tailq_write_unlock();
 
+	if (r->flags & RING_F_MP_RTS_V2_ENQ)
+		rte_free(r->rts_prod.rts_cache);
+	if (r->flags & RING_F_MC_RTS_V2_DEQ)
+		rte_free(r->rts_cons.rts_cache);
+
 	if (rte_memzone_free(r->memzone) != 0)
 		RING_LOG(ERR, "Cannot free memory");
 
diff --git a/lib/ring/rte_ring.h b/lib/ring/rte_ring.h
index 11ca69c73d..2b35ce038e 100644
--- a/lib/ring/rte_ring.h
+++ b/lib/ring/rte_ring.h
@@ -89,6 +89,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -101,6 +104,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -149,6 +155,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -161,6 +170,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
index 6cd6ce9884..9e627d26c1 100644
--- a/lib/ring/rte_ring_core.h
+++ b/lib/ring/rte_ring_core.h
@@ -55,6 +55,7 @@ enum rte_ring_sync_type {
 	RTE_RING_SYNC_ST,     /**< single thread only */
 	RTE_RING_SYNC_MT_RTS, /**< multi-thread relaxed tail sync */
 	RTE_RING_SYNC_MT_HTS, /**< multi-thread head/tail sync */
+	RTE_RING_SYNC_MT_RTS_V2, /**< multi-thread relaxed tail sync v2 */
 };
 
 /**
@@ -82,11 +83,16 @@ union __rte_ring_rts_poscnt {
 	} val;
 };
 
+struct rte_ring_rts_cache {
+	volatile RTE_ATOMIC(uint32_t) num;      /**< Number of objs. */
+};
+
 struct rte_ring_rts_headtail {
 	volatile union __rte_ring_rts_poscnt tail;
 	enum rte_ring_sync_type sync_type;  /**< sync type of prod/cons */
 	uint32_t htd_max;   /**< max allowed distance between head/tail */
 	volatile union __rte_ring_rts_poscnt head;
+	struct rte_ring_rts_cache *rts_cache; /**< Cache of prod/cons */
 };
 
 union __rte_ring_hts_pos {
@@ -163,4 +169,7 @@ struct rte_ring {
 #define RING_F_MP_HTS_ENQ 0x0020 /**< The default enqueue is "MP HTS". */
 #define RING_F_MC_HTS_DEQ 0x0040 /**< The default dequeue is "MC HTS". */
 
+#define RING_F_MP_RTS_V2_ENQ 0x0080 /**< The default enqueue is "MP RTS V2". */
+#define RING_F_MC_RTS_V2_DEQ 0x0100 /**< The default dequeue is "MC RTS V2". */
+
 #endif /* _RTE_RING_CORE_H_ */
diff --git a/lib/ring/rte_ring_elem.h b/lib/ring/rte_ring_elem.h
index b96bfc003f..1352709f94 100644
--- a/lib/ring/rte_ring_elem.h
+++ b/lib/ring/rte_ring_elem.h
@@ -71,6 +71,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -83,6 +86,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -203,6 +209,9 @@ rte_ring_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_bulk_elem(r, obj_table, esize, n,
 			free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table, esize, n,
+			free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -385,6 +394,9 @@ rte_ring_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_bulk_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
@@ -571,6 +583,9 @@ rte_ring_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_burst_elem(r, obj_table, esize,
 			n, free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table, esize,
+			n, free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -681,6 +696,9 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_burst_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
diff --git a/lib/ring/rte_ring_rts.h b/lib/ring/rte_ring_rts.h
index d7a3863c83..b47e400452 100644
--- a/lib/ring/rte_ring_rts.h
+++ b/lib/ring/rte_ring_rts.h
@@ -84,6 +84,33 @@ rte_ring_mp_rts_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_FIXED, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS ring (multi-consumers safe).
  *
@@ -111,6 +138,33 @@ rte_ring_mc_rts_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_FIXED, available);
 }
 
+/**
+ * Dequeue several objects from an RTS ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -138,6 +192,33 @@ rte_ring_mp_rts_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS  ring (multi-consumers safe).
  * When the requested objects are more than the available objects,
@@ -167,6 +248,35 @@ rte_ring_mc_rts_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, available);
 }
 
+/**
+ * Dequeue several objects from an RTS ring (multi-consumers safe).
+ * When the requested objects are more than the available objects,
+ * only dequeue the actual number of objects.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -213,6 +323,52 @@ rte_ring_mc_rts_dequeue_bulk(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -261,6 +417,54 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ * When the requested objects are more than the available objects,
+ * only dequeue the actual number of objects.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
+
 /**
  * Return producer max Head-Tail-Distance (HTD).
  *
@@ -273,7 +477,8 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 static inline uint32_t
 rte_ring_get_prod_htd_max(const struct rte_ring *r)
 {
-	if (r->prod.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->prod.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_prod.htd_max;
 	return UINT32_MAX;
 }
@@ -292,7 +497,8 @@ rte_ring_get_prod_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->prod.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->prod.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_prod.htd_max = v;
@@ -311,7 +517,8 @@ rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 static inline uint32_t
 rte_ring_get_cons_htd_max(const struct rte_ring *r)
 {
-	if (r->cons.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->cons.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_cons.htd_max;
 	return UINT32_MAX;
 }
@@ -330,7 +537,8 @@ rte_ring_get_cons_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_cons_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->cons.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->cons.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_cons.htd_max = v;
diff --git a/lib/ring/rte_ring_rts_elem_pvt.h b/lib/ring/rte_ring_rts_elem_pvt.h
index 122650346b..4ce22a93ed 100644
--- a/lib/ring/rte_ring_rts_elem_pvt.h
+++ b/lib/ring/rte_ring_rts_elem_pvt.h
@@ -46,6 +46,92 @@ __rte_ring_rts_update_tail(struct rte_ring_rts_headtail *ht)
 			rte_memory_order_release, rte_memory_order_acquire) == 0);
 }
 
+/**
+ * Helper routines for the Relaxed Tail Sync (RTS) V2 ring mode.
+ * In this mode, an enqueue/dequeue that cannot yet update the tail
+ * records its count in a per-slot cache; the thread that holds the
+ * tail drains those entries, so the tail advances without waiting
+ * for the last in-flight operation.
+ */
+
+/**
+ * @internal This function updates the tail value for RTS V2 rings.
+ */
+static __rte_always_inline void
+__rte_ring_rts_v2_update_tail(struct rte_ring_rts_headtail *ht,
+	uint32_t old_tail, uint32_t num, uint32_t mask)
+{
+	union __rte_ring_rts_poscnt ot, nt;
+
+	ot.val.cnt = nt.val.cnt = 0;
+	ot.val.pos = old_tail;
+	nt.val.pos = old_tail + num;
+
+	/*
+	 * If the tail equals the position of the current enqueue/dequeue,
+	 * update the tail with the new value and then keep advancing it
+	 * while the following cache entries are non-zero; otherwise,
+	 * record the count of the current enqueue/dequeue in the cache.
+	 */
+
+	if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+				(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+				rte_memory_order_release, rte_memory_order_acquire) == 0) {
+		ot.val.pos = old_tail;
+
+		/*
+		 * Write the num of the current enqueues/dequeues to the
+		 * corresponding cache.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			num, rte_memory_order_release);
+
+		/*
+		 * Other enqueues/dequeues may compete to update the
+		 * tail. The winner continues trying to update the tail,
+		 * and the loser exits.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		/*
+		 * Set the corresponding cache to 0 for next use.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+
+	/*
+	 * Keep advancing the tail while the cache entry at the new tail
+	 * position is non-zero. Reaching this point means the current
+	 * enqueue/dequeue is updating the tail on behalf of others.
+	 */
+	while (1) {
+		num = rte_atomic_load_explicit(&ht->rts_cache[nt.val.pos & mask].num,
+			rte_memory_order_acquire);
+		if (num == 0)
+			break;
+
+		ot.val.pos = nt.val.pos;
+		nt.val.pos += num;
+
+		/*
+		 * Other enqueues/dequeues may compete to update the
+		 * tail. The winner continues trying to update the tail,
+		 * and the loser exits.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+}
+
 /**
  * @internal This function waits till head/tail distance wouldn't
  * exceed pre-defined max value.
@@ -218,6 +304,47 @@ __rte_ring_do_rts_enqueue_elem(struct rte_ring *r, const void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Enqueue several objects on the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_enqueue_elem(struct rte_ring *r, const void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *free_space)
+{
+	uint32_t free, head;
+
+	n = __rte_ring_rts_move_prod_head(r, n, behavior, &head, &free);
+
+	if (n != 0) {
+		__rte_ring_enqueue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_prod, head, n, r->mask);
+	}
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
 /**
  * @internal Dequeue several objects from the RTS ring.
  *
@@ -259,4 +386,45 @@ __rte_ring_do_rts_dequeue_elem(struct rte_ring *r, void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Dequeue several objects from the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_dequeue_elem(struct rte_ring *r, void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *available)
+{
+	uint32_t entries, head;
+
+	n = __rte_ring_rts_move_cons_head(r, n, behavior, &head, &entries);
+
+	if (n != 0) {
+		__rte_ring_dequeue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_cons, head, n, r->mask);
+	}
+
+	if (available != NULL)
+		*available = entries - n;
+	return n;
+}
+
 #endif /* _RTE_RING_RTS_ELEM_PVT_H_ */
-- 
2.27.0



* [PATCH v2] ring: add the second version of the RTS interface
@ 2025-01-05 15:09  5% Huichao Cai
  0 siblings, 0 replies; 200+ results
From: Huichao Cai @ 2025-01-05 15:09 UTC (permalink / raw)
  To: honnappa.nagarahalli, konstantin.v.ananyev, thomas; +Cc: dev

In RTS mode, the enqueue/dequeue tail can only be updated by the
last of the in-flight enqueues/dequeues, which limits concurrency.
Add a V2 version of the RTS interface in which the tail update is no
longer restricted to the last enqueue/dequeue: finished operations
record their counts so the tail can be advanced promptly, which
increases concurrency.

Add some corresponding test cases.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 app/test/meson.build                   |   1 +
 app/test/test_ring.c                   |  26 +++
 app/test/test_ring_rts_v2_stress.c     |  32 ++++
 app/test/test_ring_stress.c            |   3 +
 app/test/test_ring_stress.h            |   1 +
 devtools/libabigail.abignore           |   6 +
 doc/guides/rel_notes/release_25_03.rst |   2 +
 lib/ring/rte_ring.c                    |  54 ++++++-
 lib/ring/rte_ring.h                    |  12 ++
 lib/ring/rte_ring_core.h               |   9 ++
 lib/ring/rte_ring_elem.h               |  18 +++
 lib/ring/rte_ring_rts.h                | 216 ++++++++++++++++++++++++-
 lib/ring/rte_ring_rts_elem_pvt.h       | 168 +++++++++++++++++++
 13 files changed, 538 insertions(+), 10 deletions(-)
 create mode 100644 app/test/test_ring_rts_v2_stress.c

diff --git a/app/test/meson.build b/app/test/meson.build
index d5cb6a7f7a..e3d8cef3fa 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -166,6 +166,7 @@ source_file_deps = {
     'test_ring_mt_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_perf.c': ['ptr_compress'],
     'test_ring_rts_stress.c': ['ptr_compress'],
+    'test_ring_rts_v2_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_stress.c': ['ptr_compress'],
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index ba1fec1de3..094f14b859 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -284,6 +284,19 @@ static const struct {
 			.felem = rte_ring_dequeue_bulk_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_bulk,
+			.felem = rte_ring_enqueue_bulk_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_bulk,
+			.felem = rte_ring_dequeue_bulk_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
@@ -349,6 +362,19 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_burst,
+			.felem = rte_ring_enqueue_burst_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_burst,
+			.felem = rte_ring_dequeue_burst_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
diff --git a/app/test/test_ring_rts_v2_stress.c b/app/test/test_ring_rts_v2_stress.c
new file mode 100644
index 0000000000..6079366a7d
--- /dev/null
+++ b/app/test/test_ring_rts_v2_stress.c
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "test_ring_stress_impl.h"
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk(r, obj, n, avail);
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk(r, obj, n, free);
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ);
+}
+
+const struct test test_ring_rts_v2_stress = {
+	.name = "MT_RTS_V2",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index 1af45e0fc8..94085acd5e 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -43,6 +43,9 @@ test_ring_stress(void)
 	n += test_ring_rts_stress.nb_case;
 	k += run_test(&test_ring_rts_stress);
 
+	n += test_ring_rts_v2_stress.nb_case;
+	k += run_test(&test_ring_rts_v2_stress);
+
 	n += test_ring_hts_stress.nb_case;
 	k += run_test(&test_ring_hts_stress);
 
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index 416d68c9a0..505957f6fb 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -34,6 +34,7 @@ struct test {
 
 extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
+extern const struct test test_ring_rts_v2_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
 extern const struct test test_ring_mt_peek_stress_zc;
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..d4dd99a99e 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,9 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+       type_kind = struct
+       name = rte_ring_rts_cache
+[suppress_type]
+       name = rte_ring_rts_headtail
+       has_data_member_inserted_between = {offset_of(head), end}
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 426dfcd982..f73bc9e397 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -102,6 +102,8 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* ring: Added ``rte_ring_rts_cache`` structure and ``rts_cache`` field to the
+  ``rte_ring_rts_headtail`` structure.
 
 Known Issues
 ------------
diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c
index aebb6d6728..ada1ae88fa 100644
--- a/lib/ring/rte_ring.c
+++ b/lib/ring/rte_ring.c
@@ -43,7 +43,8 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
 /* mask of all valid flag values to ring_create() */
 #define RING_F_MASK (RING_F_SP_ENQ | RING_F_SC_DEQ | RING_F_EXACT_SZ | \
 		     RING_F_MP_RTS_ENQ | RING_F_MC_RTS_DEQ |	       \
-		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ)
+		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ |	       \
+		     RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ)
 
 /* true if x is a power of 2 */
 #define POWEROF2(x) ((((x)-1) & (x)) == 0)
@@ -106,6 +107,7 @@ reset_headtail(void *p)
 		ht->tail = 0;
 		break;
 	case RTE_RING_SYNC_MT_RTS:
+	case RTE_RING_SYNC_MT_RTS_V2:
 		ht_rts->head.raw = 0;
 		ht_rts->tail.raw = 0;
 		break;
@@ -135,9 +137,11 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	enum rte_ring_sync_type *cons_st)
 {
 	static const uint32_t prod_st_flags =
-		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ);
+		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ |
+		RING_F_MP_RTS_V2_ENQ);
 	static const uint32_t cons_st_flags =
-		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ);
+		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ |
+		RING_F_MC_RTS_V2_DEQ);
 
 	switch (flags & prod_st_flags) {
 	case 0:
@@ -152,6 +156,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MP_HTS_ENQ:
 		*prod_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MP_RTS_V2_ENQ:
+		*prod_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -169,6 +176,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MC_HTS_DEQ:
 		*cons_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MC_RTS_V2_DEQ:
+		*cons_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -239,6 +249,28 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
 	if (flags & RING_F_MC_RTS_DEQ)
 		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
 
+	/* set default values for head-tail distance and allocate memory to cache */
+	if (flags & RING_F_MP_RTS_V2_ENQ) {
+		rte_ring_set_prod_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_prod.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_PROD_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_prod.rts_cache == NULL) {
+			RING_LOG(ERR, "Cannot reserve memory for rts prod cache");
+			return -ENOMEM;
+		}
+	}
+	if (flags & RING_F_MC_RTS_V2_DEQ) {
+		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_cons.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_CONS_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_cons.rts_cache == NULL) {
+			if (flags & RING_F_MP_RTS_V2_ENQ)
+				rte_free(r->rts_prod.rts_cache);
+			RING_LOG(ERR, "Cannot reserve memory for rts cons cache");
+			return -ENOMEM;
+		}
+	}
+
 	return 0;
 }
 
@@ -293,9 +325,14 @@ rte_ring_create_elem(const char *name, unsigned int esize, unsigned int count,
 					 mz_flags, alignof(typeof(*r)));
 	if (mz != NULL) {
 		r = mz->addr;
-		/* no need to check return value here, we already checked the
-		 * arguments above */
-		rte_ring_init(r, name, requested_count, flags);
+
+		if (rte_ring_init(r, name, requested_count, flags)) {
+			rte_free(te);
+			if (rte_memzone_free(mz) != 0)
+				RING_LOG(ERR, "Cannot free memory for ring");
+			rte_mcfg_tailq_write_unlock();
+			return NULL;
+		}
 
 		te->data = (void *) r;
 		r->memzone = mz;
@@ -358,6 +395,11 @@ rte_ring_free(struct rte_ring *r)
 
 	rte_mcfg_tailq_write_unlock();
 
+	if (r->flags & RING_F_MP_RTS_V2_ENQ)
+		rte_free(r->rts_prod.rts_cache);
+	if (r->flags & RING_F_MC_RTS_V2_DEQ)
+		rte_free(r->rts_cons.rts_cache);
+
 	if (rte_memzone_free(r->memzone) != 0)
 		RING_LOG(ERR, "Cannot free memory");
 
diff --git a/lib/ring/rte_ring.h b/lib/ring/rte_ring.h
index 11ca69c73d..2b35ce038e 100644
--- a/lib/ring/rte_ring.h
+++ b/lib/ring/rte_ring.h
@@ -89,6 +89,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -101,6 +104,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -149,6 +155,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -161,6 +170,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
index 6cd6ce9884..9e627d26c1 100644
--- a/lib/ring/rte_ring_core.h
+++ b/lib/ring/rte_ring_core.h
@@ -55,6 +55,7 @@ enum rte_ring_sync_type {
 	RTE_RING_SYNC_ST,     /**< single thread only */
 	RTE_RING_SYNC_MT_RTS, /**< multi-thread relaxed tail sync */
 	RTE_RING_SYNC_MT_HTS, /**< multi-thread head/tail sync */
+	RTE_RING_SYNC_MT_RTS_V2, /**< multi-thread relaxed tail sync v2 */
 };
 
 /**
@@ -82,11 +83,16 @@ union __rte_ring_rts_poscnt {
 	} val;
 };
 
+struct rte_ring_rts_cache {
+	volatile RTE_ATOMIC(uint32_t) num;      /**< Number of objs. */
+};
+
 struct rte_ring_rts_headtail {
 	volatile union __rte_ring_rts_poscnt tail;
 	enum rte_ring_sync_type sync_type;  /**< sync type of prod/cons */
 	uint32_t htd_max;   /**< max allowed distance between head/tail */
 	volatile union __rte_ring_rts_poscnt head;
+	struct rte_ring_rts_cache *rts_cache; /**< Cache of prod/cons */
 };
 
 union __rte_ring_hts_pos {
@@ -163,4 +169,7 @@ struct rte_ring {
 #define RING_F_MP_HTS_ENQ 0x0020 /**< The default enqueue is "MP HTS". */
 #define RING_F_MC_HTS_DEQ 0x0040 /**< The default dequeue is "MC HTS". */
 
+#define RING_F_MP_RTS_V2_ENQ 0x0080 /**< The default enqueue is "MP RTS V2". */
+#define RING_F_MC_RTS_V2_DEQ 0x0100 /**< The default dequeue is "MC RTS V2". */
+
 #endif /* _RTE_RING_CORE_H_ */
diff --git a/lib/ring/rte_ring_elem.h b/lib/ring/rte_ring_elem.h
index b96bfc003f..1352709f94 100644
--- a/lib/ring/rte_ring_elem.h
+++ b/lib/ring/rte_ring_elem.h
@@ -71,6 +71,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -83,6 +86,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -203,6 +209,9 @@ rte_ring_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_bulk_elem(r, obj_table, esize, n,
 			free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table, esize, n,
+			free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -385,6 +394,9 @@ rte_ring_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_bulk_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
@@ -571,6 +583,9 @@ rte_ring_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_burst_elem(r, obj_table, esize,
 			n, free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table, esize,
+			n, free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -681,6 +696,9 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_burst_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
diff --git a/lib/ring/rte_ring_rts.h b/lib/ring/rte_ring_rts.h
index d7a3863c83..b47e400452 100644
--- a/lib/ring/rte_ring_rts.h
+++ b/lib/ring/rte_ring_rts.h
@@ -84,6 +84,33 @@ rte_ring_mp_rts_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_FIXED, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS ring (multi-consumers safe).
  *
@@ -111,6 +138,33 @@ rte_ring_mc_rts_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_FIXED, available);
 }
 
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -138,6 +192,33 @@ rte_ring_mp_rts_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS  ring (multi-consumers safe).
  * When the requested objects are more than the available objects,
@@ -167,6 +248,35 @@ rte_ring_mc_rts_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, available);
 }
 
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ * When fewer than the requested number of objects are available,
+ * only the available objects are dequeued.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -213,6 +323,52 @@ rte_ring_mc_rts_dequeue_bulk(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -261,6 +417,54 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ * When the requested objects are more than the available objects,
+ * only dequeue the actual number of objects.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
+
 /**
  * Return producer max Head-Tail-Distance (HTD).
  *
@@ -273,7 +477,8 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 static inline uint32_t
 rte_ring_get_prod_htd_max(const struct rte_ring *r)
 {
-	if (r->prod.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->prod.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_prod.htd_max;
 	return UINT32_MAX;
 }
@@ -292,7 +497,8 @@ rte_ring_get_prod_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->prod.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->prod.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_prod.htd_max = v;
@@ -311,7 +517,8 @@ rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 static inline uint32_t
 rte_ring_get_cons_htd_max(const struct rte_ring *r)
 {
-	if (r->cons.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->cons.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_cons.htd_max;
 	return UINT32_MAX;
 }
@@ -330,7 +537,8 @@ rte_ring_get_cons_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_cons_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->cons.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->cons.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_cons.htd_max = v;
diff --git a/lib/ring/rte_ring_rts_elem_pvt.h b/lib/ring/rte_ring_rts_elem_pvt.h
index 122650346b..4ce22a93ed 100644
--- a/lib/ring/rte_ring_rts_elem_pvt.h
+++ b/lib/ring/rte_ring_rts_elem_pvt.h
@@ -46,6 +46,92 @@ __rte_ring_rts_update_tail(struct rte_ring_rts_headtail *ht)
 			rte_memory_order_release, rte_memory_order_acquire) == 0);
 }
 
+/**
+ * @file rte_ring_rts_elem_pvt.h
+ * It is not recommended to include this file directly,
+ * include <rte_ring.h> instead.
+ * Contains internal helper functions for Relaxed Tail Sync (RTS) ring mode.
+ * For more information please refer to <rte_ring_rts.h>.
+ */
+
+/**
+ * @internal This function updates tail values.
+ */
+static __rte_always_inline void
+__rte_ring_rts_v2_update_tail(struct rte_ring_rts_headtail *ht,
+	uint32_t old_tail, uint32_t num, uint32_t mask)
+{
+	union __rte_ring_rts_poscnt ot, nt;
+
+	ot.val.cnt = nt.val.cnt = 0;
+	ot.val.pos = old_tail;
+	nt.val.pos = old_tail + num;
+
+	/*
+	 * If the tail equals this operation's old position, advance the
+	 * tail and keep advancing it while the cache entry at the new
+	 * position is non-zero; otherwise, record the count of the current
+	 * enqueue/dequeue in the cache.
+	 */
+
+	if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+				(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+				rte_memory_order_release, rte_memory_order_acquire) == 0) {
+		ot.val.pos = old_tail;
+
+		/*
+		 * Record the count of the current enqueue/dequeue in the
+		 * corresponding cache entry.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			num, rte_memory_order_release);
+
+		/*
+		 * Another enqueue/dequeue may compete for the tail update.
+		 * The winner keeps trying to advance the tail; the loser
+		 * exits.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		/*
+		 * Reset the corresponding cache entry to 0 for the next use.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+
+	/*
+	 * Keep advancing the tail while the cache entry at the new position
+	 * is non-zero. Reaching this point means the current
+	 * enqueue/dequeue is advancing the tail on behalf of other
+	 * enqueues/dequeues.
+	 */
+	while (1) {
+		num = rte_atomic_load_explicit(&ht->rts_cache[nt.val.pos & mask].num,
+			rte_memory_order_acquire);
+		if (num == 0)
+			break;
+
+		ot.val.pos = nt.val.pos;
+		nt.val.pos += num;
+
+		/*
+		 * Another enqueue/dequeue may compete for the tail update.
+		 * The winner keeps trying to advance the tail; the loser
+		 * exits.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+}
+
 /**
  * @internal This function waits till head/tail distance wouldn't
  * exceed pre-defined max value.
@@ -218,6 +304,47 @@ __rte_ring_do_rts_enqueue_elem(struct rte_ring *r, const void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Enqueue several objects on the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_enqueue_elem(struct rte_ring *r, const void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *free_space)
+{
+	uint32_t free, head;
+
+	n =  __rte_ring_rts_move_prod_head(r, n, behavior, &head, &free);
+
+	if (n != 0) {
+		__rte_ring_enqueue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_prod, head, n, r->mask);
+	}
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
 /**
  * @internal Dequeue several objects from the RTS ring.
  *
@@ -259,4 +386,45 @@ __rte_ring_do_rts_dequeue_elem(struct rte_ring *r, void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Dequeue several objects from the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_dequeue_elem(struct rte_ring *r, void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *available)
+{
+	uint32_t entries, head;
+
+	n = __rte_ring_rts_move_cons_head(r, n, behavior, &head, &entries);
+
+	if (n != 0) {
+		__rte_ring_dequeue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_cons, head, n, r->mask);
+	}
+
+	if (available != NULL)
+		*available = entries - n;
+	return n;
+}
+
 #endif /* _RTE_RING_RTS_ELEM_PVT_H_ */
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* [PATCH] ring: add the second version of the RTS interface
@ 2025-01-05  9:57  5% Huichao Cai
  2025-01-05 15:13  5% ` [PATCH v2] " Huichao Cai
  0 siblings, 1 reply; 200+ results
From: Huichao Cai @ 2025-01-05  9:57 UTC (permalink / raw)
  To: honnappa.nagarahalli, konstantin.v.ananyev; +Cc: dev

In the existing RTS mode, the enqueues/dequeues tail can only be
updated by the last outstanding enqueue/dequeue, which reduces
concurrency. Add a V2 version of the RTS interface in which a
finished enqueue/dequeue records its element count, so the tail
no longer has to wait for the last enqueue/dequeue and can be
advanced promptly, which increases concurrency.

Add some corresponding test cases.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 app/test/meson.build                   |   1 +
 app/test/test_ring.c                   |  26 +++
 app/test/test_ring_rts_v2_stress.c     |  32 ++++
 app/test/test_ring_stress.c            |   3 +
 app/test/test_ring_stress.h            |   1 +
 devtools/libabigail.abignore           |   3 +
 doc/guides/rel_notes/release_25_03.rst |   2 +
 lib/ring/rte_ring.c                    |  53 +++++-
 lib/ring/rte_ring.h                    |  12 ++
 lib/ring/rte_ring_core.h               |   9 ++
 lib/ring/rte_ring_elem.h               |  18 +++
 lib/ring/rte_ring_rts.h                | 216 ++++++++++++++++++++++++-
 lib/ring/rte_ring_rts_elem_pvt.h       | 168 +++++++++++++++++++
 13 files changed, 534 insertions(+), 10 deletions(-)
 create mode 100644 app/test/test_ring_rts_v2_stress.c

diff --git a/app/test/meson.build b/app/test/meson.build
index d5cb6a7f7a..e3d8cef3fa 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -166,6 +166,7 @@ source_file_deps = {
     'test_ring_mt_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_perf.c': ['ptr_compress'],
     'test_ring_rts_stress.c': ['ptr_compress'],
+    'test_ring_rts_v2_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress.c': ['ptr_compress'],
     'test_ring_st_peek_stress_zc.c': ['ptr_compress'],
     'test_ring_stress.c': ['ptr_compress'],
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index ba1fec1de3..094f14b859 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -284,6 +284,19 @@ static const struct {
 			.felem = rte_ring_dequeue_bulk_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_bulk,
+			.felem = rte_ring_enqueue_bulk_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_bulk,
+			.felem = rte_ring_dequeue_bulk_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BULK | TEST_RING_THREAD_DEF,
@@ -349,6 +362,19 @@ static const struct {
 			.felem = rte_ring_dequeue_burst_elem,
 		},
 	},
+	{
+		.desc = "MP_RTS/MC_RTS V2 sync mode",
+		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
+		.create_flags = RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ,
+		.enq = {
+			.flegacy = rte_ring_enqueue_burst,
+			.felem = rte_ring_enqueue_burst_elem,
+		},
+		.deq = {
+			.flegacy = rte_ring_dequeue_burst,
+			.felem = rte_ring_dequeue_burst_elem,
+		},
+	},
 	{
 		.desc = "MP_HTS/MC_HTS sync mode",
 		.api_type = TEST_RING_ELEM_BURST | TEST_RING_THREAD_DEF,
diff --git a/app/test/test_ring_rts_v2_stress.c b/app/test/test_ring_rts_v2_stress.c
new file mode 100644
index 0000000000..6079366a7d
--- /dev/null
+++ b/app/test/test_ring_rts_v2_stress.c
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include "test_ring_stress_impl.h"
+
+static inline uint32_t
+_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
+	uint32_t *avail)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk(r, obj, n, avail);
+}
+
+static inline uint32_t
+_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
+	uint32_t *free)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk(r, obj, n, free);
+}
+
+static int
+_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
+{
+	return rte_ring_init(r, name, num,
+		RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ);
+}
+
+const struct test test_ring_rts_v2_stress = {
+	.name = "MT_RTS_V2",
+	.nb_case = RTE_DIM(tests),
+	.cases = tests,
+};
diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
index 1af45e0fc8..94085acd5e 100644
--- a/app/test/test_ring_stress.c
+++ b/app/test/test_ring_stress.c
@@ -43,6 +43,9 @@ test_ring_stress(void)
 	n += test_ring_rts_stress.nb_case;
 	k += run_test(&test_ring_rts_stress);
 
+	n += test_ring_rts_v2_stress.nb_case;
+	k += run_test(&test_ring_rts_v2_stress);
+
 	n += test_ring_hts_stress.nb_case;
 	k += run_test(&test_ring_hts_stress);
 
diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
index 416d68c9a0..505957f6fb 100644
--- a/app/test/test_ring_stress.h
+++ b/app/test/test_ring_stress.h
@@ -34,6 +34,7 @@ struct test {
 
 extern const struct test test_ring_mpmc_stress;
 extern const struct test test_ring_rts_stress;
+extern const struct test test_ring_rts_v2_stress;
 extern const struct test test_ring_hts_stress;
 extern const struct test test_ring_mt_peek_stress;
 extern const struct test test_ring_mt_peek_stress_zc;
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..0a0f305acb 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,6 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+       name = rte_ring_rts_headtail
+       has_data_member_inserted_between = {offset_of(head), end}
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 426dfcd982..f73bc9e397 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -102,6 +102,8 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* ring: Added ``rte_ring_rts_cache`` structure and ``rts_cache`` field to the
+  ``rte_ring_rts_headtail`` structure.
 
 Known Issues
 ------------
diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c
index aebb6d6728..df84592300 100644
--- a/lib/ring/rte_ring.c
+++ b/lib/ring/rte_ring.c
@@ -43,7 +43,8 @@ EAL_REGISTER_TAILQ(rte_ring_tailq)
 /* mask of all valid flag values to ring_create() */
 #define RING_F_MASK (RING_F_SP_ENQ | RING_F_SC_DEQ | RING_F_EXACT_SZ | \
 		     RING_F_MP_RTS_ENQ | RING_F_MC_RTS_DEQ |	       \
-		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ)
+		     RING_F_MP_HTS_ENQ | RING_F_MC_HTS_DEQ |	       \
+		     RING_F_MP_RTS_V2_ENQ | RING_F_MC_RTS_V2_DEQ)
 
 /* true if x is a power of 2 */
 #define POWEROF2(x) ((((x)-1) & (x)) == 0)
@@ -106,6 +107,7 @@ reset_headtail(void *p)
 		ht->tail = 0;
 		break;
 	case RTE_RING_SYNC_MT_RTS:
+	case RTE_RING_SYNC_MT_RTS_V2:
 		ht_rts->head.raw = 0;
 		ht_rts->tail.raw = 0;
 		break;
@@ -135,9 +137,11 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	enum rte_ring_sync_type *cons_st)
 {
 	static const uint32_t prod_st_flags =
-		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ);
+		(RING_F_SP_ENQ | RING_F_MP_RTS_ENQ | RING_F_MP_HTS_ENQ |
+		RING_F_MP_RTS_V2_ENQ);
 	static const uint32_t cons_st_flags =
-		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ);
+		(RING_F_SC_DEQ | RING_F_MC_RTS_DEQ | RING_F_MC_HTS_DEQ |
+		RING_F_MC_RTS_V2_DEQ);
 
 	switch (flags & prod_st_flags) {
 	case 0:
@@ -152,6 +156,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MP_HTS_ENQ:
 		*prod_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MP_RTS_V2_ENQ:
+		*prod_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -169,6 +176,9 @@ get_sync_type(uint32_t flags, enum rte_ring_sync_type *prod_st,
 	case RING_F_MC_HTS_DEQ:
 		*cons_st = RTE_RING_SYNC_MT_HTS;
 		break;
+	case RING_F_MC_RTS_V2_DEQ:
+		*cons_st = RTE_RING_SYNC_MT_RTS_V2;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -239,6 +249,28 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
 	if (flags & RING_F_MC_RTS_DEQ)
 		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
 
+	/* set default values for head-tail distance and allocate memory to cache */
+	if (flags & RING_F_MP_RTS_V2_ENQ) {
+		rte_ring_set_prod_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_prod.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_PROD_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_prod.rts_cache == NULL) {
+			RING_LOG(ERR, "Cannot reserve memory for rts prod cache");
+			return -ENOMEM;
+		}
+	}
+	if (flags & RING_F_MC_RTS_V2_DEQ) {
+		rte_ring_set_cons_htd_max(r, r->capacity / HTD_MAX_DEF);
+		r->rts_cons.rts_cache = (struct rte_ring_rts_cache *)rte_zmalloc(
+			"RTS_CONS_CACHE", sizeof(struct rte_ring_rts_cache) * r->size, 0);
+		if (r->rts_cons.rts_cache == NULL) {
+			if (flags & RING_F_MP_RTS_V2_ENQ)
+				rte_free(r->rts_prod.rts_cache);
+			RING_LOG(ERR, "Cannot reserve memory for rts cons cache");
+			return -ENOMEM;
+		}
+	}
+
 	return 0;
 }
 
@@ -293,9 +325,13 @@ rte_ring_create_elem(const char *name, unsigned int esize, unsigned int count,
 					 mz_flags, alignof(typeof(*r)));
 	if (mz != NULL) {
 		r = mz->addr;
-		/* no need to check return value here, we already checked the
-		 * arguments above */
-		rte_ring_init(r, name, requested_count, flags);
+
+		if (rte_ring_init(r, name, requested_count, flags)) {
+			rte_free(te);
+			if (rte_memzone_free(mz) != 0)
+				RING_LOG(ERR, "Cannot free memory for ring");
+			return NULL;
+		}
 
 		te->data = (void *) r;
 		r->memzone = mz;
@@ -358,6 +394,11 @@ rte_ring_free(struct rte_ring *r)
 
 	rte_mcfg_tailq_write_unlock();
 
+	if (r->flags & RING_F_MP_RTS_V2_ENQ)
+		rte_free(r->rts_prod.rts_cache);
+	if (r->flags & RING_F_MC_RTS_V2_DEQ)
+		rte_free(r->rts_cons.rts_cache);
+
 	if (rte_memzone_free(r->memzone) != 0)
 		RING_LOG(ERR, "Cannot free memory");
 
diff --git a/lib/ring/rte_ring.h b/lib/ring/rte_ring.h
index 11ca69c73d..2b35ce038e 100644
--- a/lib/ring/rte_ring.h
+++ b/lib/ring/rte_ring.h
@@ -89,6 +89,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -101,6 +104,9 @@ ssize_t rte_ring_get_memsize(unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -149,6 +155,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -161,6 +170,9 @@ int rte_ring_init(struct rte_ring *r, const char *name, unsigned int count,
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
index 6cd6ce9884..9e627d26c1 100644
--- a/lib/ring/rte_ring_core.h
+++ b/lib/ring/rte_ring_core.h
@@ -55,6 +55,7 @@ enum rte_ring_sync_type {
 	RTE_RING_SYNC_ST,     /**< single thread only */
 	RTE_RING_SYNC_MT_RTS, /**< multi-thread relaxed tail sync */
 	RTE_RING_SYNC_MT_HTS, /**< multi-thread head/tail sync */
+	RTE_RING_SYNC_MT_RTS_V2, /**< multi-thread relaxed tail sync v2 */
 };
 
 /**
@@ -82,11 +83,16 @@ union __rte_ring_rts_poscnt {
 	} val;
 };
 
+struct rte_ring_rts_cache {
+	volatile RTE_ATOMIC(uint32_t) num;      /**< Number of objs. */
+};
+
 struct rte_ring_rts_headtail {
 	volatile union __rte_ring_rts_poscnt tail;
 	enum rte_ring_sync_type sync_type;  /**< sync type of prod/cons */
 	uint32_t htd_max;   /**< max allowed distance between head/tail */
 	volatile union __rte_ring_rts_poscnt head;
+	struct rte_ring_rts_cache *rts_cache; /**< Cache of prod/cons */
 };
 
 union __rte_ring_hts_pos {
@@ -163,4 +169,7 @@ struct rte_ring {
 #define RING_F_MP_HTS_ENQ 0x0020 /**< The default enqueue is "MP HTS". */
 #define RING_F_MC_HTS_DEQ 0x0040 /**< The default dequeue is "MC HTS". */
 
+#define RING_F_MP_RTS_V2_ENQ 0x0080 /**< The default enqueue is "MP RTS V2". */
+#define RING_F_MC_RTS_V2_DEQ 0x0100 /**< The default dequeue is "MC RTS V2". */
+
 #endif /* _RTE_RING_CORE_H_ */
diff --git a/lib/ring/rte_ring_elem.h b/lib/ring/rte_ring_elem.h
index b96bfc003f..1352709f94 100644
--- a/lib/ring/rte_ring_elem.h
+++ b/lib/ring/rte_ring_elem.h
@@ -71,6 +71,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MP_RTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer RTS mode".
+ *      - RING_F_MP_RTS_V2_ENQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
+ *        is "multi-producer RTS V2 mode".
  *      - RING_F_MP_HTS_ENQ: If this flag is set, the default behavior when
  *        using ``rte_ring_enqueue()`` or ``rte_ring_enqueue_bulk()``
  *        is "multi-producer HTS mode".
@@ -83,6 +86,9 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
  *      - RING_F_MC_RTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer RTS mode".
+ *      - RING_F_MC_RTS_V2_DEQ: If this flag is set, the default behavior when
+ *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
+ *        is "multi-consumer RTS V2 mode".
  *      - RING_F_MC_HTS_DEQ: If this flag is set, the default behavior when
  *        using ``rte_ring_dequeue()`` or ``rte_ring_dequeue_bulk()``
  *        is "multi-consumer HTS mode".
@@ -203,6 +209,9 @@ rte_ring_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_bulk_elem(r, obj_table, esize, n,
 			free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table, esize, n,
+			free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -385,6 +394,9 @@ rte_ring_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_bulk_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
@@ -571,6 +583,9 @@ rte_ring_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mp_hts_enqueue_burst_elem(r, obj_table, esize,
 			n, free_space);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table, esize,
+			n, free_space);
 	}
 
 	/* valid ring should never reach this point */
@@ -681,6 +696,9 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 	case RTE_RING_SYNC_MT_HTS:
 		return rte_ring_mc_hts_dequeue_burst_elem(r, obj_table, esize,
 			n, available);
+	case RTE_RING_SYNC_MT_RTS_V2:
+		return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table, esize,
+			n, available);
 	}
 
 	/* valid ring should never reach this point */
diff --git a/lib/ring/rte_ring_rts.h b/lib/ring/rte_ring_rts.h
index d7a3863c83..b47e400452 100644
--- a/lib/ring/rte_ring_rts.h
+++ b/lib/ring/rte_ring_rts.h
@@ -84,6 +84,33 @@ rte_ring_mp_rts_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_FIXED, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS ring (multi-consumers safe).
  *
@@ -111,6 +138,33 @@ rte_ring_mc_rts_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_FIXED, available);
 }
 
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_FIXED, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -138,6 +192,33 @@ rte_ring_mp_rts_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, free_space);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   if non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst_elem(struct rte_ring *r, const void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *free_space)
+{
+	return __rte_ring_do_rts_v2_enqueue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, free_space);
+}
+
 /**
  * Dequeue several objects from an RTS  ring (multi-consumers safe).
  * When the requested objects are more than the available objects,
@@ -167,6 +248,35 @@ rte_ring_mc_rts_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
 			RTE_RING_QUEUE_VARIABLE, available);
 }
 
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ * When the requested objects are more than the available objects,
+ * only dequeue the actual number of objects.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects that will be filled.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst_elem(struct rte_ring *r, void *obj_table,
+	unsigned int esize, unsigned int n, unsigned int *available)
+{
+	return __rte_ring_do_rts_v2_dequeue_elem(r, obj_table, esize, n,
+			RTE_RING_QUEUE_VARIABLE, available);
+}
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -213,6 +323,52 @@ rte_ring_mc_rts_dequeue_bulk(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   The number of objects enqueued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   The number of objects dequeued, either 0 or n
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_bulk_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
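The bulk and burst wrappers above differ only in queue behavior: bulk (RTE_RING_QUEUE_FIXED) enqueues either all n objects or none, while burst (RTE_RING_QUEUE_VARIABLE) enqueues as many as fit. A minimal single-threaded sketch of that contract (the `toy_*` names are hypothetical, not part of this patch or of DPDK):

```c
#include <assert.h>
#include <string.h>

/* Toy single-threaded ring model illustrating the two queue behaviors:
 * bulk (RTE_RING_QUEUE_FIXED) enqueues all n objects or none,
 * burst (RTE_RING_QUEUE_VARIABLE) enqueues as many as fit. */
#define TOY_CAP 4

struct toy_ring {
	void *objs[TOY_CAP];
	unsigned int count;
};

static unsigned int
toy_enqueue_burst(struct toy_ring *r, void *const *tbl, unsigned int n)
{
	unsigned int avail = TOY_CAP - r->count;

	if (n > avail)
		n = avail;		/* partial enqueue allowed */
	memcpy(&r->objs[r->count], tbl, n * sizeof(*tbl));
	r->count += n;
	return n;			/* actual number enqueued */
}

static unsigned int
toy_enqueue_bulk(struct toy_ring *r, void *const *tbl, unsigned int n)
{
	if (n > TOY_CAP - r->count)
		return 0;		/* fixed behavior: all or nothing */
	return toy_enqueue_burst(r, tbl, n);
}
```

The return-value conventions match the doc comments above: burst returns the actual count, bulk returns either 0 or n.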
+
 /**
  * Enqueue several objects on the RTS ring (multi-producers safe).
  *
@@ -261,6 +417,54 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 			sizeof(uintptr_t), n, available);
 }
 
+/**
+ * Enqueue several objects on the RTS V2 ring (multi-producers safe).
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param free_space
+ *   If non-NULL, returns the amount of space in the ring after the
+ *   enqueue operation has finished.
+ * @return
+ *   - n: Actual number of objects enqueued.
+ */
+static __rte_always_inline unsigned int
+rte_ring_mp_rts_v2_enqueue_burst(struct rte_ring *r, void * const *obj_table,
+			 unsigned int n, unsigned int *free_space)
+{
+	return rte_ring_mp_rts_v2_enqueue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, free_space);
+}
+
+/**
+ * Dequeue several objects from an RTS V2 ring (multi-consumers safe).
+ * When the requested objects are more than the available objects,
+ * only dequeue the actual number of objects.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects) that will be filled.
+ * @param n
+ *   The number of objects to dequeue from the ring to the obj_table.
+ * @param available
+ *   If non-NULL, returns the number of remaining ring entries after the
+ *   dequeue has finished.
+ * @return
+ *   - n: Actual number of objects dequeued, 0 if ring is empty
+ */
+static __rte_always_inline unsigned int
+rte_ring_mc_rts_v2_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
+{
+	return rte_ring_mc_rts_v2_dequeue_burst_elem(r, obj_table,
+			sizeof(uintptr_t), n, available);
+}
+
 /**
  * Return producer max Head-Tail-Distance (HTD).
  *
@@ -273,7 +477,8 @@ rte_ring_mc_rts_dequeue_burst(struct rte_ring *r, void **obj_table,
 static inline uint32_t
 rte_ring_get_prod_htd_max(const struct rte_ring *r)
 {
-	if (r->prod.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->prod.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_prod.htd_max;
 	return UINT32_MAX;
 }
@@ -292,7 +497,8 @@ rte_ring_get_prod_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->prod.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->prod.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->prod.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_prod.htd_max = v;
@@ -311,7 +517,8 @@ rte_ring_set_prod_htd_max(struct rte_ring *r, uint32_t v)
 static inline uint32_t
 rte_ring_get_cons_htd_max(const struct rte_ring *r)
 {
-	if (r->cons.sync_type == RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type == RTE_RING_SYNC_MT_RTS) ||
+			(r->cons.sync_type == RTE_RING_SYNC_MT_RTS_V2))
 		return r->rts_cons.htd_max;
 	return UINT32_MAX;
 }
@@ -330,7 +537,8 @@ rte_ring_get_cons_htd_max(const struct rte_ring *r)
 static inline int
 rte_ring_set_cons_htd_max(struct rte_ring *r, uint32_t v)
 {
-	if (r->cons.sync_type != RTE_RING_SYNC_MT_RTS)
+	if ((r->cons.sync_type != RTE_RING_SYNC_MT_RTS) &&
+			(r->cons.sync_type != RTE_RING_SYNC_MT_RTS_V2))
 		return -ENOTSUP;
 
 	r->rts_cons.htd_max = v;
diff --git a/lib/ring/rte_ring_rts_elem_pvt.h b/lib/ring/rte_ring_rts_elem_pvt.h
index 122650346b..4ce22a93ed 100644
--- a/lib/ring/rte_ring_rts_elem_pvt.h
+++ b/lib/ring/rte_ring_rts_elem_pvt.h
@@ -46,6 +46,92 @@ __rte_ring_rts_update_tail(struct rte_ring_rts_headtail *ht)
 			rte_memory_order_release, rte_memory_order_acquire) == 0);
 }
 
+/**
+ * @file rte_ring_rts_elem_pvt.h
+ * It is not recommended to include this file directly,
+ * include <rte_ring.h> instead.
+ * Contains internal helper functions for Relaxed Tail Sync (RTS) ring mode.
+ * For more information please refer to <rte_ring_rts.h>.
+ */
+
+/**
+ * @internal This function updates tail values.
+ */
+static __rte_always_inline void
+__rte_ring_rts_v2_update_tail(struct rte_ring_rts_headtail *ht,
+	uint32_t old_tail, uint32_t num, uint32_t mask)
+{
+	union __rte_ring_rts_poscnt ot, nt;
+
+	ot.val.cnt = nt.val.cnt = 0;
+	ot.val.pos = old_tail;
+	nt.val.pos = old_tail + num;
+
+	/*
+	 * If the tail matches the old head of this enqueue/dequeue, update
+	 * the tail with the new value and keep advancing it while the
+	 * corresponding cache entries are non-zero; otherwise record the
+	 * count of this enqueue/dequeue in the cache.
+	 */
+
+	if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+				(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+				rte_memory_order_release, rte_memory_order_acquire) == 0) {
+		ot.val.pos = old_tail;
+
+		/*
+		 * Write the num of the current enqueues/dequeues to the
+		 * corresponding cache.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			num, rte_memory_order_release);
+
+		/*
+		 * Another enqueue/dequeue may race to update the tail.
+		 * The winner keeps trying to advance the tail; the loser
+		 * exits after caching its count.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		/*
+		 * Set the corresponding cache to 0 for next use.
+		 */
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+
+	/*
+	 * Keep advancing the tail while the corresponding cache entry is
+	 * non-zero. Reaching this point means this enqueue/dequeue is
+	 * updating the tail on behalf of other enqueues/dequeues.
+	 */
+	while (1) {
+		num = rte_atomic_load_explicit(&ht->rts_cache[nt.val.pos & mask].num,
+			rte_memory_order_acquire);
+		if (num == 0)
+			break;
+
+		ot.val.pos = nt.val.pos;
+		nt.val.pos += num;
+
+		/*
+		 * Another enqueue/dequeue may race to update the tail.
+		 * The winner keeps trying to advance the tail; the loser
+		 * exits.
+		 */
+		if (rte_atomic_compare_exchange_strong_explicit(&ht->tail.raw,
+					(uint64_t *)(uintptr_t)&ot.raw, nt.raw,
+					rte_memory_order_release, rte_memory_order_acquire) == 0)
+			return;
+
+		rte_atomic_store_explicit(&ht->rts_cache[ot.val.pos & mask].num,
+			0, rte_memory_order_release);
+	}
+}
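The deferred tail update above can be modeled single-threaded to see the data flow: an operation that finishes out of order publishes its count in a per-slot cache, and whichever operation owns the current tail advances it, then sweeps forward consuming cached counts. A simplified sketch with the atomics and CAS retries stripped out (the `toy_*` names are hypothetical, not part of this patch):

```c
#include <assert.h>

/* Single-threaded model of the v2 deferred tail update: an operation that
 * finishes out of order records its count in a per-slot cache; the operation
 * that owns the current tail advances it and then sweeps forward, consuming
 * cached counts and clearing each slot as it goes. */
#define TOY_MASK 7

struct toy_rts {
	unsigned int tail;
	unsigned int cache[TOY_MASK + 1];	/* counts indexed by (pos & mask) */
};

static void
toy_update_tail(struct toy_rts *ht, unsigned int old_tail, unsigned int num)
{
	if (ht->tail != old_tail) {
		/* Not our turn: publish our count for the tail owner. */
		ht->cache[old_tail & TOY_MASK] = num;
		return;
	}
	ht->tail = old_tail + num;
	/* Sweep forward over counts published by out-of-order finishers. */
	while ((num = ht->cache[ht->tail & TOY_MASK]) != 0) {
		ht->cache[ht->tail & TOY_MASK] = 0;
		ht->tail += num;
	}
}
```

In the real code both the handoff and the sweep are guarded by CAS on `ht->tail.raw`, so a loser that caches its count may still win a later retry; the model only shows the bookkeeping, not the race.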
+
 /**
  * @internal This function waits till head/tail distance wouldn't
  * exceed pre-defined max value.
@@ -218,6 +304,47 @@ __rte_ring_do_rts_enqueue_elem(struct rte_ring *r, const void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Enqueue several objects on the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to add in the ring from the obj_table.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items to the ring
+ *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible to the ring
+ * @param free_space
+ *   returns the amount of space after the enqueue operation has finished
+ * @return
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_enqueue_elem(struct rte_ring *r, const void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *free_space)
+{
+	uint32_t free, head;
+
+	n = __rte_ring_rts_move_prod_head(r, n, behavior, &head, &free);
+
+	if (n != 0) {
+		__rte_ring_enqueue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_prod, head, n, r->mask);
+	}
+
+	if (free_space != NULL)
+		*free_space = free - n;
+	return n;
+}
+
 /**
  * @internal Dequeue several objects from the RTS ring.
  *
@@ -259,4 +386,45 @@ __rte_ring_do_rts_dequeue_elem(struct rte_ring *r, void *obj_table,
 	return n;
 }
 
+/**
+ * @internal Dequeue several objects from the RTS V2 ring.
+ *
+ * @param r
+ *   A pointer to the ring structure.
+ * @param obj_table
+ *   A pointer to a table of objects.
+ * @param esize
+ *   The size of ring element, in bytes. It must be a multiple of 4.
+ *   This must be the same value used while creating the ring. Otherwise
+ *   the results are undefined.
+ * @param n
+ *   The number of objects to pull from the ring.
+ * @param behavior
+ *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
+ *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from the ring
+ * @param available
+ *   returns the number of remaining ring entries after the dequeue has finished
+ * @return
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
+ */
+static __rte_always_inline unsigned int
+__rte_ring_do_rts_v2_dequeue_elem(struct rte_ring *r, void *obj_table,
+	uint32_t esize, uint32_t n, enum rte_ring_queue_behavior behavior,
+	uint32_t *available)
+{
+	uint32_t entries, head;
+
+	n = __rte_ring_rts_move_cons_head(r, n, behavior, &head, &entries);
+
+	if (n != 0) {
+		__rte_ring_dequeue_elems(r, head, obj_table, esize, n);
+		__rte_ring_rts_v2_update_tail(&r->rts_cons, head, n, r->mask);
+	}
+
+	if (available != NULL)
+		*available = entries - n;
+	return n;
+}
+
 #endif /* _RTE_RING_RTS_ELEM_PVT_H_ */
-- 
2.27.0



* [v1 15/16] crypto/virtio: add vhost backend to virtio_user
  @ 2024-12-24  7:37  1% ` Gowrishankar Muthukrishnan
  0 siblings, 0 replies; 200+ results
From: Gowrishankar Muthukrishnan @ 2024-12-24  7:37 UTC (permalink / raw)
  To: dev, Akhil Goyal, Maxime Coquelin, Chenbo Xia, Fan Zhang, Jay Zhou
  Cc: jerinj, anoobj, Rajesh Mudimadugula, Gowrishankar Muthukrishnan

Add vhost backend to virtio_user crypto.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
---
 drivers/crypto/virtio/meson.build             |   7 +
 drivers/crypto/virtio/virtio_cryptodev.c      |  57 +-
 drivers/crypto/virtio/virtio_cryptodev.h      |   3 +
 drivers/crypto/virtio/virtio_pci.h            |   7 +
 drivers/crypto/virtio/virtio_ring.h           |   6 -
 .../crypto/virtio/virtio_user/vhost_vdpa.c    | 310 +++++++
 .../virtio/virtio_user/virtio_user_dev.c      | 774 ++++++++++++++++++
 .../virtio/virtio_user/virtio_user_dev.h      |  88 ++
 drivers/crypto/virtio/virtio_user_cryptodev.c | 586 +++++++++++++
 9 files changed, 1810 insertions(+), 28 deletions(-)
 create mode 100644 drivers/crypto/virtio/virtio_user/vhost_vdpa.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.c
 create mode 100644 drivers/crypto/virtio/virtio_user/virtio_user_dev.h
 create mode 100644 drivers/crypto/virtio/virtio_user_cryptodev.c

diff --git a/drivers/crypto/virtio/meson.build b/drivers/crypto/virtio/meson.build
index a4954a094b..a178a61487 100644
--- a/drivers/crypto/virtio/meson.build
+++ b/drivers/crypto/virtio/meson.build
@@ -17,3 +17,10 @@ sources = files(
         'virtio_rxtx.c',
         'virtqueue.c',
 )
+
+if is_linux
+    sources += files('virtio_user_cryptodev.c',
+        'virtio_user/vhost_vdpa.c',
+        'virtio_user/virtio_user_dev.c')
+    deps += ['bus_vdev', 'common_virtio']
+endif
diff --git a/drivers/crypto/virtio/virtio_cryptodev.c b/drivers/crypto/virtio/virtio_cryptodev.c
index 159e96f7db..e9e65366fe 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.c
+++ b/drivers/crypto/virtio/virtio_cryptodev.c
@@ -544,24 +544,12 @@ virtio_crypto_init_device(struct rte_cryptodev *cryptodev,
 	return 0;
 }
 
-/*
- * This function is based on probe() function
- * It returns 0 on success.
- */
-static int
-crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
-		struct rte_cryptodev_pmd_init_params *init_params)
+int
+crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev)
 {
-	struct rte_cryptodev *cryptodev;
 	struct virtio_crypto_hw *hw;
 
-	PMD_INIT_FUNC_TRACE();
-
-	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
-					init_params);
-	if (cryptodev == NULL)
-		return -ENODEV;
-
 	cryptodev->driver_id = cryptodev_virtio_driver_id;
 	cryptodev->dev_ops = &virtio_crypto_dev_ops;
 
@@ -578,16 +566,41 @@ crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
 	hw->dev_id = cryptodev->data->dev_id;
 	hw->virtio_dev_capabilities = virtio_capabilities;
 
-	VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
-		cryptodev->data->dev_id, pci_dev->id.vendor_id,
-		pci_dev->id.device_id);
+	if (pci_dev) {
+		/* pci device init */
+		VIRTIO_CRYPTO_INIT_LOG_DBG("dev %d vendorID=0x%x deviceID=0x%x",
+			cryptodev->data->dev_id, pci_dev->id.vendor_id,
+			pci_dev->id.device_id);
 
-	/* pci device init */
-	if (vtpci_cryptodev_init(pci_dev, hw))
+		if (vtpci_cryptodev_init(pci_dev, hw))
+			return -1;
+	}
+
+	if (virtio_crypto_init_device(cryptodev, features) < 0)
 		return -1;
 
-	if (virtio_crypto_init_device(cryptodev,
-			VIRTIO_CRYPTO_PMD_GUEST_FEATURES) < 0)
+	return 0;
+}
+
+/*
+ * This function is based on probe() function
+ * It returns 0 on success.
+ */
+static int
+crypto_virtio_create(const char *name, struct rte_pci_device *pci_dev,
+		struct rte_cryptodev_pmd_init_params *init_params)
+{
+	struct rte_cryptodev *cryptodev;
+
+	PMD_INIT_FUNC_TRACE();
+
+	cryptodev = rte_cryptodev_pmd_create(name, &pci_dev->device,
+					init_params);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_CRYPTO_PMD_GUEST_FEATURES,
+			pci_dev) < 0)
 		return -1;
 
 	rte_cryptodev_pmd_probing_finish(cryptodev);
diff --git a/drivers/crypto/virtio/virtio_cryptodev.h b/drivers/crypto/virtio/virtio_cryptodev.h
index b4bdd9800b..95a1e09dca 100644
--- a/drivers/crypto/virtio/virtio_cryptodev.h
+++ b/drivers/crypto/virtio/virtio_cryptodev.h
@@ -74,4 +74,7 @@ uint16_t virtio_crypto_pkt_rx_burst(void *tx_queue,
 		struct rte_crypto_op **tx_pkts,
 		uint16_t nb_pkts);
 
+int crypto_virtio_dev_init(struct rte_cryptodev *cryptodev, uint64_t features,
+		struct rte_pci_device *pci_dev);
+
 #endif /* _VIRTIO_CRYPTODEV_H_ */
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
index 79945cb88e..c75777e005 100644
--- a/drivers/crypto/virtio/virtio_pci.h
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -20,6 +20,9 @@ struct virtqueue;
 #define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
 #define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
 
+/* VirtIO device IDs. */
+#define VIRTIO_ID_CRYPTO  20
+
 /* VirtIO ABI version, this must match exactly. */
 #define VIRTIO_PCI_ABI_VERSION 0
 
@@ -56,8 +59,12 @@ struct virtqueue;
 #define VIRTIO_CONFIG_STATUS_DRIVER    0x02
 #define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
 #define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_DEV_NEED_RESET	0x40
 #define VIRTIO_CONFIG_STATUS_FAILED    0x80
 
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_VRING_ALIGN 4096
+
 /*
  * Each virtqueue indirect descriptor list must be physically contiguous.
  * To allow us to malloc(9) each list individually, limit the number
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
index c74d1172b7..4b418f6e60 100644
--- a/drivers/crypto/virtio/virtio_ring.h
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -181,12 +181,6 @@ vring_init_packed(struct vring_packed *vr, uint8_t *p, rte_iova_t iova,
 				sizeof(struct vring_packed_desc_event)), align);
 }
 
-static inline void
-vring_init(struct vring *vr, unsigned int num, uint8_t *p, unsigned long align)
-{
-	vring_init_split(vr, p, 0, align, num);
-}
-
 /*
  * The following is used with VIRTIO_RING_F_EVENT_IDX.
  * Assuming a given event_idx value from the other size, if we have
diff --git a/drivers/crypto/virtio/virtio_user/vhost_vdpa.c b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..3fedade775
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,310 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell
+ */
+
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <rte_memory.h>
+
+#include "virtio_user/vhost.h"
+
+#include "virtio_user_dev.h"
+#include "../virtio_pci.h"
+
+struct vhost_vdpa_data {
+	int vhostfd;
+	uint64_t protocol_features;
+};
+
+#define VHOST_VDPA_SUPPORTED_BACKEND_FEATURES		\
+	(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2	|	\
+	1ULL << VHOST_BACKEND_F_IOTLB_BATCH)
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct vhost_vdpa_config)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct vhost_vring_state)
+#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
+#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO      0x1
+#define VHOST_ACCESS_WO      0x2
+#define VHOST_ACCESS_RW      0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS           1
+#define VHOST_IOTLB_UPDATE         2
+#define VHOST_IOTLB_INVALIDATE     3
+#define VHOST_IOTLB_ACCESS_FAIL    4
+#define VHOST_IOTLB_BATCH_BEGIN    5
+#define VHOST_IOTLB_BATCH_END      6
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_vdpa_config {
+	uint32_t off;
+	uint32_t len;
+	uint8_t buf[];
+};
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_ioctl(int fd, uint64_t request, void *arg)
+{
+	int ret;
+
+	ret = ioctl(fd, request, arg);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Vhost-vDPA ioctl %"PRIu64" failed (%s)",
+				request, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_get_protocol_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_BACKEND_FEATURES, features);
+}
+
+static int
+vhost_vdpa_set_protocol_features(struct virtio_user_dev *dev, uint64_t features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_SET_BACKEND_FEATURES, &features);
+}
+
+static int
+vhost_vdpa_get_features(struct virtio_user_dev *dev, uint64_t *features)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int ret;
+
+	ret = vhost_vdpa_ioctl(data->vhostfd, VHOST_GET_FEATURES, features);
+	if (ret) {
+		PMD_DRV_LOG(ERR, "Failed to get features");
+		return -1;
+	}
+
+	/* Negotiated vDPA backend features */
+	ret = vhost_vdpa_get_protocol_features(dev, &data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to get backend features");
+		return -1;
+	}
+
+	data->protocol_features &= VHOST_VDPA_SUPPORTED_BACKEND_FEATURES;
+
+	ret = vhost_vdpa_set_protocol_features(dev, data->protocol_features);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Failed to set backend features");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_set_vring_enable(struct virtio_user_dev *dev, struct vhost_vring_state *state)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+
+	return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_VRING_ENABLE, state);
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) if fail to set up;
+ *   - (>=0) if successful.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data;
+	uint32_t did = (uint32_t)-1;
+
+	data = malloc(sizeof(*data));
+	if (!data) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate backend data", dev->path);
+		return -1;
+	}
+
+	data->vhostfd = open(dev->path, O_RDWR);
+	if (data->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s",
+				dev->path, strerror(errno));
+		free(data);
+		return -1;
+	}
+
+	if (ioctl(data->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_CRYPTO) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u", did);
+		close(data->vhostfd);
+		free(data);
+		return -1;
+	}
+
+	dev->backend_data = data;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_cvq_enable(struct virtio_user_dev *dev, int enable)
+{
+	struct vhost_vring_state state = {
+		.index = dev->max_queue_pairs,
+		.num   = enable,
+	};
+
+	return vhost_vdpa_set_vring_enable(dev, &state);
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+				uint16_t pair_idx,
+				int enable)
+{
+	struct vhost_vring_state state = {
+		.index = pair_idx,
+		.num   = enable,
+	};
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	if (vhost_vdpa_set_vring_enable(dev, &state))
+		return -1;
+
+	dev->qp_enabled[pair_idx] = enable;
+	return 0;
+}
+
+static int
+vhost_vdpa_update_link_state(struct virtio_user_dev *dev)
+{
+	dev->crypto_status = VIRTIO_CRYPTO_S_HW_READY;
+	return 0;
+}
+
+static int
+vhost_vdpa_get_nr_vrings(struct virtio_user_dev *dev)
+{
+	int nr_vrings = dev->max_queue_pairs;
+
+	return nr_vrings;
+}
+
+static int
+vhost_vdpa_unmap_notification_area(struct virtio_user_dev *dev)
+{
+	int i, nr_vrings;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	for (i = 0; i < nr_vrings; i++) {
+		if (dev->notify_area[i])
+			munmap(dev->notify_area[i], getpagesize());
+	}
+	free(dev->notify_area);
+	dev->notify_area = NULL;
+
+	return 0;
+}
+
+static int
+vhost_vdpa_map_notification_area(struct virtio_user_dev *dev)
+{
+	struct vhost_vdpa_data *data = dev->backend_data;
+	int nr_vrings, i, page_size = getpagesize();
+	uint16_t **notify_area;
+
+	nr_vrings = vhost_vdpa_get_nr_vrings(dev);
+
+	/* CQ is another vring */
+	nr_vrings++;
+
+	notify_area = malloc(nr_vrings * sizeof(*notify_area));
+	if (!notify_area) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to allocate notify area array", dev->path);
+		return -1;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		notify_area[i] = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED | MAP_FILE,
+					data->vhostfd, i * page_size);
+		if (notify_area[i] == MAP_FAILED) {
+			PMD_DRV_LOG(ERR, "(%s) Map failed for notify address of queue %d",
+					dev->path, i);
+			i--;
+			goto map_err;
+		}
+	}
+	dev->notify_area = notify_area;
+
+	return 0;
+
+map_err:
+	for (; i >= 0; i--)
+		munmap(notify_area[i], page_size);
+	free(notify_area);
+
+	return -1;
+}
+
+struct virtio_user_backend_ops virtio_crypto_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.get_features = vhost_vdpa_get_features,
+	.cvq_enable = vhost_vdpa_cvq_enable,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.update_link_state = vhost_vdpa_update_link_state,
+	.map_notification_area = vhost_vdpa_map_notification_area,
+	.unmap_notification_area = vhost_vdpa_unmap_notification_area,
+};
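The `virtio_user_backend_ops` table above lets the generic virtio_user layer dispatch to vDPA (or a future backend) through function pointers. A minimal standalone sketch of that dispatch pattern (the `toy_*` names are hypothetical, not the driver's actual types):

```c
#include <assert.h>

/* Minimal sketch of the backend-ops dispatch pattern: the generic layer
 * calls through a vtable so vDPA (or future backends) plug in uniformly. */
struct toy_dev;

struct toy_backend_ops {
	int (*setup)(struct toy_dev *dev);
	int (*enable_qp)(struct toy_dev *dev, unsigned int qp, int enable);
};

struct toy_dev {
	const struct toy_backend_ops *ops;
	int qp_enabled[4];
};

static int
toy_vdpa_setup(struct toy_dev *dev)
{
	(void)dev;		/* real backend would open the vhost fd here */
	return 0;
}

static int
toy_vdpa_enable_qp(struct toy_dev *dev, unsigned int qp, int enable)
{
	if (dev->qp_enabled[qp] == enable)	/* idempotent, as in the driver */
		return 0;
	dev->qp_enabled[qp] = enable;
	return 0;
}

static const struct toy_backend_ops toy_vdpa_ops = {
	.setup = toy_vdpa_setup,
	.enable_qp = toy_vdpa_enable_qp,
};

/* Generic layer: knows only the ops interface, not the backend. */
static int
toy_start(struct toy_dev *dev)
{
	if (dev->ops->setup(dev) < 0)
		return -1;
	for (unsigned int i = 0; i < 4; i++)
		if (dev->ops->enable_qp(dev, i, 1) < 0)
			return -1;
	return 0;
}
```

Keeping the generic start/stop paths backend-agnostic is what makes the Linux-only vDPA files a drop-in addition in the meson build.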
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.c b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
new file mode 100644
index 0000000000..fed740073d
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.c
@@ -0,0 +1,774 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <pthread.h>
+
+#include <rte_alarm.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_io.h>
+
+#include "virtio_user/vhost.h"
+#include "virtio_logs.h"
+
+#include "cryptodev_pmd.h"
+#include "virtio_crypto.h"
+#include "virtio_cvq.h"
+#include "virtio_user_dev.h"
+#include "virtqueue.h"
+
+#define VIRTIO_USER_MEM_EVENT_CLB_NAME "virtio_user_mem_event_clb"
+
+const char * const crypto_virtio_user_backend_strings[] = {
+	[VIRTIO_USER_BACKEND_UNKNOWN] = "VIRTIO_USER_BACKEND_UNKNOWN",
+	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
+};
+
+static int
+virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	if (dev->kickfds[queue_sel] >= 0) {
+		close(dev->kickfds[queue_sel]);
+		dev->kickfds[queue_sel] = -1;
+	}
+
+	if (dev->callfds[queue_sel] >= 0) {
+		close(dev->callfds[queue_sel]);
+		dev->callfds[queue_sel] = -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_init_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* An invalid fd would do here, but some backends use kickfd and
+	 * callfd as criteria to judge if the device is alive, so use real
+	 * eventfds.
+	 */
+	dev->callfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->callfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup callfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+	dev->kickfds[queue_sel] = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (dev->kickfds[queue_sel] < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to setup kickfd for queue %u: %s",
+				dev->path, queue_sel, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_destroy_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	struct vhost_vring_state state;
+	int ret;
+
+	state.index = queue_sel;
+	ret = dev->ops->get_vring_base(dev, &state);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to destroy queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	/* Of all per virtqueue MSGs, make sure VHOST_SET_VRING_CALL come
+	 * firstly because vhost depends on this msg to allocate virtqueue
+	 * pair.
+	 */
+	struct vhost_vring_file file;
+	int ret;
+
+	file.index = queue_sel;
+	file.fd = dev->callfds[queue_sel];
+	ret = dev->ops->set_vring_call(dev, &file);
+	if (ret < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
+{
+	int ret;
+	struct vhost_vring_file file;
+	struct vhost_vring_state state;
+	struct vring *vring = &dev->vrings.split[queue_sel];
+	struct vring_packed *pq_vring = &dev->vrings.packed[queue_sel];
+	uint64_t desc_addr, avail_addr, used_addr;
+	struct vhost_vring_addr addr = {
+		.index = queue_sel,
+		.log_guest_addr = 0,
+		.flags = 0, /* disable log */
+	};
+
+	if (queue_sel == dev->max_queue_pairs) {
+		if (!dev->scvq) {
+			PMD_INIT_LOG(ERR, "(%s) Shadow control queue expected but missing",
+					dev->path);
+			goto err;
+		}
+
+		/* Use shadow control queue information */
+		vring = &dev->scvq->vq_split.ring;
+		pq_vring = &dev->scvq->vq_packed.ring;
+	}
+
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+		desc_addr = pq_vring->desc_iova;
+		avail_addr = desc_addr + pq_vring->num * sizeof(struct vring_packed_desc);
+		used_addr =  RTE_ALIGN_CEIL(avail_addr + sizeof(struct vring_packed_desc_event),
+						VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	} else {
+		desc_addr = vring->desc_iova;
+		avail_addr = desc_addr + vring->num * sizeof(struct vring_desc);
+		used_addr = RTE_ALIGN_CEIL((uintptr_t)(&vring->avail->ring[vring->num]),
+					VIRTIO_VRING_ALIGN);
+
+		addr.desc_user_addr = desc_addr;
+		addr.avail_user_addr = avail_addr;
+		addr.used_user_addr = used_addr;
+	}
+
+	state.index = queue_sel;
+	state.num = vring->num;
+	ret = dev->ops->set_vring_num(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	state.index = queue_sel;
+	state.num = 0; /* no reservation */
+	if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+		state.num |= (1 << 15);
+	ret = dev->ops->set_vring_base(dev, &state);
+	if (ret < 0)
+		goto err;
+
+	ret = dev->ops->set_vring_addr(dev, &addr);
+	if (ret < 0)
+		goto err;
+
+	/* Of all per virtqueue MSGs, make sure VHOST_USER_SET_VRING_KICK comes
+	 * lastly because vhost depends on this msg to judge if
+	 * virtio is ready.
+	 */
+	file.index = queue_sel;
+	file.fd = dev->kickfds[queue_sel];
+	ret = dev->ops->set_vring_kick(dev, &file);
+	if (ret < 0)
+		goto err;
+
+	return 0;
+err:
+	PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
+
+	return -1;
+}
+
+static int
+virtio_user_foreach_queue(struct virtio_user_dev *dev,
+			int (*fn)(struct virtio_user_dev *, uint32_t))
+{
+	uint32_t i, nr_vq;
+
+	nr_vq = dev->max_queue_pairs;
+
+	for (i = 0; i < nr_vq; i++)
+		if (fn(dev, i) < 0)
+			return -1;
+
+	return 0;
+}
+
+int
+crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev)
+{
+	uint64_t features;
+	int ret = -1;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 0: tell vhost to create queues */
+	if (virtio_user_foreach_queue(dev, virtio_user_create_queue) < 0)
+		goto error;
+
+	features = dev->features;
+
+	ret = dev->ops->set_features(dev, features);
+	if (ret < 0)
+		goto error;
+	PMD_DRV_LOG(INFO, "(%s) set features: 0x%" PRIx64, dev->path, features);
+error:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return ret;
+}
+
+int
+crypto_virtio_user_start_device(struct virtio_user_dev *dev)
+{
+	int ret;
+
+	/*
+	 * XXX workaround!
+	 *
+	 * We need to make sure that the locks will be
+	 * taken in the correct order to avoid deadlocks.
+	 *
+	 * Before releasing this lock, this thread should
+	 * not trigger any memory hotplug events.
+	 *
+	 * This is a temporary workaround, and should be
+	 * replaced when we get proper support from the
+	 * memory subsystem in the future.
+	 */
+	rte_mcfg_mem_read_lock();
+	pthread_mutex_lock(&dev->mutex);
+
+	/* Step 2: share memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto error;
+
+	/* Step 3: kick queues */
+	ret = virtio_user_foreach_queue(dev, virtio_user_kick_queue);
+	if (ret < 0)
+		goto error;
+
+	ret = virtio_user_kick_queue(dev, dev->max_queue_pairs);
+	if (ret < 0)
+		goto error;
+
+	/* Step 4: enable queues */
+	for (int i = 0; i < dev->max_queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto error;
+	}
+
+	dev->started = true;
+
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	return 0;
+error:
+	pthread_mutex_unlock(&dev->mutex);
+	rte_mcfg_mem_read_unlock();
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
+
+	return -1;
+}
+
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev)
+{
+	uint32_t i;
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	if (!dev->started)
+		goto out;
+
+	for (i = 0; i < dev->max_queue_pairs; ++i) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	if (dev->scvq) {
+		ret = dev->ops->cvq_enable(dev, 0);
+		if (ret < 0)
+			goto err;
+	}
+
+	/* Stop the backend. */
+	if (virtio_user_foreach_queue(dev, virtio_user_destroy_queue) < 0)
+		goto err;
+
+	dev->started = false;
+
+out:
+	pthread_mutex_unlock(&dev->mutex);
+
+	return 0;
+err:
+	pthread_mutex_unlock(&dev->mutex);
+
+	PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
+
+	return -1;
+}
+
+static int
+virtio_user_dev_init_max_queue_pairs(struct virtio_user_dev *dev, uint32_t user_max_qp)
+{
+	int ret;
+
+	if (!dev->ops->get_config) {
+		dev->max_queue_pairs = user_max_qp;
+		return 0;
+	}
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&dev->max_queue_pairs,
+			offsetof(struct virtio_crypto_config, max_dataqueues),
+			sizeof(uint16_t));
+	if (ret) {
+		/*
+		 * We need to know the max queue pair from the device so that
+		 * the control queue gets the right index.
+		 */
+		dev->max_queue_pairs = 1;
+		PMD_DRV_LOG(ERR, "(%s) Failed to get max queue pairs from device", dev->path);
+
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_dev_init_cipher_services(struct virtio_user_dev *dev)
+{
+	struct virtio_crypto_config config;
+	int ret;
+
+	dev->crypto_services = RTE_BIT32(VIRTIO_CRYPTO_SERVICE_CIPHER);
+	dev->cipher_algo = 0;
+	dev->auth_algo = 0;
+	dev->akcipher_algo = 0;
+
+	if (!dev->ops->get_config)
+		return 0;
+
+	ret = dev->ops->get_config(dev, (uint8_t *)&config, 0, sizeof(config));
+	if (ret) {
+		PMD_DRV_LOG(ERR, "(%s) Failed to get crypto config from device", dev->path);
+		return ret;
+	}
+
+	dev->crypto_services = config.crypto_services;
+	dev->cipher_algo = ((uint64_t)config.cipher_algo_h << 32) |
+						config.cipher_algo_l;
+	dev->hash_algo = config.hash_algo;
+	dev->auth_algo = ((uint64_t)config.mac_algo_h << 32) |
+						config.mac_algo_l;
+	dev->aead_algo = config.aead_algo;
+	dev->akcipher_algo = config.akcipher_algo;
+	return 0;
+}
+
+static int
+virtio_user_dev_init_notify(struct virtio_user_dev *dev)
+{
+
+	if (virtio_user_foreach_queue(dev, virtio_user_init_notify_queue) < 0)
+		goto err;
+
+	if (dev->device_features & (1ULL << VIRTIO_F_NOTIFICATION_DATA))
+		if (dev->ops->map_notification_area &&
+				dev->ops->map_notification_area(dev))
+			goto err;
+
+	return 0;
+err:
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	return -1;
+}
+
+static void
+virtio_user_dev_uninit_notify(struct virtio_user_dev *dev)
+{
+	virtio_user_foreach_queue(dev, virtio_user_uninit_notify_queue);
+
+	if (dev->ops->unmap_notification_area && dev->notify_area)
+		dev->ops->unmap_notification_area(dev);
+}
+
+static void
+virtio_user_mem_event_cb(enum rte_mem_event type __rte_unused,
+			const void *addr,
+			size_t len __rte_unused,
+			void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+	struct rte_memseg_list *msl;
+	uint16_t i;
+	int ret = 0;
+
+	/* ignore externally allocated memory */
+	msl = rte_mem_virt2memseg_list(addr);
+	if (msl->external)
+		return;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	if (dev->started == false)
+		goto exit;
+
+	/* Step 1: pause the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 0);
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* Step 2: update memory regions */
+	ret = dev->ops->set_memory_table(dev);
+	if (ret < 0)
+		goto exit;
+
+	/* Step 3: resume the active queues */
+	for (i = 0; i < dev->queue_pairs; i++) {
+		ret = dev->ops->enable_qp(dev, i, 1);
+		if (ret < 0)
+			goto exit;
+	}
+
+exit:
+	pthread_mutex_unlock(&dev->mutex);
+
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "(%s) Failed to update memory table", dev->path);
+}
+
+static int
+virtio_user_dev_setup(struct virtio_user_dev *dev)
+{
+	if (dev->is_server) {
+		if (dev->backend_type != VIRTIO_USER_BACKEND_VHOST_USER) {
+			PMD_DRV_LOG(ERR, "Server mode only supports vhost-user!");
+			return -1;
+		}
+	}
+
+	switch (dev->backend_type) {
+	case VIRTIO_USER_BACKEND_VHOST_VDPA:
+		dev->ops = &virtio_ops_vdpa;
+		dev->ops->setup = virtio_crypto_ops_vdpa.setup;
+		dev->ops->get_features = virtio_crypto_ops_vdpa.get_features;
+		dev->ops->cvq_enable = virtio_crypto_ops_vdpa.cvq_enable;
+		dev->ops->enable_qp = virtio_crypto_ops_vdpa.enable_qp;
+		dev->ops->update_link_state = virtio_crypto_ops_vdpa.update_link_state;
+		dev->ops->map_notification_area = virtio_crypto_ops_vdpa.map_notification_area;
+		dev->ops->unmap_notification_area = virtio_crypto_ops_vdpa.unmap_notification_area;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "(%s) Unknown backend type", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+virtio_user_alloc_vrings(struct virtio_user_dev *dev)
+{
+	int i, size, nr_vrings;
+	bool packed_ring = !!(dev->device_features & (1ull << VIRTIO_F_RING_PACKED));
+
+	nr_vrings = dev->max_queue_pairs + 1;
+
+	dev->callfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->callfds), 0);
+	if (!dev->callfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc callfds", dev->path);
+		return -1;
+	}
+
+	dev->kickfds = rte_zmalloc("virtio_user_dev", nr_vrings * sizeof(*dev->kickfds), 0);
+	if (!dev->kickfds) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc kickfds", dev->path);
+		goto free_callfds;
+	}
+
+	for (i = 0; i < nr_vrings; i++) {
+		dev->callfds[i] = -1;
+		dev->kickfds[i] = -1;
+	}
+
+	if (packed_ring)
+		size = sizeof(*dev->vrings.packed);
+	else
+		size = sizeof(*dev->vrings.split);
+	dev->vrings.ptr = rte_zmalloc("virtio_user_dev", nr_vrings * size, 0);
+	if (!dev->vrings.ptr) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc vrings metadata", dev->path);
+		goto free_kickfds;
+	}
+
+	if (packed_ring) {
+		dev->packed_queues = rte_zmalloc("virtio_user_dev",
+				nr_vrings * sizeof(*dev->packed_queues), 0);
+		if (!dev->packed_queues) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to alloc packed queues metadata",
+					dev->path);
+			goto free_vrings;
+		}
+	}
+
+	dev->qp_enabled = rte_zmalloc("virtio_user_dev",
+			nr_vrings * sizeof(*dev->qp_enabled), 0);
+	if (!dev->qp_enabled) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to alloc QP enable states", dev->path);
+		goto free_packed_queues;
+	}
+
+	return 0;
+
+free_packed_queues:
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+free_vrings:
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+free_kickfds:
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+free_callfds:
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+
+	return -1;
+}
+
+static void
+virtio_user_free_vrings(struct virtio_user_dev *dev)
+{
+	rte_free(dev->qp_enabled);
+	dev->qp_enabled = NULL;
+	rte_free(dev->packed_queues);
+	dev->packed_queues = NULL;
+	rte_free(dev->vrings.ptr);
+	dev->vrings.ptr = NULL;
+	rte_free(dev->kickfds);
+	dev->kickfds = NULL;
+	rte_free(dev->callfds);
+	dev->callfds = NULL;
+}
+
+#define VIRTIO_USER_SUPPORTED_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_HASH       | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+int
+crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server)
+{
+	uint64_t backend_features;
+
+	pthread_mutex_init(&dev->mutex, NULL);
+	strlcpy(dev->path, path, PATH_MAX);
+
+	dev->started = 0;
+	dev->queue_pairs = 1; /* mq disabled by default */
+	dev->max_queue_pairs = queues; /* initialize to user requested value for kernel backend */
+	dev->queue_size = queue_size;
+	dev->is_server = server;
+	dev->frontend_features = 0;
+	dev->unsupported_features = 0;
+	dev->backend_type = VIRTIO_USER_BACKEND_VHOST_VDPA;
+	dev->hw.modern = 1;
+
+	if (virtio_user_dev_setup(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) backend set up fails", dev->path);
+		return -1;
+	}
+
+	if (dev->ops->set_owner(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend owner", dev->path);
+		goto destroy;
+	}
+
+	if (dev->ops->get_backend_features(&backend_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend features", dev->path);
+		goto destroy;
+	}
+
+	dev->unsupported_features = ~(VIRTIO_USER_SUPPORTED_FEATURES | backend_features);
+
+	if (dev->ops->get_features(dev, &dev->device_features) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get device features", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_max_queue_pairs(dev, queues)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get max queue pairs", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_cipher_services(dev)) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get cipher services", dev->path);
+		goto destroy;
+	}
+
+	dev->frontend_features &= ~dev->unsupported_features;
+	dev->device_features &= ~dev->unsupported_features;
+
+	if (virtio_user_alloc_vrings(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to allocate vring metadata", dev->path);
+		goto destroy;
+	}
+
+	if (virtio_user_dev_init_notify(dev) < 0) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to init notifiers", dev->path);
+		goto free_vrings;
+	}
+
+	if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
+				virtio_user_mem_event_cb, dev)) {
+		if (rte_errno != ENOTSUP) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to register mem event callback",
+					dev->path);
+			goto notify_uninit;
+		}
+	}
+
+	return 0;
+
+notify_uninit:
+	virtio_user_dev_uninit_notify(dev);
+free_vrings:
+	virtio_user_free_vrings(dev);
+destroy:
+	dev->ops->destroy(dev);
+
+	return -1;
+}
+
+void
+crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev)
+{
+	crypto_virtio_user_stop_device(dev);
+
+	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
+
+	virtio_user_dev_uninit_notify(dev);
+
+	virtio_user_free_vrings(dev);
+
+	if (dev->is_server)
+		unlink(dev->path);
+
+	dev->ops->destroy(dev);
+}
+
+#define CVQ_MAX_DATA_DESCS 32
+
+static inline void *
+virtio_user_iova2virt(struct virtio_user_dev *dev __rte_unused, rte_iova_t iova)
+{
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return (void *)(uintptr_t)iova;
+	else
+		return rte_mem_iova2virt(iova);
+}
+
+static inline int
+desc_is_avail(struct vring_packed_desc *desc, bool wrap_counter)
+{
+	uint16_t flags = rte_atomic_load_explicit(&desc->flags, rte_memory_order_acquire);
+
+	return wrap_counter == !!(flags & VRING_PACKED_DESC_F_AVAIL) &&
+		wrap_counter != !!(flags & VRING_PACKED_DESC_F_USED);
+}
+
+int
+crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status)
+{
+	int ret;
+
+	pthread_mutex_lock(&dev->mutex);
+	dev->status = status;
+	ret = dev->ops->set_status(dev, status);
+	if (ret && ret != -ENOTSUP)
+		PMD_INIT_LOG(ERR, "(%s) Failed to set backend status", dev->path);
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev)
+{
+	int ret;
+	uint8_t status;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	ret = dev->ops->get_status(dev, &status);
+	if (!ret) {
+		dev->status = status;
+		PMD_INIT_LOG(DEBUG, "Updated Device Status(0x%08x):"
+			"\t-RESET: %u "
+			"\t-ACKNOWLEDGE: %u "
+			"\t-DRIVER: %u "
+			"\t-DRIVER_OK: %u "
+			"\t-FEATURES_OK: %u "
+			"\t-DEVICE_NEED_RESET: %u "
+			"\t-FAILED: %u",
+			dev->status,
+			(dev->status == VIRTIO_CONFIG_STATUS_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_ACK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FEATURES_OK),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_DEV_NEED_RESET),
+			!!(dev->status & VIRTIO_CONFIG_STATUS_FAILED));
+	} else if (ret != -ENOTSUP) {
+		PMD_INIT_LOG(ERR, "(%s) Failed to get backend status", dev->path);
+	}
+
+	pthread_mutex_unlock(&dev->mutex);
+	return ret;
+}
+
+int
+crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev)
+{
+	if (dev->ops->update_link_state)
+		return dev->ops->update_link_state(dev);
+
+	return 0;
+}
diff --git a/drivers/crypto/virtio/virtio_user/virtio_user_dev.h b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
new file mode 100644
index 0000000000..2a0052b3ca
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user/virtio_user_dev.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell.
+ */
+
+#ifndef _VIRTIO_USER_DEV_H
+#define _VIRTIO_USER_DEV_H
+
+#include <limits.h>
+#include <stdbool.h>
+
+#include "../virtio_pci.h"
+#include "../virtio_ring.h"
+
+extern struct virtio_user_backend_ops virtio_crypto_ops_vdpa;
+
+enum virtio_user_backend_type {
+	VIRTIO_USER_BACKEND_UNKNOWN,
+	VIRTIO_USER_BACKEND_VHOST_USER,
+	VIRTIO_USER_BACKEND_VHOST_VDPA,
+};
+
+struct virtio_user_queue {
+	uint16_t used_idx;
+	bool avail_wrap_counter;
+	bool used_wrap_counter;
+};
+
+struct virtio_user_dev {
+	union {
+		struct virtio_crypto_hw hw;
+		uint8_t dummy[256];
+	};
+
+	void		*backend_data;
+	uint16_t	**notify_area;
+	char		path[PATH_MAX];
+	bool		hw_cvq;
+	uint16_t	max_queue_pairs;
+	uint64_t	device_features; /* supported features by device */
+	bool		*qp_enabled;
+
+	enum virtio_user_backend_type backend_type;
+	bool		is_server;  /* server or client mode */
+
+	int		*callfds;
+	int		*kickfds;
+	uint16_t	queue_pairs;
+	uint32_t	queue_size;
+	uint64_t	features; /* features negotiated with the driver,
+				   * to be synced with the device
+				   */
+	uint64_t	frontend_features; /* enabled frontend features */
+	uint64_t	unsupported_features; /* unsupported features mask */
+	uint8_t		status;
+	uint32_t	crypto_status;
+	uint32_t	crypto_services;
+	uint64_t	cipher_algo;
+	uint32_t	hash_algo;
+	uint64_t	auth_algo;
+	uint32_t	aead_algo;
+	uint32_t	akcipher_algo;
+
+	union {
+		void			*ptr;
+		struct vring		*split;
+		struct vring_packed	*packed;
+	} vrings;
+
+	struct virtio_user_queue *packed_queues;
+
+	struct virtio_user_backend_ops *ops;
+	pthread_mutex_t	mutex;
+	bool		started;
+
+	struct virtqueue	*scvq;
+};
+
+int crypto_virtio_user_dev_set_features(struct virtio_user_dev *dev);
+int crypto_virtio_user_start_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_stop_device(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_init(struct virtio_user_dev *dev, char *path, uint16_t queues,
+			int queue_size, int server);
+void crypto_virtio_user_dev_uninit(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
+int crypto_virtio_user_dev_update_status(struct virtio_user_dev *dev);
+int crypto_virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+extern const char * const crypto_virtio_user_backend_strings[];
+#endif /* _VIRTIO_USER_DEV_H */
diff --git a/drivers/crypto/virtio/virtio_user_cryptodev.c b/drivers/crypto/virtio/virtio_user_cryptodev.c
new file mode 100644
index 0000000000..f5725f0a59
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_user_cryptodev.c
@@ -0,0 +1,586 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Marvell
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <bus_vdev_driver.h>
+#include <rte_cryptodev.h>
+#include <cryptodev_pmd.h>
+#include <rte_alarm.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+
+#include "virtio_user/virtio_user_dev.h"
+#include "virtio_user/vhost.h"
+#include "virtio_cryptodev.h"
+#include "virtio_logs.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
+
+static void
+virtio_user_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		     void *dst, int length __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (offset == offsetof(struct virtio_crypto_config, status)) {
+		crypto_virtio_user_dev_update_link_state(dev);
+		*(uint32_t *)dst = dev->crypto_status;
+	} else if (offset == offsetof(struct virtio_crypto_config, max_dataqueues))
+		*(uint16_t *)dst = dev->max_queue_pairs;
+	else if (offset == offsetof(struct virtio_crypto_config, crypto_services))
+		*(uint32_t *)dst = dev->crypto_services;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_l))
+		*(uint32_t *)dst = dev->cipher_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, cipher_algo_h))
+		*(uint32_t *)dst = dev->cipher_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, hash_algo))
+		*(uint32_t *)dst = dev->hash_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_l))
+		*(uint32_t *)dst = dev->auth_algo & 0xFFFFFFFF;
+	else if (offset == offsetof(struct virtio_crypto_config, mac_algo_h))
+		*(uint32_t *)dst = dev->auth_algo >> 32;
+	else if (offset == offsetof(struct virtio_crypto_config, aead_algo))
+		*(uint32_t *)dst = dev->aead_algo;
+	else if (offset == offsetof(struct virtio_crypto_config, akcipher_algo))
+		*(uint32_t *)dst = dev->akcipher_algo;
+}
+
+static void
+virtio_user_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		      const void *src, int length)
+{
+	RTE_SET_USED(hw);
+	RTE_SET_USED(src);
+
+	PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d",
+		    offset, length);
+}
+
+static void
+virtio_user_reset(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
+		crypto_virtio_user_stop_device(dev);
+}
+
+static void
+virtio_user_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint8_t old_status = dev->status;
+
+	if (status & VIRTIO_CONFIG_STATUS_FEATURES_OK &&
+			~old_status & VIRTIO_CONFIG_STATUS_FEATURES_OK) {
+		crypto_virtio_user_dev_set_features(dev);
+		/* Feature negotiation should be only done in probe time.
+		 * So we skip any more request here.
+		 */
+		dev->status |= VIRTIO_CONFIG_STATUS_FEATURES_OK;
+	}
+
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK) {
+		if (crypto_virtio_user_start_device(dev)) {
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	} else if (status == VIRTIO_CONFIG_STATUS_RESET) {
+		virtio_user_reset(hw);
+	}
+
+	crypto_virtio_user_dev_set_status(dev, status);
+	if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK && dev->scvq) {
+		if (dev->ops->cvq_enable(dev, 1) < 0) {
+			PMD_INIT_LOG(ERR, "(%s) Failed to start ctrlq", dev->path);
+			crypto_virtio_user_dev_update_status(dev);
+			return;
+		}
+	}
+}
+
+static uint8_t
+virtio_user_get_status(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	crypto_virtio_user_dev_update_status(dev);
+
+	return dev->status;
+}
+
+#define VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES   \
+	(1ULL << VIRTIO_CRYPTO_SERVICE_CIPHER     | \
+	 1ULL << VIRTIO_CRYPTO_SERVICE_AKCIPHER   | \
+	 1ULL << VIRTIO_F_VERSION_1               | \
+	 1ULL << VIRTIO_F_IN_ORDER                | \
+	 1ULL << VIRTIO_F_RING_PACKED             | \
+	 1ULL << VIRTIO_F_NOTIFICATION_DATA       | \
+	 1ULL << VIRTIO_RING_F_INDIRECT_DESC      | \
+	 1ULL << VIRTIO_F_ORDER_PLATFORM)
+
+static uint64_t
+virtio_user_get_features(struct virtio_crypto_hw *hw)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* unmask feature bits defined in vhost user protocol */
+	return (dev->device_features | dev->frontend_features) &
+		VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES;
+}
+
+static void
+virtio_user_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	dev->features = features & (dev->device_features | dev->frontend_features);
+}
+
+static uint8_t
+virtio_user_get_isr(struct virtio_crypto_hw *hw __rte_unused)
+{
+	/* rxq interrupts and config interrupt are separated in virtio-user,
+	 * here we only report config change.
+	 */
+	return VIRTIO_PCI_CAP_ISR_CFG;
+}
+
+static uint16_t
+virtio_user_set_config_irq(struct virtio_crypto_hw *hw __rte_unused,
+		    uint16_t vec __rte_unused)
+{
+	return 0;
+}
+
+static uint16_t
+virtio_user_set_queue_irq(struct virtio_crypto_hw *hw __rte_unused,
+			  struct virtqueue *vq __rte_unused,
+			  uint16_t vec)
+{
+	/* pretend we have done that */
+	return vec;
+}
+
+/* This function returns the queue size, i.e. the number of descriptors, of a
+ * specified queue. It differs from VHOST_USER_GET_QUEUE_NUM, which is used to
+ * get the max number of supported queues.
+ */
+static uint16_t
+virtio_user_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id __rte_unused)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	/* Currently, each queue has same queue size */
+	return dev->queue_size;
+}
+
+static void
+virtio_user_setup_queue_packed(struct virtqueue *vq,
+			       struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	struct vring_packed *vring;
+	uint64_t desc_addr;
+	uint64_t avail_addr;
+	uint64_t used_addr;
+	uint16_t i;
+
+	vring  = &dev->vrings.packed[queue_idx];
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries *
+		sizeof(struct vring_packed_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr +
+			   sizeof(struct vring_packed_desc_event),
+			   VIRTIO_VRING_ALIGN);
+	vring->num = vq->vq_nentries;
+	vring->desc_iova = vq->vq_ring_mem;
+	vring->desc = (void *)(uintptr_t)desc_addr;
+	vring->driver = (void *)(uintptr_t)avail_addr;
+	vring->device = (void *)(uintptr_t)used_addr;
+	dev->packed_queues[queue_idx].avail_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_wrap_counter = true;
+	dev->packed_queues[queue_idx].used_idx = 0;
+
+	for (i = 0; i < vring->num; i++)
+		vring->desc[i].flags = 0;
+}
+
+static void
+virtio_user_setup_queue_split(struct virtqueue *vq, struct virtio_user_dev *dev)
+{
+	uint16_t queue_idx = vq->vq_queue_index;
+	uint64_t desc_addr, avail_addr, used_addr;
+
+	desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_VRING_ALIGN);
+
+	dev->vrings.split[queue_idx].num = vq->vq_nentries;
+	dev->vrings.split[queue_idx].desc_iova = vq->vq_ring_mem;
+	dev->vrings.split[queue_idx].desc = (void *)(uintptr_t)desc_addr;
+	dev->vrings.split[queue_idx].avail = (void *)(uintptr_t)avail_addr;
+	dev->vrings.split[queue_idx].used = (void *)(uintptr_t)used_addr;
+}
+
+static int
+virtio_user_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+
+	if (vtpci_with_packed_queue(hw))
+		virtio_user_setup_queue_packed(vq, dev);
+	else
+		virtio_user_setup_queue_split(vq, dev);
+
+	if (dev->notify_area)
+		vq->notify_addr = dev->notify_area[vq->vq_queue_index];
+
+	if (virtcrypto_cq_to_vq(hw->cvq) == vq)
+		dev->scvq = virtcrypto_cq_to_vq(hw->cvq);
+
+	return 0;
+}
+
+static void
+virtio_user_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	/* For legacy devices, writing 0 to the VIRTIO_PCI_QUEUE_PFN port makes
+	 * QEMU stop the corresponding ioeventfds and reset the status of
+	 * the device.
+	 * For modern devices, setting the queue desc, avail and used addresses
+	 * in the PCI bar to 0 triggers no further behavior in QEMU.
+	 *
+	 * Here we only care about what information to deliver to vhost-user
+	 * or vhost-kernel, so we just close the ioeventfd for now.
+	 */
+
+	RTE_SET_USED(hw);
+	RTE_SET_USED(vq);
+}
+
+static void
+virtio_user_notify_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
+	uint64_t notify_data = 1;
+
+	if (!dev->notify_area) {
+		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
+			  sizeof(notify_data)) < 0)
+			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
+				    strerror(errno));
+		return;
+	} else if (!vtpci_with_feature(hw, VIRTIO_F_NOTIFICATION_DATA)) {
+		rte_write16(vq->vq_queue_index, vq->notify_addr);
+		return;
+	}
+
+	if (vtpci_with_packed_queue(hw)) {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:30]: avail index
+		 * Bit[31]: avail wrap counter
+		 */
+		notify_data = ((uint32_t)(!!(vq->vq_packed.cached_flags &
+				VRING_PACKED_DESC_F_AVAIL)) << 31) |
+				((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	} else {
+		/* Bit[0:15]: vq queue index
+		 * Bit[16:31]: avail index
+		 */
+		notify_data = ((uint32_t)vq->vq_avail_idx << 16) |
+				vq->vq_queue_index;
+	}
+	rte_write32(notify_data, vq->notify_addr);
+}
+
+const struct virtio_pci_ops crypto_virtio_user_ops = {
+	.read_dev_cfg	= virtio_user_read_dev_config,
+	.write_dev_cfg	= virtio_user_write_dev_config,
+	.reset		= virtio_user_reset,
+	.get_status	= virtio_user_get_status,
+	.set_status	= virtio_user_set_status,
+	.get_features	= virtio_user_get_features,
+	.set_features	= virtio_user_set_features,
+	.get_isr	= virtio_user_get_isr,
+	.set_config_irq	= virtio_user_set_config_irq,
+	.set_queue_irq	= virtio_user_set_queue_irq,
+	.get_queue_num	= virtio_user_get_queue_num,
+	.setup_queue	= virtio_user_setup_queue,
+	.del_queue	= virtio_user_del_queue,
+	.notify_queue	= virtio_user_notify_queue,
+};
+
+static const char * const valid_args[] = {
+#define VIRTIO_USER_ARG_QUEUES_NUM     "queues"
+	VIRTIO_USER_ARG_QUEUES_NUM,
+#define VIRTIO_USER_ARG_QUEUE_SIZE     "queue_size"
+	VIRTIO_USER_ARG_QUEUE_SIZE,
+#define VIRTIO_USER_ARG_PATH           "path"
+	VIRTIO_USER_ARG_PATH,
+#define VIRTIO_USER_ARG_SERVER_MODE    "server"
+	VIRTIO_USER_ARG_SERVER_MODE,
+	NULL
+};
+
+#define VIRTIO_USER_DEF_Q_NUM	1
+#define VIRTIO_USER_DEF_Q_SZ	256
+#define VIRTIO_USER_DEF_SERVER_MODE	0
+
+static int
+get_string_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (!value || !extra_args)
+		return -EINVAL;
+
+	*(char **)extra_args = strdup(value);
+
+	if (!*(char **)extra_args)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int
+get_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	uint64_t integer = 0;
+	if (!value || !extra_args)
+		return -EINVAL;
+	errno = 0;
+	integer = strtoull(value, NULL, 0);
+	/* extra_args keeps default value, it should be replaced
+	 * only in case of successful parsing of the 'value' arg
+	 */
+	if (errno == 0)
+		*(uint64_t *)extra_args = integer;
+	return -errno;
+}
+
+static struct rte_cryptodev *
+virtio_user_cryptodev_alloc(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev_pmd_init_params init_params = {
+		.name = "",
+		.private_data_size = sizeof(struct virtio_user_dev),
+	};
+	struct rte_cryptodev_data *data;
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	struct virtio_crypto_hw *hw;
+
+	init_params.socket_id = vdev->device.numa_node;
+	init_params.private_data_size = sizeof(struct virtio_user_dev);
+	cryptodev = rte_cryptodev_pmd_create(vdev->device.name, &vdev->device, &init_params);
+	if (cryptodev == NULL) {
+		PMD_INIT_LOG(ERR, "failed to create cryptodev vdev");
+		return NULL;
+	}
+
+	data = cryptodev->data;
+	dev = data->dev_private;
+	hw = &dev->hw;
+
+	hw->dev_id = data->dev_id;
+	VTPCI_OPS(hw) = &crypto_virtio_user_ops;
+
+	return cryptodev;
+}
+
+static void
+virtio_user_cryptodev_free(struct rte_cryptodev *cryptodev)
+{
+	rte_cryptodev_pmd_destroy(cryptodev);
+}
+
+static int
+virtio_user_pmd_probe(struct rte_vdev_device *vdev)
+{
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
+	struct rte_cryptodev *cryptodev = NULL;
+	struct rte_kvargs *kvlist = NULL;
+	struct virtio_user_dev *dev;
+	char *path = NULL;
+	int ret = -1;
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_args);
+
+	if (!kvlist) {
+		PMD_INIT_LOG(ERR, "error when parsing param");
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PATH,
+					&get_string_arg, &path) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_PATH);
+			goto end;
+		}
+	} else {
+		PMD_INIT_LOG(ERR, "arg %s is mandatory for virtio_user",
+				VIRTIO_USER_ARG_PATH);
+		goto end;
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUES_NUM) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUES_NUM,
+					&get_integer_arg, &queues) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUES_NUM);
+			goto end;
+		}
+	}
+
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_QUEUE_SIZE,
+					&get_integer_arg, &queue_size) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+					VIRTIO_USER_ARG_QUEUE_SIZE);
+			goto end;
+		}
+	}
+
+	cryptodev = virtio_user_cryptodev_alloc(vdev);
+	if (!cryptodev) {
+		PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
+		goto end;
+	}
+
+	dev = cryptodev->data->dev_private;
+	if (crypto_virtio_user_dev_init(dev, path, queues, queue_size,
+			server_mode) < 0) {
+		PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	if (crypto_virtio_dev_init(cryptodev, VIRTIO_USER_CRYPTO_PMD_GUEST_FEATURES,
+			NULL) < 0) {
+		PMD_INIT_LOG(ERR, "crypto_virtio_dev_init fails");
+		crypto_virtio_user_dev_uninit(dev);
+		virtio_user_cryptodev_free(cryptodev);
+		goto end;
+	}
+
+	rte_cryptodev_pmd_probing_finish(cryptodev);
+
+	ret = 0;
+end:
+	rte_kvargs_free(kvlist);
+	free(path);
+	return ret;
+}
+
+static int
+virtio_user_pmd_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_cryptodev *cryptodev;
+	const char *name;
+	int devid;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	PMD_DRV_LOG(INFO, "Removing %s", name);
+
+	devid = rte_cryptodev_get_dev_id(name);
+	if (devid < 0)
+		return -EINVAL;
+
+	rte_cryptodev_stop(devid);
+
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -ENODEV;
+
+	if (rte_cryptodev_pmd_destroy(cryptodev) < 0) {
+		PMD_DRV_LOG(ERR, "Failed to remove %s", name);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_map(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_map)
+		return dev->ops->dma_map(dev, addr, iova, len);
+
+	return 0;
+}
+
+static int virtio_user_pmd_dma_unmap(struct rte_vdev_device *vdev, void *addr,
+		uint64_t iova, size_t len)
+{
+	struct rte_cryptodev *cryptodev;
+	struct virtio_user_dev *dev;
+	const char *name;
+
+	if (!vdev)
+		return -EINVAL;
+
+	name = rte_vdev_device_name(vdev);
+	cryptodev = rte_cryptodev_pmd_get_named_dev(name);
+	if (cryptodev == NULL)
+		return -EINVAL;
+
+	dev = cryptodev->data->dev_private;
+
+	if (dev->ops->dma_unmap)
+		return dev->ops->dma_unmap(dev, addr, iova, len);
+
+	return 0;
+}
+
+static struct rte_vdev_driver virtio_user_driver = {
+	.probe = virtio_user_pmd_probe,
+	.remove = virtio_user_pmd_remove,
+	.dma_map = virtio_user_pmd_dma_map,
+	.dma_unmap = virtio_user_pmd_dma_unmap,
+};
+
+static struct cryptodev_driver virtio_crypto_drv;
+
+RTE_PMD_REGISTER_VDEV(crypto_virtio_user, virtio_user_driver);
+RTE_PMD_REGISTER_CRYPTO_DRIVER(virtio_crypto_drv,
+	virtio_user_driver.driver,
+	cryptodev_virtio_driver_id);
+RTE_PMD_REGISTER_ALIAS(crypto_virtio_user, crypto_virtio);
+RTE_PMD_REGISTER_PARAM_STRING(crypto_virtio_user,
+	"path=<path> "
+	"queues=<int> "
+	"queue_size=<int>");
-- 
2.25.1


^ permalink raw reply	[relevance 1%]
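
As an aside for readers following the probe path above: it is the standard rte_kvargs flow (check that a key was given exactly once, run a conversion handler, free the list on every exit path). A dependency-free sketch of the same parse-and-validate logic — the lookup helper and the integer parsing below are illustrative stand-ins, not the DPDK kvargs API itself:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the patch's get_integer_arg() handler: reject empty
 * values, trailing junk, and negative numbers. */
static int
parse_int_arg(const char *value, unsigned int *out)
{
	char *end = NULL;
	long n = strtol(value, &end, 10);

	if (end == NULL || end == value || *end != '\0' || n < 0)
		return -1;
	*out = (unsigned int)n;
	return 0;
}

/* Scan a comma-separated "key=value" devargs string for one key and
 * convert its value; roughly what rte_kvargs_process() does for us. */
static int
kv_lookup_int(const char *devargs, const char *key, unsigned int *out)
{
	size_t klen = strlen(key);
	const char *p = devargs;

	while (p != NULL && *p != '\0') {
		const char *next = strchr(p, ',');

		if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
			const char *v = p + klen + 1;
			size_t vlen = next ? (size_t)(next - v) : strlen(v);
			char buf[32];

			if (vlen == 0 || vlen >= sizeof(buf))
				return -1;
			memcpy(buf, v, vlen);
			buf[vlen] = '\0';
			return parse_int_arg(buf, out);
		}
		p = next ? next + 1 : NULL;
	}
	return -1; /* key absent, like rte_kvargs_count() == 0 */
}
```

The PMD above additionally treats an absent optional key as "keep the default", which maps to ignoring the -1 for missing keys here.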

* Re: rte_event_eth_tx_adapter_enqueue() short enqueue
  @ 2024-12-19 17:12  3%         ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2024-12-19 17:12 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Mattias Rönnblom, dev, Jerin Jacob Kollanukkaran,
	Daniel Östman, Naga Harish K S V, nils.wiberg, gyumin.hwang,
	changshik.lee, Mattias Rönnblom

On Thu, Dec 19, 2024 at 04:59:33PM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Wednesday, 27 November 2024 12.07
> > 
> > On Wed, Nov 27, 2024 at 11:53:50AM +0100, Mattias Rönnblom wrote:
> > > On 2024-11-27 11:38, Bruce Richardson wrote:
> > > > On Wed, Nov 27, 2024 at 11:03:31AM +0100, Mattias Rönnblom wrote:
> > > > > Hi.
> > > > >
> > > > > Consider the following situation:
> > > > >
> > > > > An application does
> > > > >
> > > > > rte_event_eth_tx_adapter_enqueue()
> > > > >
> > > > > and due to back-pressure or some other reason not all events/packets could
> > > > > be enqueued, and a count lower than the nb_events input parameter is
> > > > > returned.
> > > > >
> > > > > The API says that "/../ the remaining events at the end of ev[] are not
> > > > > consumed and the caller has to take care of them /../".
> > > > >
> > > > > May an event device rearrange the ev array so that any enqueue failures are
> > > > > put last in the ev array?
> > > > >
> > > > > In other words: does the "at the end of ev[]" mean "at the end of ev[] as
> > > > > the call has completed", or is the event array supposed to be untouched, and
> > > > > thus the same events are at the end both before and after the call.
> > > > >
> > > > > The ev array pointer is not const, so from that perspective it may be
> > > > > modified.
> > > > >
> > > > > This situation may occur for example when the bonding driver is used under
> > > > > the hood. The bonding driver does this kind of rearrangement on the ethdev
> > > > > level.
> > > > >
> > > >
> > > > Interesting question. I tend to think that we should not preclude this
> > > > reordering, as it should allow e.g. an eventdev which is short on space to
> > > > selectively enqueue only the high priority events.
> > > >
> > >
> > > Allowing reordering may be a little surprising to the user. At least it
> > > would be for me.
> > >
> > > Other eventdev enqueue APIs do not allow this kind of reordering (with
> > > const-marked arrays).
> > >
> > 
> > That is a good point. I forgot that the events are directly passed to the
> > enqueue functions rather than being passed as pointers, which could then be
> > reordered without modifying the underlying events.
> > 
> > > That said, I lean toward agreeing with you, since it will solve the ethdev
> > > tx_burst() mapping issue mentioned.
> > >
> > 
> > If enabling this solves a real problem, then let's allow it, despite the
> > inconsistency in the APIs. Again, though, we need to call this out in
> > the docs very prominently to avoid surprises.
> > 
> > Alternatively, do we want to add a separate API that explicitly allows
> > reordering, and update the existing API to have a const value parameter?
> > Drivers that don't implement the reordering can just not provide the
> > reordering function, and the non-reordering version can be used
> > transparently instead.
> 
> IMHO, allowing reordering with the current API would break the developer's reasonable expectations of the API.
> Breaking reasonable expectations could be considered an API break.
> 
> Some application may have a parallel array with metadata about the events.
> If the events are reordered (and the last N of them deferred to the application to process), the application can no longer index into the metadata array (to process the metadata of the deferred events).
> 
> For reference, consider the SORING proposed by Konstantin.
> 
> Regarding "const":
> It's my impression that "const" is missing in lots of APIs where the parameter must not be modified.

+1 to this.
Fortunately, if we find ones where it is missing, it's not an ABI or API
break to add in the missing const clarification. Therefore, let's add these
consts in whenever we spot them missing!

> So, developers cannot rely on "const" as an indication if a passed parameter might be modified or not.
> Obviously, "const" cannot be modified. But no "const" does not imply that the parameter is contractually modifiable by the function.
> 

Or more specifically, the developer cannot derive any information from the
absence of const in our APIs - the parameters might be modified, but then
again they may not.

Sometimes this causes problems, where application code wants to have
const-correctness but is blocked by a DPDK function where it logically
should not modify parameters e.g. a configure function, but is not
explicitly committing via const not to.

/Bruce

^ permalink raw reply	[relevance 3%]
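
Morten's parallel-metadata concern above can be made concrete. If a short enqueue may compact the input array so that failures land at the end, index i of the caller's side-band data no longer describes event i afterwards. A toy illustration — plain C, no eventdev types, and enqueue_reordering() is a hypothetical driver, not any real PMD:

```c
#include <string.h>

#define NB_EV 4

/* Hypothetical driver: "enqueues" only even-valued events and moves
 * the failed (odd) ones to the tail of the array, like the bonding
 * driver rearrangement mentioned in the thread. */
static int
enqueue_reordering(int ev[], int nb)
{
	int kept[NB_EV], failed[NB_EV];
	int nk = 0, nf = 0, i;

	for (i = 0; i < nb; i++) {
		if (ev[i] % 2 == 0)
			kept[nk++] = ev[i];
		else
			failed[nf++] = ev[i];
	}
	memcpy(ev, kept, (size_t)nk * sizeof(int));
	memcpy(ev + nk, failed, (size_t)nf * sizeof(int));
	return nk; /* events actually enqueued */
}
```

After the call, the caller is told to retry from index nk onward, but any metadata array still indexed by the original positions is stale, which is exactly the breakage Morten describes.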

* Re: [PATCH v6] graph: mcore: optimize graph search
  2024-12-16 14:49  4%           ` David Marchand
@ 2024-12-17  9:04  0%             ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2024-12-17  9:04 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: dev, jerinj, kirankumark, ndabilpuram, yanzhirun_163, Huichao Cai

On Mon, Dec 16, 2024 at 3:49 PM David Marchand
<david.marchand@redhat.com> wrote:
> $ abidiff --suppr
> /home/dmarchan/git/pub/dpdk.org/main/devtools/libabigail.abignore
> --no-added-syms --headers-dir1
> /home/dmarchan/abi/v24.11/build-gcc-shared/usr/local/include
> --headers-dir2 /home/dmarchan/builds/main/build-gcc-shared/install/usr/local/include
> /home/dmarchan/abi/v24.11/build-gcc-shared/usr/local/lib64/librte_graph.so.25.0
> /home/dmarchan/builds/main/build-gcc-shared/install/usr/local/lib64/librte_graph.so.25.1
> Functions changes summary: 0 Removed, 1 Changed (9 filtered out), 0 Added functions
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> 1 function with some indirect sub-type change:
>
>   [C] 'function bool __rte_graph_mcore_dispatch_sched_node_enqueue(rte_node*, rte_graph_rq_head*)' at rte_graph_model_mcore_dispatch.c:117:1 has some indirect sub-type changes:
>     parameter 1 of type 'rte_node*' has sub-type changes:
>       in pointed to type 'struct rte_node' at rte_graph_worker_common.h:92:1:
>         type size hasn't changed
>         1 data member changes (2 filtered):
>           anonymous data member at offset 1536 (in bits) changed from:
>             union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail;} dispatch;}
>           to:
>             union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail; rte_graph* graph;} dispatch;}
>
>
> What would be the best way to suppress this warning?
> I tried the following which seems to work, but I prefer to ask for your advice.
>
> [suppress_type]
>     name = rte_node
>     has_data_member_at = offset_of(total_sched_fail)

Gah.. I meant has_data_member_inserted_at.
But then testing with has_data_member_inserted_at, the warning is not
suppressed either...

Any help appreciated.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v6] graph: mcore: optimize graph search
  2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
@ 2024-12-16 14:49  4%           ` David Marchand
  2024-12-17  9:04  0%             ` David Marchand
  2025-01-20 14:36  4%           ` Huichao Cai
  2025-02-06  2:53 11%           ` [PATCH v7 1/1] " Huichao Cai
  2 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-12-16 14:49 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: dev, jerinj, kirankumark, ndabilpuram, yanzhirun_163, Huichao Cai

Salut Dodji,

On Mon, Dec 16, 2024 at 2:44 AM Huichao Cai <chcchc88@163.com> wrote:
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 21b8cd6113..a92ee29512 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -33,3 +33,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +[suppress_type]
> +       name = rte_node
> +       has_size_change = no
> +       has_data_member_inserted_between =
> +{offset_of(total_sched_fail), offset_of(xstat_off)}

Here is a suppression rule I suggested but does not have the intended effect.

For the context:

Before the change (that you can find below with the next hunk), we
made sure to zero the whole rte_node object at runtime in the library
allocator.
And the offset of the field next to 'dispatch' is fixed with an
explicit alignas() statement.

        /** Fast schedule area for mcore dispatch model. */
        union {
                alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
                        unsigned int lcore_id;  /**< Node running lcore. */
                        uint64_t total_sched_objs; /**< Number of objects scheduled. */
                        uint64_t total_sched_fail; /**< Number of scheduled failure. */
                } dispatch;
        };

        /** Fast path area cache line 1. */
        alignas(RTE_CACHE_LINE_MIN_SIZE)
        rte_graph_off_t xstat_off; /**< Offset to xstat counters. */

If you want the whole definition, you can have a look at:
https://git.dpdk.org/dpdk/tree/lib/graph/rte_graph_worker_common.h#n87

[...]

> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index d3ec88519d..aef0f65673 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>                         unsigned int lcore_id;  /**< Node running lcore. */
>                         uint64_t total_sched_objs; /**< Number of objects scheduled. */
>                         uint64_t total_sched_fail; /**< Number of scheduled failure. */
> +                       struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
>                 } dispatch;
>         };

Now, the patch adds a new field in the struct {} dispatch field.

Here is what abidiff reports:

$ abidiff --version
abidiff: 2.6.0

$ abidiff --suppr
/home/dmarchan/git/pub/dpdk.org/main/devtools/libabigail.abignore
--no-added-syms --headers-dir1
/home/dmarchan/abi/v24.11/build-gcc-shared/usr/local/include
--headers-dir2 /home/dmarchan/builds/main/build-gcc-shared/install/usr/local/include
/home/dmarchan/abi/v24.11/build-gcc-shared/usr/local/lib64/librte_graph.so.25.0
/home/dmarchan/builds/main/build-gcc-shared/install/usr/local/lib64/librte_graph.so.25.1
Functions changes summary: 0 Removed, 1 Changed (9 filtered out), 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 function with some indirect sub-type change:

  [C] 'function bool __rte_graph_mcore_dispatch_sched_node_enqueue(rte_node*, rte_graph_rq_head*)' at rte_graph_model_mcore_dispatch.c:117:1 has some indirect sub-type changes:
    parameter 1 of type 'rte_node*' has sub-type changes:
      in pointed to type 'struct rte_node' at rte_graph_worker_common.h:92:1:
        type size hasn't changed
        1 data member changes (2 filtered):
          anonymous data member at offset 1536 (in bits) changed from:
            union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail;} dispatch;}
          to:
            union {struct {unsigned int lcore_id; uint64_t total_sched_objs; uint64_t total_sched_fail; rte_graph* graph;} dispatch;}


What would be the best way to suppress this warning?
I tried the following which seems to work, but I prefer to ask for your advice.

[suppress_type]
    name = rte_node
    has_data_member_at = offset_of(total_sched_fail)


Thanks.


-- 
David Marchand


^ permalink raw reply	[relevance 4%]

* Community CI Meeting Minutes - December 12, 2024
@ 2024-12-16  4:18  3% Patrick Robb
  0 siblings, 0 replies; 200+ results
From: Patrick Robb @ 2024-12-16  4:18 UTC (permalink / raw)
  To: ci; +Cc: dev, dts

[-- Attachment #1: Type: text/plain, Size: 3479 bytes --]

#####################################################################
December 12, 2024
Attendees
1. Patrick Robb
2. Aaron Conole
3. Paul Szczepanek
4. Luca Vizzarro

#####################################################################
Minutes

=====================================================================
General Announcements
* Dts roadmap:
https://docs.google.com/document/d/1doTZOOpkv4D5P2w6K7fEJpa_CjzrlMl3mCeDBWtxnko/edit?tab=t.0
* AWS:
   * They have confirmed that AWS ARM Graviton systems will be included in
CI testing
* Aaron: It would be ideal if we could update some of the scripts in
dpdk-ci repo such that the systems being brought over across labs are more
uniform
   * Polling patchwork is one example of a script requiring an update

=====================================================================
CI Status

---------------------------------------------------------------------
UNH-IOL Community Lab
* ABI Testing: UNH is building new container images now.
   * Libabigail 2.6 should be used going forward, which comes with a new
dependency “libxxhash”
* Marvell SDK 12: We are still having trouble setting up the Marvell crypto
devices, so we are upgrading to the latest SDK and reflashing the board.
* Maintenance:
   * Migrating and upgrading our Jenkins instance next Monday at 15:00 UTC
- downtime expected to be a few hours.
* Working on turning on new DTS in CI testing currently.
* “Retest Button” for periodic testing is in code review now, expected to
be available on the DPDK Dashboard next week.
* Dpdk-ci repo:
   * Need a review and merge of the create_series_artifact.py patch which
adds the meson check script.

---------------------------------------------------------------------
Intel Lab
* None

---------------------------------------------------------------------
Github Actions
* None

---------------------------------------------------------------------
Loongarch Lab
* None

=====================================================================
DTS Improvements & Test Development
* Patrick will do a review for the Ruff patch
* Paramiko:
   * bug has been resolved by Nick, by setting a while loop in which we
wait for the expected prompt to enter the paramiko buffer.
   * There is another Paramiko race condition in which a file read/write
error bubbles up at the conclusion of the DTS execution (may be a race
condition). This does not affect the testrun but it pollutes the logs, so
we should investigate this during the 25.03 cycle.
* Pending patches from the previous release: Work to review and merge Ruff
very quickly, then rebase all the old patches (in groups) and quickly merge
those groups.
   * Will rebase and review 1 old patchseries per developer per week in
December/January, and we should be
* Scapy/MyPy updates:
   * The new Scapy version includes better type support - we will update
within the 25.03 cycle.
* Poetry.lock file is committed to the repo (it is not included in the
.gitignore). After some discussion we have confirmed that this is correct,
and that maintainers should periodically update the poetry.lock file in the
remote repo.
   * Lock file makes dependency resolution faster
   * Lock file provides a universal lock of dependencies across python
versions, across time

=====================================================================
Any other business
* Next Meeting Jan 16, 2025


^ permalink raw reply	[relevance 3%]

* DTS WG Meeting Minutes - December 5, 2024
@ 2024-12-16  4:14  4% Patrick Robb
  0 siblings, 0 replies; 200+ results
From: Patrick Robb @ 2024-12-16  4:14 UTC (permalink / raw)
  To: dev; +Cc: ci, dts

[-- Attachment #1: Type: text/plain, Size: 1807 bytes --]

#####################################################################
December 5, 2024
Attendees
* Patrick Robb
* Paul Szczepanek
* Luca Vizzarro

#####################################################################
Minutes

=====================================================================
General Discussion
* CI Testing labs discussion:
   * ABI testing can begin again
      * UNH is rebuilding its container images with the new v25 ABI
reference.
   * Still debugging some issues with cryptodev device creation on the
Marvell CN10K device. Going to rebuild the Marvell SDK with version 12 and
reflash the board.
   * AMD donated servers have been rack mounted, provisioned with ubuntu
24.04 and DTS/DPDK dependencies.
   * ARM Grace server delivery date is 12/23
* Patrick and Aaron had a call with AWS about setting up a CI “Lab” for AWS
which would do per patch testing for the test-report mailing list
* xSightLabs got a DTS demo from Patrick - they are using both legacy DTS
and new DTS in parallel
* December 26 CI meeting is cancelled
* January 2 DTS meeting is cancelled
* DPDK 24.11 has been released
* 25.03 roadmap status:
https://docs.google.com/document/d/1doTZOOpkv4D5P2w6K7fEJpa_CjzrlMl3mCeDBWtxnko/edit

=====================================================================
Patch discussions
* Ruff:
   * The default rules are too minimal, but we don’t need to use literally
every rule. Luca will look for a recommended set of rules to use online
* Flow rule dataclass v5 series is submitted
* Bugzilla discussions

=====================================================================
Any other business
* Patrick will invite the Microsoft Azure testers to the DTS meetings
   * mamcgove@microsoft.com
* Next meeting Dec 19, 2024


^ permalink raw reply	[relevance 4%]

* [PATCH v6] graph: mcore: optimize graph search
  2024-12-13  2:21 10%       ` [PATCH v5] graph: mcore: optimize graph search Huichao Cai
  2024-12-13 14:36  3%         ` David Marchand
@ 2024-12-16  1:43 11%         ` Huichao Cai
  2024-12-16 14:49  4%           ` David Marchand
                             ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Huichao Cai @ 2024-12-16  1:43 UTC (permalink / raw)
  Cc: dev, jerinj, kirankumark, ndabilpuram, yanzhirun_163

The function __rte_graph_mcore_dispatch_sched_node_enqueue searches
for the graph with a slow linear loop. Modify the search logic to
record the result of the first search and reuse that record for
subsequent searches, improving search speed.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 devtools/libabigail.abignore               |  5 +++++
 doc/guides/rel_notes/release_25_03.rst     |  2 ++
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 21b8cd6113..a92ee29512 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -33,3 +33,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+[suppress_type]
+       name = rte_node
+       has_size_change = no
+       has_data_member_inserted_between =
+{offset_of(total_sched_fail), offset_of(xstat_off)}
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 426dfcd982..55ffe8170d 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -102,6 +102,8 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* graph: Added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
+
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index d3ec88519d..aef0f65673 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 
-- 
2.27.0


^ permalink raw reply	[relevance 11%]
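
The optimization in this patch is a generic memoized-search pattern: cache the element found by the last linear walk and validate it by key on the next call, falling back to the full walk only on a miss. A self-contained sketch with toy types — graph_lookup_cached() plays the role of the lookup inside __rte_graph_mcore_dispatch_sched_node_enqueue():

```c
#include <stddef.h>

struct graph {
	unsigned int lcore_id;
	struct graph *next;
};

struct node {
	unsigned int lcore_id;
	struct graph *cached; /* result of the previous search */
};

/* First try the cached graph; only on a miss, fall back to the
 * linear walk and refresh the cache, as the patch does. */
static struct graph *
graph_lookup_cached(struct node *n, struct graph *head)
{
	struct graph *g = n->cached;

	if (g == NULL || g->lcore_id != n->lcore_id) {
		for (g = head; g != NULL; g = g->next)
			if (g->lcore_id == n->lcore_id)
				break;
		n->cached = g;
	}
	return g;
}
```

Whether the cached pointer stays valid across changes to the graph list is the kind of question a reviewer would still check; the sketch only shows the fast-path mechanics.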

* Re: [PATCH v5] graph: mcore: optimize graph search
  2024-12-13  2:21 10%       ` [PATCH v5] graph: mcore: optimize graph search Huichao Cai
@ 2024-12-13 14:36  3%         ` David Marchand
  2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
  1 sibling, 0 replies; 200+ results
From: David Marchand @ 2024-12-13 14:36 UTC (permalink / raw)
  To: Huichao Cai
  Cc: jerinj, kirankumark, ndabilpuram, yanzhirun_163, dev, Thomas Monjalon

On Fri, Dec 13, 2024 at 3:22 AM Huichao Cai <chcchc88@163.com> wrote:
> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index d3ec88519d..aef0f65673 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>                         unsigned int lcore_id;  /**< Node running lcore. */
>                         uint64_t total_sched_objs; /**< Number of objects scheduled. */
>                         uint64_t total_sched_fail; /**< Number of scheduled failure. */
> +                       struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
>                 } dispatch;
>         };

The rte_node struct size is not changed with this patch.
In v24.11, rte_node objects are populated/allocated in
graph_nodes_populate which zero's the whole rte_node.
So this change looks safe from an ABI compat with v24.11 pov.

However, we need to waive the warning from libabigail:
http://mails.dpdk.org/archives/test-report/2024-December/834167.html

Please add a temporary exception in devtools/libabigail.abignore.

It should be something like:
[suppress_type]
       name = rte_node
       has_size_change = no
       has_data_member_inserted_between =
{offset_of(total_sched_fail), offset_of(xstat_off)}


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* [PATCH v5] graph: mcore: optimize graph search
  2024-11-14  8:45  5%     ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
  2024-11-14  8:45  5%       ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-12-13  2:21 10%       ` Huichao Cai
  2024-12-13 14:36  3%         ` David Marchand
  2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
  1 sibling, 2 replies; 200+ results
From: Huichao Cai @ 2024-12-13  2:21 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue searches
for the graph with a slow linear loop. Modify the search logic to
record the result of the first search and reuse that record for
subsequent searches, improving search speed.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_25_03.rst     |  2 ++
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
index 426dfcd982..55ffe8170d 100644
--- a/doc/guides/rel_notes/release_25_03.rst
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -102,6 +102,8 @@ ABI Changes
 
 * No ABI change that would break compatibility with 24.11.
 
+* graph: Added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
+
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index d3ec88519d..aef0f65673 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 
-- 
2.27.0


^ permalink raw reply	[relevance 10%]

* Re: [PATCH 0/3] Defer lcore variables allocation
  2024-12-09 17:40  3%       ` David Marchand
@ 2024-12-10  9:41  0%         ` Mattias Rönnblom
  0 siblings, 0 replies; 200+ results
From: Mattias Rönnblom @ 2024-12-10  9:41 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, thomas, frode.nordahl, mattias.ronnblom

On 2024-12-09 18:40, David Marchand wrote:
> On Mon, Dec 9, 2024 at 4:39 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>> On 2024-12-09 12:03, David Marchand wrote:
>>> On Fri, Dec 6, 2024 at 12:02 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>>>> On 2024-12-05 18:57, David Marchand wrote:
>>>>> As I had reported in rc2, the lcore variables allocation have a
>>>>> noticeable impact on applications consuming DPDK, even when such
>>>>> applications does not use DPDK, or use features associated to
>>>>> some lcore variables.
>>>>>
>>>>> While the amount has been reduced in a rush before rc2,
>>>>> there are still cases when the increased memory footprint is noticed
>>>>> like in scaling tests.
>>>>> See https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2090931
>>>>>
>>>>
>>>> What this bug report fails to mention is that it only affects
>>>> applications using locked memory.
>>>
>>> - By locked memory, are you referring to mlock() and friends?
>>> No ovsdb binary calls them, only the datapath cares about mlocking.
>>>
>>>
>>> - At a minimum, I understand the lcore var change introduced an
>>> increase in memory of 4kB * 128 (getpagesize() * RTE_MAX_LCORES),
>>> since lcore_var_alloc() calls memset() of the lcore var size, for
>>> every lcore.
>>>
>>
>> Yes, that is my understanding. It's also consistent with the
>> measurements I've posted on this list.
>>
>>> In this unit test where 1000 processes are kept alive in parallel,
>>> this means memory consumption increased by 512k * 1000, so ~500M at
>>> least.
>>> This amount of memory is probably significant in a resource-restrained
>>> env like a (Ubuntu) CI.
>>>
>>>
>>
>> I wouldn't expect thousands of concurrent processes in a
>> resource-constrained system. Sounds wasteful indeed. But sure, there may
>> well be scenarios where this make sense.
>>
>>> - I went and traced this unit tests on my laptop by monitoring
>>> kmem:mm_page_alloc, though there may be a better metrics when it comes
>>> to memory consumption.
>>>
>>> # dir=build; perf stat -e kmem:mm_page_alloc -- tests/testsuite -C
>>> $dir/tests AUTOTEST_PATH=$dir/utilities:$dir/vswitchd:$dir/ovsdb:$dir/vtep:$dir/tests:$dir/ipsec::
>>> 2154
>>>
>>> Which gives:
>>> - 1 635 489      kmem:mm_page_alloc for v23.11
>>> - 5 777 043      kmem:mm_page_alloc for v24.11
>>>
>>
>> Interesting. What is vm.overcommit_memory set to?
> 
> # cat /proc/sys/vm/overcommit_memory
> 0
> 
> And I am not sure what is being used in Ubuntu CI.
> 
> But the problem is, in the end, simpler.
> 
> [snip]
> 
>>
>>> There is a 4M difference, where I would expect 128k.
>>> So something more happens, than a simple page allocation per lcore,
>>> though I fail to understand what.
> 
> Isolating the perf events for one process of this huge test, I counted
> 4878 page alloc calls.
> From them, 4108 had rte_lcore_var_alloc in their calling stack which
> is unexpected.
> 
> After spending some time reading glibc, I noticed alloc_perturb().
> *bingo*, I remembered that OVS unit tests are run with MALLOC_PERTURB_
> (=165 after double checking OVS sources).
> 
> """
> Tunable: glibc.malloc.perturb
> 
> This tunable supersedes the MALLOC_PERTURB_ environment variable and
> is identical in features.
> 
> If set to a non-zero value, memory blocks are initialized with values
> depending on some low order bits of this tunable when they are
> allocated (except when allocated by calloc) and freed. This can be
> used to debug the use of uninitialized or freed heap memory. Note that
> this option does not guarantee that the freed block will have any
> specific values. It only guarantees that the content the block had
> before it was freed will be overwritten.
> 
> The default value of this tunable is ‘0’.
> """
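
The quoted behavior is easy to emulate: a perturbing allocator writes a pattern over every block it returns, which is what forces the kernel to back the whole 16M lcore-variable buffer with real pages. A toy wrapper follows — this is not glibc's code, and the byte^0xff detail on the allocation side is my reading of glibc's alloc_perturb(), so treat it as an assumption:

```c
#include <stdlib.h>
#include <string.h>

static unsigned char perturb_byte = 165; /* MALLOC_PERTURB_=165, as in the OVS tests */

/* Toy equivalent of glibc.malloc.perturb on the allocation side:
 * fill the new block, thereby touching all of its pages. */
static void *
malloc_perturbed(size_t sz)
{
	void *p = malloc(sz);

	if (p != NULL && perturb_byte != 0)
		memset(p, perturb_byte ^ 0xff, sz);
	return p;
}
```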
> 

OK, excellent work, detective. :)

Do you have a workaround for this issue, so that this test suite will work
with vanilla DPDK 24.11? I guess OVS wants to keep the PERTURB settings.

The fix you've suggested will solve this issue for the no-DPDK-usage 
case. I'm guessing allocating the first lcore var block off the BSS 
(e.g., via a static variable) would as well, in addition to solving 
similar cases where there is "light" DPDK usage (i.e., 
rte_eal_init() is called, but with no real app).

> Now, reproducing this out of the test:
> 
> $ perf stat -e kmem:mm_page_alloc -- ./build/ovsdb/ovsdb-client --help
>> /dev/null
>   Performance counter stats for './build/ovsdb/ovsdb-client --help':
>                 810      kmem:mm_page_alloc
>         0,003277941 seconds time elapsed
>         0,003260000 seconds user
>         0,000000000 seconds sys
> 
> $ MALLOC_PERTURB_=165 perf stat -e kmem:mm_page_alloc --
> ./build/ovsdb/ovsdb-client --help >/dev/null
>   Performance counter stats for './build/ovsdb/ovsdb-client --help':
>               4 789      kmem:mm_page_alloc
>         0,008766171 seconds time elapsed
>         0,000976000 seconds user
>         0,007794000 seconds sys
> 
> So the issue is not triggered by mlock'd memory, but by the whole
> buffer of 16M for lcore variables being touched by a glibc debugging
> feature.
> 
> And in Ubuntu CI, it translated to requesting 16G.
> 
>>>
>>>
>>> Btw, just focusing on lcore var, I did two more tests:
>>> - 1 606 998      kmem:mm_page_alloc for v24.11 + revert all lcore var changes.
>>> - 1 634 606      kmem:mm_page_alloc for v24.11 + current series with
>>> postponed allocations.
>>>
>>>
>>
>> If one move initialization to shared object constructors (from having
>> been at some later time), and then end up not running that
>> initialization code at all (e.g., DPDK is not used), those code pages
>> will increase RSS. That might well hurt more than the lcore variable
>> memory itself, depending on how much code is run.
>>
>> However, such read-only pages can be replaced with something more useful
>> if the system is under memory pressure, so they aren't really a big
>> issue as far as (real) memory footprint is concerned.
>>
>> Just linking to DPDK (and its dependencies) already came with a 1-7 MB
>> RSS penalty, prior to lcore variables. I wonder how much of that goes
>> away if all RTE_INIT() type constructors are removed.
> 
> Regardless of the RSS change, completely removing constructors is not simple.
> Postponing *all* existing constructors from DPDK code would be an ABI
> breakage, as RTE_INIT has a priority notion and application
> callbacks using RTE_INIT may rely on it.

Agreed.

> Just deferring "unprioritised" constructors would be doable on paper,
> but the location in rte_eal_init to which those are deferred would
> have to be carefully evaluated (with -d plugins in mind).
> 
> 

It seems to me that a reworking of this area should have a bigger scope 
than just addressing this issue.

RTE_INIT() should probably be deprecated, and DPDK shouldn't encourage 
the use of shared-object level constructors.

For dynamically loaded modules (-d), there needs to be some kind of 
replacement, serving the same function.

There should probably be some way to hook into the initialization 
process (available also for apps), which should all happen at 
rte_eal_init() (or later).

Does the priority concept make sense? At least conceptually, the 
initialization should be based off a dependency graph (DAG).

You could reduce the priorities to a number of named stages (just like 
in FreeBSD or Linux). A minor tweak to the current model. However, in 
DPDK, it would be useful if a generic facility could be used by apps, 
and thus the number and names of the stages are open ended (unlike the 
UNIX kernels').

You could rely on explicit initialization alone, where each module 
initializes its dependencies. That would lead to repeated init function 
calls on the same module, unless there's some init framework help from 
EAL to prevent that. Overall, that would lead to more code, where 
various higher-level modules need to initialize many dependencies.

Maybe the DAG is available at the build (meson) level, and thus the code 
can be generated from that?

Some random thoughts.


^ permalink raw reply	[relevance 0%]

* Re: [PATCH RESEND v7 2/5] ethdev: fix skip valid port in probing callback
  @ 2024-12-10  1:50  0%     ` lihuisong (C)
  2025-01-10  3:21  0%       ` lihuisong (C)
  0 siblings, 1 reply; 200+ results
From: lihuisong (C) @ 2024-12-10  1:50 UTC (permalink / raw)
  To: thomas, ferruh.yigit, Stephen Hemminger
  Cc: dev, fengchengwen, liuyonglong, andrew.rybchenko, Somnath Kotur,
	Ajit Khaparde, Dariusz Sosnowski, Suanming Mou, Matan Azrad,
	Ori Kam, Viacheslav Ovsiienko

Hi Ferruh, Stephen and Thomas,

Can you take a look at this patch? After all, it is an issue in the ethdev 
layer.
It is also the outcome of what we discussed with Thomas and Ferruh before.
Please go back to this thread. If we don't need this patch, please let 
me know and I will drop it from my upstreaming list.

/Huisong


在 2024/9/29 13:52, Huisong Li 写道:
> The event callback in an application may use the macro RTE_ETH_FOREACH_DEV to
> iterate over all enabled ports to do something (like verifying the port id
> validity) when receiving a probing event. If the ethdev state of a port is
> not RTE_ETH_DEV_UNUSED, this port will be considered as a valid port.
>
> However, this state is set to RTE_ETH_DEV_ATTACHED after pushing the probing
> event. It means that the probing callback will skip this port. But this
> assignment cannot be moved before the probing notification. See
> commit be8cd210379a ("ethdev: fix port probing notification")
>
> So this patch has to add a new state, RTE_ETH_DEV_ALLOCATED. Set the ethdev
> state to RTE_ETH_DEV_ALLOCATED before pushing the probing event and set it to
> RTE_ETH_DEV_ATTACHED once definitely probed. And this port is valid if its
> device state is 'ALLOCATED' or 'ATTACHED'.
>
> In addition, the new state has to be placed behind 'REMOVED' to avoid an ABI
> break. Fortunately, this ethdev state is internal and applications cannot
> access it directly. So this patch encapsulates an API, rte_eth_dev_is_used,
> for ethdev or PMDs to call, eliminating concerns about comparing this state
> enum value directly.
>
> Fixes: be8cd210379a ("ethdev: fix port probing notification")
> Cc: stable@dpdk.org
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>   drivers/net/bnxt/bnxt_ethdev.c |  3 ++-
>   drivers/net/mlx5/mlx5.c        |  2 +-
>   lib/ethdev/ethdev_driver.c     | 13 ++++++++++---
>   lib/ethdev/ethdev_driver.h     | 12 ++++++++++++
>   lib/ethdev/ethdev_pci.h        |  2 +-
>   lib/ethdev/rte_class_eth.c     |  2 +-
>   lib/ethdev/rte_ethdev.c        |  4 ++--
>   lib/ethdev/rte_ethdev.h        |  4 +++-
>   lib/ethdev/version.map         |  1 +
>   9 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index c6ad764813..7401dcd8b5 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -6612,7 +6612,8 @@ bnxt_dev_uninit(struct rte_eth_dev *eth_dev)
>   
>   	PMD_DRV_LOG(DEBUG, "Calling Device uninit\n");
>   
> -	if (eth_dev->state != RTE_ETH_DEV_UNUSED)
> +
> +	if (rte_eth_dev_is_used(eth_dev->state))
>   		bnxt_dev_close_op(eth_dev);
>   
>   	return 0;
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 8d266b0e64..0df49e1f69 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -3371,7 +3371,7 @@ mlx5_eth_find_next(uint16_t port_id, struct rte_device *odev)
>   	while (port_id < RTE_MAX_ETHPORTS) {
>   		struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>   
> -		if (dev->state != RTE_ETH_DEV_UNUSED &&
> +		if (rte_eth_dev_is_used(dev->state) &&
>   		    dev->device &&
>   		    (dev->device == odev ||
>   		     (dev->device->driver &&
> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> index c335a25a82..a87dbb00ff 100644
> --- a/lib/ethdev/ethdev_driver.c
> +++ b/lib/ethdev/ethdev_driver.c
> @@ -55,8 +55,8 @@ eth_dev_find_free_port(void)
>   	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
>   		/* Using shared name field to find a free port. */
>   		if (eth_dev_shared_data->data[i].name[0] == '\0') {
> -			RTE_ASSERT(rte_eth_devices[i].state ==
> -				   RTE_ETH_DEV_UNUSED);
> +			RTE_ASSERT(!rte_eth_dev_is_used(
> +					rte_eth_devices[i].state));
>   			return i;
>   		}
>   	}
> @@ -221,11 +221,18 @@ rte_eth_dev_probing_finish(struct rte_eth_dev *dev)
>   	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
>   		eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>   
> +	dev->state = RTE_ETH_DEV_ALLOCATED;
>   	rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_NEW, NULL);
>   
>   	dev->state = RTE_ETH_DEV_ATTACHED;
>   }
>   
> +bool rte_eth_dev_is_used(uint16_t dev_state)
> +{
> +	return dev_state == RTE_ETH_DEV_ALLOCATED ||
> +		dev_state == RTE_ETH_DEV_ATTACHED;
> +}
> +
>   int
>   rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>   {
> @@ -243,7 +250,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>   	if (ret != 0)
>   		return ret;
>   
> -	if (eth_dev->state != RTE_ETH_DEV_UNUSED)
> +	if (rte_eth_dev_is_used(eth_dev->state))
>   		rte_eth_dev_callback_process(eth_dev,
>   				RTE_ETH_EVENT_DESTROY, NULL);
>   
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index abed4784aa..aa35b65848 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1704,6 +1704,18 @@ int rte_eth_dev_callback_process(struct rte_eth_dev *dev,
>   __rte_internal
>   void rte_eth_dev_probing_finish(struct rte_eth_dev *dev);
>   
> +/**
> + * Check if a Ethernet device state is used or not
> + *
> + * @param dev_state
> + *   The state of the Ethernet device
> + * @return
> + *   - true if the state of the Ethernet device is allocated or attached
> + *   - false if this state is neither allocated nor attached
> + */
> +__rte_internal
> +bool rte_eth_dev_is_used(uint16_t dev_state);
> +
>   /**
>    * Create memzone for HW rings.
>    * malloc can't be used as the physical address is needed.
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index ec4f731270..05dec6716b 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -179,7 +179,7 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>   	 * eth device has been released.
>   	 */
>   	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> -	    eth_dev->state == RTE_ETH_DEV_UNUSED)
> +	    !rte_eth_dev_is_used(eth_dev->state))
>   		return 0;
>   
>   	if (dev_uninit) {
> diff --git a/lib/ethdev/rte_class_eth.c b/lib/ethdev/rte_class_eth.c
> index b52f1dd9f2..81e70670d9 100644
> --- a/lib/ethdev/rte_class_eth.c
> +++ b/lib/ethdev/rte_class_eth.c
> @@ -118,7 +118,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
>   	const struct rte_kvargs *kvlist = arg->kvlist;
>   	unsigned int pair;
>   
> -	if (edev->state == RTE_ETH_DEV_UNUSED)
> +	if (!rte_eth_dev_is_used(edev->state))
>   		return -1;
>   	if (arg->device != NULL && arg->device != edev->device)
>   		return -1;
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index a1f7efa913..4dc66abb7b 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -349,7 +349,7 @@ uint16_t
>   rte_eth_find_next(uint16_t port_id)
>   {
>   	while (port_id < RTE_MAX_ETHPORTS &&
> -			rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
> +	       !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>   		port_id++;
>   
>   	if (port_id >= RTE_MAX_ETHPORTS)
> @@ -408,7 +408,7 @@ rte_eth_dev_is_valid_port(uint16_t port_id)
>   	int is_valid;
>   
>   	if (port_id >= RTE_MAX_ETHPORTS ||
> -	    (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED))
> +	    !rte_eth_dev_is_used(rte_eth_devices[port_id].state))
>   		is_valid = 0;
>   	else
>   		is_valid = 1;
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index a9f92006da..9cc37e8cde 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2083,10 +2083,12 @@ typedef uint16_t (*rte_tx_callback_fn)(uint16_t port_id, uint16_t queue,
>   enum rte_eth_dev_state {
>   	/** Device is unused before being probed. */
>   	RTE_ETH_DEV_UNUSED = 0,
> -	/** Device is attached when allocated in probing. */
> +	/** Device is attached when definitely probed. */
>   	RTE_ETH_DEV_ATTACHED,
>   	/** Device is in removed state when plug-out is detected. */
>   	RTE_ETH_DEV_REMOVED,
> +	/** Device is allocated and is set before reporting new event. */
> +	RTE_ETH_DEV_ALLOCATED,
>   };
>   
>   struct rte_eth_dev_sriov {
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index f63dc32aa2..6ecf1ab89d 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -349,6 +349,7 @@ INTERNAL {
>   	rte_eth_dev_get_by_name;
>   	rte_eth_dev_is_rx_hairpin_queue;
>   	rte_eth_dev_is_tx_hairpin_queue;
> +	rte_eth_dev_is_used;
>   	rte_eth_dev_probing_finish;
>   	rte_eth_dev_release_port;
>   	rte_eth_dev_internal_reset;

^ permalink raw reply	[relevance 0%]

* Re: [PATCH 0/3] Defer lcore variables allocation
  @ 2024-12-09 17:40  3%       ` David Marchand
  2024-12-10  9:41  0%         ` Mattias Rönnblom
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-12-09 17:40 UTC (permalink / raw)
  To: Mattias Rönnblom; +Cc: dev, thomas, frode.nordahl, mattias.ronnblom

On Mon, Dec 9, 2024 at 4:39 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> On 2024-12-09 12:03, David Marchand wrote:
> > On Fri, Dec 6, 2024 at 12:02 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> >> On 2024-12-05 18:57, David Marchand wrote:
> >>> As I had reported in rc2, the lcore variables allocation have a
> >>> noticeable impact on applications consuming DPDK, even when such
> >>> applications does not use DPDK, or use features associated to
> >>> some lcore variables.
> >>>
> >>> While the amount has been reduced in a rush before rc2,
> >>> there are still cases when the increased memory footprint is noticed
> >>> like in scaling tests.
> >>> See https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2090931
> >>>
> >>
> >> What this bug report fails to mention is that it only affects
> >> applications using locked memory.
> >
> > - By locked memory, are you referring to mlock() and friends?
> > No ovsdb binary calls them, only the datapath cares about mlocking.
> >
> >
> > - At a minimum, I understand the lcore var change introduced an
> > increase in memory of 4kB * 128 (getpagesize() * RTE_MAX_LCORES),
> > since lcore_var_alloc() calls memset() of the lcore var size, for
> > every lcore.
> >
>
> Yes, that is my understanding. It's also consistent with the
> measurements I've posted on this list.
>
> > In this unit test where 1000 processes are kept alive in parallel,
> > this means memory consumption increased by 512k * 1000, so ~500M at
> > least.
> > This amount of memory is probably significant in a resource-restrained
> > env like a (Ubuntu) CI.
> >
> >
>
> I wouldn't expect thousands of concurrent processes in a
> resource-constrained system. Sounds wasteful indeed. But sure, there may
> well be scenarios where this make sense.
>
> > - I went and traced this unit tests on my laptop by monitoring
> > kmem:mm_page_alloc, though there may be a better metrics when it comes
> > to memory consumption.
> >
> > # dir=build; perf stat -e kmem:mm_page_alloc -- tests/testsuite -C
> > $dir/tests AUTOTEST_PATH=$dir/utilities:$dir/vswitchd:$dir/ovsdb:$dir/vtep:$dir/tests:$dir/ipsec::
> > 2154
> >
> > Which gives:
> > - 1 635 489      kmem:mm_page_alloc for v23.11
> > - 5 777 043      kmem:mm_page_alloc for v24.11
> >
>
> Interesting. What is vm.overcommit_memory set to?

# cat /proc/sys/vm/overcommit_memory
0

And I am not sure what is being used in Ubuntu CI.

But the problem is, in the end, simpler.

[snip]

>
> > There is a 4M difference, where I would expect 128k.
> > So something more happens, than a simple page allocation per lcore,
> > though I fail to understand what.

Isolating the perf events for one process of this huge test, I counted
4878 page alloc calls.
Of them, 4108 had rte_lcore_var_alloc in their call stack, which
is unexpected.

After spending some time reading glibc, I noticed alloc_perturb().
*bingo*, I remembered that OVS unit tests are run with MALLOC_PERTURB_
(=165 after double checking OVS sources).

"""
Tunable: glibc.malloc.perturb

This tunable supersedes the MALLOC_PERTURB_ environment variable and
is identical in features.

If set to a non-zero value, memory blocks are initialized with values
depending on some low order bits of this tunable when they are
allocated (except when allocated by calloc) and freed. This can be
used to debug the use of uninitialized or freed heap memory. Note that
this option does not guarantee that the freed block will have any
specific values. It only guarantees that the content the block had
before it was freed will be overwritten.

The default value of this tunable is ‘0’.
"""

Now, reproducing this out of the test:

$ perf stat -e kmem:mm_page_alloc -- ./build/ovsdb/ovsdb-client --help
>/dev/null
 Performance counter stats for './build/ovsdb/ovsdb-client --help':
               810      kmem:mm_page_alloc
       0,003277941 seconds time elapsed
       0,003260000 seconds user
       0,000000000 seconds sys

$ MALLOC_PERTURB_=165 perf stat -e kmem:mm_page_alloc --
./build/ovsdb/ovsdb-client --help >/dev/null
 Performance counter stats for './build/ovsdb/ovsdb-client --help':
             4 789      kmem:mm_page_alloc
       0,008766171 seconds time elapsed
       0,000976000 seconds user
       0,007794000 seconds sys

So the issue is not triggered by mlock'd memory, but by the whole
buffer of 16M for lcore variables being touched by a glibc debugging
feature.

And in Ubuntu CI, it translated to requesting 16G.

> >
> >
> > Btw, just focusing on lcore var, I did two more tests:
> > - 1 606 998      kmem:mm_page_alloc for v24.11 + revert all lcore var changes.
> > - 1 634 606      kmem:mm_page_alloc for v24.11 + current series with
> > postponed allocations.
> >
> >
>
> If one move initialization to shared object constructors (from having
> been at some later time), and then end up not running that
> initialization code at all (e.g., DPDK is not used), those code pages
> will increase RSS. That might well hurt more than the lcore variable
> memory itself, depending on how much code is run.
>
> However, such read-only pages can be replaced with something more useful
> if the system is under memory pressure, so they aren't really a big
> issue as far as (real) memory footprint is concerned.
>
> Just linking to DPDK (and its dependencies) already came with a 1-7 MB
> RSS penalty, prior to lcore variables. I wonder how much of that goes
> away if all RTE_INIT() type constructors are removed.

Regardless of the RSS change, completely removing constructors is not simple.
Postponing *all* existing constructors from DPDK code would be an ABI
breakage, as RTE_INIT has a priority notion and application
callbacks using RTE_INIT may rely on it.
Just deferring "unprioritised" constructors would be doable on paper,
but the location in rte_eal_init to which those are deferred would
have to be carefully evaluated (with -d plugins in mind).


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* RE: [PATCH v16 1/4] lib: add generic support for reading PMU events
  @ 2024-12-06 18:15  3%       ` Konstantin Ananyev
  2025-01-07  7:45  0%         ` Tomasz Duszynski
  0 siblings, 1 reply; 200+ results
From: Konstantin Ananyev @ 2024-12-06 18:15 UTC (permalink / raw)
  To: Tomasz Duszynski, Thomas Monjalon
  Cc: Ruifeng.Wang, bruce.richardson, david.marchand, dev, jerinj,
	konstantin.v.ananyev, mattias.ronnblom, mb, roretzla, stephen,
	zhoumin



> 
> Add support for programming PMU counters and reading their values
> in runtime bypassing kernel completely.
> 
> This is especially useful in cases where CPU cores are isolated
> i.e run dedicated tasks. In such cases one cannot use standard
> perf utility without sacrificing latency and performance.
> 
> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
> ---

Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

As possible future enhancements - I think it would be useful to
make the control-path API MT safe, and probably to hide some of
the exposed internal structures (rte_pmu_event_group, etc.) inside the .c
(to minimize the surface for possible ABI breakage).

> --
> 2.34.1


^ permalink raw reply	[relevance 3%]

* Re: [RFC v3 2/2] ethdev: introduce the cache stashing hints API
  2024-12-05 15:40  3%       ` David Marchand
@ 2024-12-05 21:00  0%         ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-12-05 21:00 UTC (permalink / raw)
  To: David Marchand
  Cc: Wathsala Vithanage, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, dev, nd, Honnappa Nagarahalli, Dhruv Tripathi

[-- Attachment #1: Type: text/plain, Size: 6686 bytes --]

You're right, my test was crude. I just did a build and looked at the
symbol table of a statically linked binary.
I was confused since the pointer is exposed but not the data structure.

On Thu, Dec 5, 2024, 07:40 David Marchand <david.marchand@redhat.com> wrote:

> On Tue, Dec 3, 2024 at 10:13 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Mon, 21 Oct 2024 01:52:46 +0000
> > Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:
> >
> > > Extend the ethdev library to enable the stashing of different data
> > > objects, such as the ones listed below, into CPU caches directly
> > > from the NIC.
> > >
> > > - Rx/Tx queue descriptors
> > > - Rx packets
> > > - Packet headers
> > > - packet payloads
> > > - Data of a packet at an offset from the start of the packet
> > >
> > > The APIs are designed in a hardware/vendor agnostic manner such that
> > > supporting PMDs could use any capabilities available in the underlying
> > > hardware for fine-grained stashing of data objects into a CPU cache
> > > (e.g., Steering Tags in PCIe TLP Processing Hints).
> > >
> > > The API provides an interface to query the availability of stashing
> > > capabilities, i.e., platform/NIC support, stashable object types, etc,
> > > via the rte_eth_dev_stashing_capabilities_get interface.
> > >
> > > The function pair rte_eth_dev_stashing_rx_config_set and
> > > rte_eth_dev_stashing_tx_config_set sets the stashing hint (the CPU,
> > > cache level, and data object types) on the Rx and Tx queues.
> > >
> > > PMDs that support stashing must register their implementations with the
> > > following eth_dev_ops callbacks, which are invoked by the ethdev
> > > functions listed above.
> > >
> > > - stashing_capabilities_get
> > > - stashing_rx_hints_set
> > > - stashing_tx_hints_set
> > >
> > > Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> > >
> > > ---
> > >  lib/ethdev/ethdev_driver.h |  66 +++++++++++++++
> > >  lib/ethdev/rte_ethdev.c    | 120 +++++++++++++++++++++++++++
> > >  lib/ethdev/rte_ethdev.h    | 161 +++++++++++++++++++++++++++++++++++++
> > >  lib/ethdev/version.map     |   4 +
> > >  4 files changed, 351 insertions(+)
> > >
> > > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > > index 1fd4562b40..7caaea54a8 100644
> > > --- a/lib/ethdev/ethdev_driver.h
> > > +++ b/lib/ethdev/ethdev_driver.h
> > > @@ -1367,6 +1367,68 @@ enum rte_eth_dev_operation {
> > >  typedef uint64_t (*eth_get_restore_flags_t)(struct rte_eth_dev *dev,
> > >                                           enum rte_eth_dev_operation
> op);
> > >
> > > +/**
> > > + * @internal
> > > + * Set cache stashing hints in Rx queue.
> > > + *
> > > + * @param dev
> > > + *   Port (ethdev) handle.
> > > + * @param queue_id
> > > + *   Rx queue.
> > > + * @param config
> > > + *   Stashing hints configuration for the queue.
> > > + *
> > > + * @return
> > > + *   -ENOTSUP if the device or the platform does not support cache
> stashing.
> > > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing
> feature.
> > > + *   -EINVAL  on invalid arguments.
> > > + *   0 on success.
> > > + */
> > > +typedef int (*eth_stashing_rx_hints_set_t)(struct rte_eth_dev *dev,
> uint16_t queue_id,
> > > +                                        struct
> rte_eth_stashing_config *config);
> > > +
> > > +/**
> > > + * @internal
> > > + * Set cache stashing hints in Tx queue.
> > > + *
> > > + * @param dev
> > > + *   Port (ethdev) handle.
> > > + * @param queue_id
> > > + *   Tx queue.
> > > + * @param config
> > > + *   Stashing hints configuration for the queue.
> > > + *
> > > + * @return
> > > + *   -ENOTSUP if the device or the platform does not support cache
> stashing.
> > > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing
> feature.
> > > + *   -EINVAL  on invalid arguments.
> > > + *   0 on success.
> > > + */
> > > +typedef int (*eth_stashing_tx_hints_set_t)(struct rte_eth_dev *dev,
> uint16_t queue_id,
> > > +                                        struct
> rte_eth_stashing_config *config);
> > > +
> > > +/**
> > > + * @internal
> > > + * Get cache stashing object types supported in the ethernet device.
> > > + * The return value indicates availability of stashing hints support
> > > + * in the hardware and the PMD.
> > > + *
> > > + * @param dev
> > > + *   Port (ethdev) handle.
> > > + * @param objects
> > > + *   PMD sets supported bits on return.
> > > + *
> > > + * @return
> > > + *   -ENOTSUP if the device or the platform does not support cache
> stashing.
> > > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing
> feature.
> > > + *   -EINVAL  on NULL values for types or hints parameters.
> > > + *   On return, types and hints parameters will have bits set for
> supported
> > > + *   object types and hints.
> > > + *   0 on success.
> > > + */
> > > +typedef int (*eth_stashing_capabilities_get_t)(struct rte_eth_dev
> *dev,
> > > +                                          uint16_t *objects);
> > > +
> > >  /**
> > >   * @internal A structure containing the functions exported by an
> Ethernet driver.
> > >   */
> > > @@ -1393,6 +1455,10 @@ struct eth_dev_ops {
> > >       eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC
> address */
> > >       eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address
> */
> > >       eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address
> */
> > > +     eth_stashing_rx_hints_set_t   stashing_rx_hints_set; /**< Set Rx
> cache stashing*/
> > > +     eth_stashing_tx_hints_set_t   stashing_tx_hints_set; /**< Set Tx
> cache stashing*/
> > > +     /** Get supported stashing hints*/
> > > +     eth_stashing_capabilities_get_t stashing_capabilities_get;
> > >       /** Set list of multicast addresses */
> > >       eth_set_mc_addr_list_t     set_mc_addr_list;
> > >       mtu_set_t                  mtu_set;       /**< Set MTU */
> >
> > Since eth_dev_ops is visible in the application binary, it is part of the
> > ABI.
> > Therefore it cannot be changed until the 25.11 release.
>
> The layout of eth_dev_ops is not exposed to applications as it is in a
> private header.
> Could you clarify where you see a breakage for an application?
>
>
> I see an ABI breakage for out of tree drivers though.
> This could be avoided by moving those added ops at the end of the struct?
>
>
> --
> David Marchand
>
>

[-- Attachment #2: Type: text/html, Size: 8692 bytes --]

^ permalink raw reply	[relevance 0%]

* Re: [RFC v3 2/2] ethdev: introduce the cache stashing hints API
  2024-12-03 21:13  3%     ` Stephen Hemminger
@ 2024-12-05 15:40  3%       ` David Marchand
  2024-12-05 21:00  0%         ` Stephen Hemminger
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-12-05 15:40 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Wathsala Vithanage, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, dev, nd, Honnappa Nagarahalli, Dhruv Tripathi

On Tue, Dec 3, 2024 at 10:13 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Mon, 21 Oct 2024 01:52:46 +0000
> Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:
>
> > Extend the ethdev library to enable the stashing of different data
> > objects, such as the ones listed below, into CPU caches directly
> > from the NIC.
> >
> > - Rx/Tx queue descriptors
> > - Rx packets
> > - Packet headers
> > - packet payloads
> > - Data of a packet at an offset from the start of the packet
> >
> > The APIs are designed in a hardware/vendor agnostic manner such that
> > supporting PMDs could use any capabilities available in the underlying
> > hardware for fine-grained stashing of data objects into a CPU cache
> > (e.g., Steering Tags in PCIe TLP Processing Hints).
> >
> > The API provides an interface to query the availability of stashing
> > capabilities, i.e., platform/NIC support, stashable object types, etc,
> > via the rte_eth_dev_stashing_capabilities_get interface.
> >
> > The function pair rte_eth_dev_stashing_rx_config_set and
> > rte_eth_dev_stashing_tx_config_set sets the stashing hint (the CPU,
> > cache level, and data object types) on the Rx and Tx queues.
> >
> > PMDs that support stashing must register their implementations with the
> > following eth_dev_ops callbacks, which are invoked by the ethdev
> > functions listed above.
> >
> > - stashing_capabilities_get
> > - stashing_rx_hints_set
> > - stashing_tx_hints_set
> >
> > Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> >
> > ---
> >  lib/ethdev/ethdev_driver.h |  66 +++++++++++++++
> >  lib/ethdev/rte_ethdev.c    | 120 +++++++++++++++++++++++++++
> >  lib/ethdev/rte_ethdev.h    | 161 +++++++++++++++++++++++++++++++++++++
> >  lib/ethdev/version.map     |   4 +
> >  4 files changed, 351 insertions(+)
> >
> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index 1fd4562b40..7caaea54a8 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -1367,6 +1367,68 @@ enum rte_eth_dev_operation {
> >  typedef uint64_t (*eth_get_restore_flags_t)(struct rte_eth_dev *dev,
> >                                           enum rte_eth_dev_operation op);
> >
> > +/**
> > + * @internal
> > + * Set cache stashing hints in Rx queue.
> > + *
> > + * @param dev
> > + *   Port (ethdev) handle.
> > + * @param queue_id
> > + *   Rx queue.
> > + * @param config
> > + *   Stashing hints configuration for the queue.
> > + *
> > + * @return
> > + *   -ENOTSUP if the device or the platform does not support cache stashing.
> > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> > + *   -EINVAL  on invalid arguments.
> > + *   0 on success.
> > + */
> > +typedef int (*eth_stashing_rx_hints_set_t)(struct rte_eth_dev *dev, uint16_t queue_id,
> > +                                        struct rte_eth_stashing_config *config);
> > +
> > +/**
> > + * @internal
> > + * Set cache stashing hints in Tx queue.
> > + *
> > + * @param dev
> > + *   Port (ethdev) handle.
> > + * @param queue_id
> > + *   Tx queue.
> > + * @param config
> > + *   Stashing hints configuration for the queue.
> > + *
> > + * @return
> > + *   -ENOTSUP if the device or the platform does not support cache stashing.
> > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> > + *   -EINVAL  on invalid arguments.
> > + *   0 on success.
> > + */
> > +typedef int (*eth_stashing_tx_hints_set_t)(struct rte_eth_dev *dev, uint16_t queue_id,
> > +                                        struct rte_eth_stashing_config *config);
> > +
> > +/**
> > + * @internal
> > + * Get cache stashing object types supported in the ethernet device.
> > + * The return value indicates availability of stashing hints support
> > + * in the hardware and the PMD.
> > + *
> > + * @param dev
> > + *   Port (ethdev) handle.
> > + * @param objects
> > + *   PMD sets supported bits on return.
> > + *
> > + * @return
> > + *   -ENOTSUP if the device or the platform does not support cache stashing.
> > + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> > + *   -EINVAL  on NULL values for types or hints parameters.
> > + *   On return, types and hints parameters will have bits set for supported
> > + *   object types and hints.
> > + *   0 on success.
> > + */
> > +typedef int (*eth_stashing_capabilities_get_t)(struct rte_eth_dev *dev,
> > +                                          uint16_t *objects);
> > +
> >  /**
> >   * @internal A structure containing the functions exported by an Ethernet driver.
> >   */
> > @@ -1393,6 +1455,10 @@ struct eth_dev_ops {
> >       eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address */
> >       eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address */
> >       eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address */
> > +     eth_stashing_rx_hints_set_t   stashing_rx_hints_set; /**< Set Rx cache stashing*/
> > +     eth_stashing_tx_hints_set_t   stashing_tx_hints_set; /**< Set Tx cache stashing*/
> > +     /** Get supported stashing hints*/
> > +     eth_stashing_capabilities_get_t stashing_capabilities_get;
> >       /** Set list of multicast addresses */
> >       eth_set_mc_addr_list_t     set_mc_addr_list;
> >       mtu_set_t                  mtu_set;       /**< Set MTU */
>
> Since eth_dev_ops is visible in the application binary, it is part of the ABI.
> Therefore it cannot be changed until the 25.11 release.

The layout of eth_dev_ops is not exposed to applications as it is in a
private header.
Could you clarify where you see a breakage for an application?


I see an ABI breakage for out-of-tree drivers though.
This could be avoided by moving the added ops to the end of the struct?


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] version: 25.03-rc0
  2024-12-04 10:06  3% ` Thomas Monjalon
@ 2024-12-04 12:05  3%   ` David Marchand
  0 siblings, 0 replies; 200+ results
From: David Marchand @ 2024-12-04 12:05 UTC (permalink / raw)
  To: dpdklab, Patrick Robb
  Cc: dev, Aaron Conole, Michael Santana, ci, Thomas Monjalon

On Wed, Dec 4, 2024 at 11:06 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/12/2024 08:54, David Marchand:
> > Start a new release cycle with empty release notes.
> > Bump version and ABI minor.
> > Bump libabigail from 2.4 to 2.6 and enable ABI checks.
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
>
> Added a note about the new libabigail which will allow us
> to split a library (like EAL) without having warnings.
>
> Applied, so a new release cycle is started!
>
> Note to all branch maintainers: please rebase on this commit
> and enable ABI checks in your local configuration.
>
> Happy 25.03 :)

Time to re-enable ABI checks in CI too (please note that libabigail
version has been bumped).


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] version: 25.03-rc0
  2024-12-03  7:54 11% [PATCH] version: 25.03-rc0 David Marchand
@ 2024-12-04 10:06  3% ` Thomas Monjalon
  2024-12-04 12:05  3%   ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2024-12-04 10:06 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Aaron Conole, Michael Santana

03/12/2024 08:54, David Marchand:
> Start a new release cycle with empty release notes.
> Bump version and ABI minor.
> Bump libabigail from 2.4 to 2.6 and enable ABI checks.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>

Added a note about the new libabigail which will allow us
to split a library (like EAL) without having warnings.

Applied, so a new release cycle is started!

Note to all branch maintainers: please rebase on this commit
and enable ABI checks in your local configuration.

Happy 25.03 :)



^ permalink raw reply	[relevance 3%]

* Re: [RFC v3 2/2] ethdev: introduce the cache stashing hints API
  @ 2024-12-03 21:13  3%     ` Stephen Hemminger
  2024-12-05 15:40  3%       ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2024-12-03 21:13 UTC (permalink / raw)
  To: Wathsala Vithanage
  Cc: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, dev, nd,
	Honnappa Nagarahalli, Dhruv Tripathi

On Mon, 21 Oct 2024 01:52:46 +0000
Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:

> Extend the ethdev library to enable the stashing of different data
> objects, such as the ones listed below, into CPU caches directly
> from the NIC.
> 
> - Rx/Tx queue descriptors
> - Rx packets
> - Packet headers
> - packet payloads
> - Data of a packet at an offset from the start of the packet
> 
> The APIs are designed in a hardware/vendor agnostic manner such that
> supporting PMDs could use any capabilities available in the underlying
> hardware for fine-grained stashing of data objects into a CPU cache
> (e.g., Steering Tags in PCIe TLP Processing Hints).
> 
> The API provides an interface to query the availability of stashing
> capabilities, i.e., platform/NIC support, stashable object types, etc,
> via the rte_eth_dev_stashing_capabilities_get interface.
> 
> The function pair rte_eth_dev_stashing_rx_config_set and
> rte_eth_dev_stashing_tx_config_set sets the stashing hint (the CPU, 
> cache level, and data object types) on the Rx and Tx queues.
> 
> PMDs that support stashing must register their implementations with the
> following eth_dev_ops callbacks, which are invoked by the ethdev
> functions listed above.
> 
> - stashing_capabilities_get
> - stashing_rx_hints_set
> - stashing_tx_hints_set
> 
> Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> 
> ---
>  lib/ethdev/ethdev_driver.h |  66 +++++++++++++++
>  lib/ethdev/rte_ethdev.c    | 120 +++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev.h    | 161 +++++++++++++++++++++++++++++++++++++
>  lib/ethdev/version.map     |   4 +
>  4 files changed, 351 insertions(+)
> 
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 1fd4562b40..7caaea54a8 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1367,6 +1367,68 @@ enum rte_eth_dev_operation {
>  typedef uint64_t (*eth_get_restore_flags_t)(struct rte_eth_dev *dev,
>  					    enum rte_eth_dev_operation op);
>  
> +/**
> + * @internal
> + * Set cache stashing hints in Rx queue.
> + *
> + * @param dev
> + *   Port (ethdev) handle.
> + * @param queue_id
> + *   Rx queue.
> + * @param config
> + *   Stashing hints configuration for the queue.
> + *
> + * @return
> + *   -ENOTSUP if the device or the platform does not support cache stashing.
> + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> + *   -EINVAL  on invalid arguments.
> + *   0 on success.
> + */
> +typedef int (*eth_stashing_rx_hints_set_t)(struct rte_eth_dev *dev, uint16_t queue_id,
> +					   struct rte_eth_stashing_config *config);
> +
> +/**
> + * @internal
> + * Set cache stashing hints in Tx queue.
> + *
> + * @param dev
> + *   Port (ethdev) handle.
> + * @param queue_id
> + *   Tx queue.
> + * @param config
> + *   Stashing hints configuration for the queue.
> + *
> + * @return
> + *   -ENOTSUP if the device or the platform does not support cache stashing.
> + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> + *   -EINVAL  on invalid arguments.
> + *   0 on success.
> + */
> +typedef int (*eth_stashing_tx_hints_set_t)(struct rte_eth_dev *dev, uint16_t queue_id,
> +					   struct rte_eth_stashing_config *config);
> +
> +/**
> + * @internal
> + * Get cache stashing object types supported in the ethernet device.
> + * The return value indicates availability of stashing hints support
> + * in the hardware and the PMD.
> + *
> + * @param dev
> + *   Port (ethdev) handle.
> + * @param objects
> + *   PMD sets supported bits on return.
> + *
> + * @return
> + *   -ENOTSUP if the device or the platform does not support cache stashing.
> + *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
> + *   -EINVAL  on NULL values for types or hints parameters.
> + *   On return, types and hints parameters will have bits set for supported
> + *   object types and hints.
> + *   0 on success.
> + */
> +typedef int (*eth_stashing_capabilities_get_t)(struct rte_eth_dev *dev,
> +					     uint16_t *objects);
> +
>  /**
>   * @internal A structure containing the functions exported by an Ethernet driver.
>   */
> @@ -1393,6 +1455,10 @@ struct eth_dev_ops {
>  	eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address */
>  	eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address */
>  	eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address */
> +	eth_stashing_rx_hints_set_t   stashing_rx_hints_set; /**< Set Rx cache stashing*/
> +	eth_stashing_tx_hints_set_t   stashing_tx_hints_set; /**< Set Tx cache stashing*/
> +	/** Get supported stashing hints*/
> +	eth_stashing_capabilities_get_t stashing_capabilities_get;
>  	/** Set list of multicast addresses */
>  	eth_set_mc_addr_list_t     set_mc_addr_list;
>  	mtu_set_t                  mtu_set;       /**< Set MTU */

Since eth_dev_ops is visible in the application binary, it is part of the ABI.
Therefore it cannot be changed until the 25.11 release.


^ permalink raw reply	[relevance 3%]

* [PATCH] version: 25.03-rc0
@ 2024-12-03  7:54 11% David Marchand
  2024-12-04 10:06  3% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-12-03  7:54 UTC (permalink / raw)
  To: dev; +Cc: thomas, Aaron Conole, Michael Santana

Start a new release cycle with empty release notes.
Bump version and ABI minor.
Bump libabigail from 2.4 to 2.6 and enable ABI checks.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 .github/workflows/build.yml            |   8 +-
 ABI_VERSION                            |   2 +-
 VERSION                                |   2 +-
 doc/guides/rel_notes/index.rst         |   1 +
 doc/guides/rel_notes/release_25_03.rst | 138 +++++++++++++++++++++++++
 5 files changed, 145 insertions(+), 6 deletions(-)
 create mode 100644 doc/guides/rel_notes/release_25_03.rst

diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index d99700b6e9..dcafb4a8f5 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -12,7 +12,7 @@ defaults:
 env:
   REF_GIT_BRANCH: main
   REF_GIT_REPO: https://dpdk.org/git/dpdk
-  REF_GIT_TAG: none
+  REF_GIT_TAG: v24.11
 
 jobs:
   checkpatch:
@@ -46,7 +46,7 @@ jobs:
       BUILD_EXAMPLES: ${{ contains(matrix.config.checks, 'examples') }}
       CC: ccache ${{ matrix.config.compiler }}
       DEF_LIB: ${{ matrix.config.library }}
-      LIBABIGAIL_VERSION: libabigail-2.4
+      LIBABIGAIL_VERSION: libabigail-2.6
       MINGW: ${{ matrix.config.cross == 'mingw' }}
       MINI: ${{ matrix.config.mini != '' }}
       PPC64LE: ${{ matrix.config.cross == 'ppc64le' }}
@@ -69,7 +69,7 @@ jobs:
             checks: stdatomic
           - os: ubuntu-22.04
             compiler: gcc
-            checks: debug+doc+examples+tests
+            checks: abi+debug+doc+examples+tests
           - os: ubuntu-22.04
             compiler: clang
             checks: asan+doc+tests
@@ -133,7 +133,7 @@ jobs:
         python3-pyelftools python3-setuptools python3-wheel zlib1g-dev
     - name: Install libabigail build dependencies if no cache is available
       if: env.ABI_CHECKS == 'true' && steps.libabigail-cache.outputs.cache-hit != 'true'
-      run: sudo apt install -y autoconf automake libdw-dev libtool libxml2-dev
+      run: sudo apt install -y autoconf automake libdw-dev libtool libxml2-dev libxxhash-dev
     - name: Install i386 cross compiling packages
       if: env.BUILD_32BIT == 'true'
       run: sudo apt install -y gcc-multilib g++-multilib libnuma-dev:i386
diff --git a/ABI_VERSION b/ABI_VERSION
index be8e64f5a3..8b9bee5b58 100644
--- a/ABI_VERSION
+++ b/ABI_VERSION
@@ -1 +1 @@
-25.0
+25.1
diff --git a/VERSION b/VERSION
index 0a492611a0..04a8405dad 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-24.11.0
+25.03.0-rc0
diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index 74ddae3e81..fc0309113e 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -8,6 +8,7 @@ Release Notes
     :maxdepth: 1
     :numbered:
 
+    release_25_03
     release_24_11
     release_24_07
     release_24_03
diff --git a/doc/guides/rel_notes/release_25_03.rst b/doc/guides/rel_notes/release_25_03.rst
new file mode 100644
index 0000000000..426dfcd982
--- /dev/null
+++ b/doc/guides/rel_notes/release_25_03.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright 2024 The DPDK contributors
+
+.. include:: <isonum.txt>
+
+DPDK Release 25.03
+==================
+
+.. **Read this first.**
+
+   The text in the sections below explains how to update the release notes.
+
+   Use proper spelling, capitalization and punctuation in all sections.
+
+   Variable and config names should be quoted as fixed width text:
+   ``LIKE_THIS``.
+
+   Build the docs and view the output file to ensure the changes are correct::
+
+      ninja -C build doc
+      xdg-open build/doc/guides/html/rel_notes/release_25_03.html
+
+
+New Features
+------------
+
+.. This section should contain new features added in this release.
+   Sample format:
+
+   * **Add a title in the past tense with a full stop.**
+
+     Add a short 1-2 sentence description in the past tense.
+     The description should be enough to allow someone scanning
+     the release notes to understand the new feature.
+
+     If the feature adds a lot of sub-features you can use a bullet list
+     like this:
+
+     * Added feature foo to do something.
+     * Enhanced feature bar to do something else.
+
+     Refer to the previous release notes for examples.
+
+     Suggested order in release notes items:
+     * Core libs (EAL, mempool, ring, mbuf, buses)
+     * Device abstraction libs and PMDs (ordered alphabetically by vendor name)
+       - ethdev (lib, PMDs)
+       - cryptodev (lib, PMDs)
+       - eventdev (lib, PMDs)
+       - etc
+     * Other libs
+     * Apps, Examples, Tools (if significant)
+
+     This section is a comment. Do not overwrite or remove it.
+     Also, make sure to start the actual text at the margin.
+     =======================================================
+
+
+Removed Items
+-------------
+
+.. This section should contain removed items in this release. Sample format:
+
+   * Add a short 1-2 sentence description of the removed item
+     in the past tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =======================================================
+
+
+API Changes
+-----------
+
+.. This section should contain API changes. Sample format:
+
+   * sample: Add a short 1-2 sentence description of the API change
+     which was announced in the previous releases and made in this release.
+     Start with a scope label like "ethdev:".
+     Use fixed width quotes for ``function_names`` or ``struct_names``.
+     Use the past tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =======================================================
+
+
+ABI Changes
+-----------
+
+.. This section should contain ABI changes. Sample format:
+
+   * sample: Add a short 1-2 sentence description of the ABI change
+     which was announced in the previous releases and made in this release.
+     Start with a scope label like "ethdev:".
+     Use fixed width quotes for ``function_names`` or ``struct_names``.
+     Use the past tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =======================================================
+
+* No ABI change that would break compatibility with 24.11.
+
+
+Known Issues
+------------
+
+.. This section should contain new known issues in this release. Sample format:
+
+   * **Add title in present tense with full stop.**
+
+     Add a short 1-2 sentence description of the known issue
+     in the present tense. Add information on any known workarounds.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =======================================================
+
+
+Tested Platforms
+----------------
+
+.. This section should contain a list of platforms that were tested
+   with this release.
+
+   The format is:
+
+   * <vendor> platform with <vendor> <type of devices> combinations
+
+     * List of CPU
+     * List of OS
+     * List of devices
+     * Other relevant details...
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =======================================================
-- 
2.47.0


^ permalink raw reply	[relevance 11%]

* Re: [PATCH v2 1/3] net: add thread-safe crc api
    @ 2024-12-02 22:36  3%   ` Stephen Hemminger
  2025-02-06 20:43  0%     ` Kusztal, ArkadiuszX
  2025-02-06 20:38  4%   ` [PATCH v3] " Arkadiusz Kusztal
  2 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2024-12-02 22:36 UTC (permalink / raw)
  To: Arkadiusz Kusztal; +Cc: dev, ferruh.yigit, kai.ji, brian.dooley

On Tue,  1 Oct 2024 19:11:48 +0100
Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com> wrote:

> The current net CRC API is not thread-safe; this patch
> solves this by adding new, thread-safe API functions.

Couldn't the old API be made thread-safe with TLS?

> This API is also safe to use across multiple processes,
> yet with limitations on max-simd-bitwidth, which will be checked only by
> the process that created the CRC context; all other processes will use
> the same CRC function when used with the same CRC context.
> It is an undefined behavior when process binaries are compiled
> with different SIMD capabilities when the same CRC context is used.
> 
> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>

The API/ABI can't change for 25.03; do you want to support both?
Or wait until 25.11?

^ permalink raw reply	[relevance 3%]

* DPDK 24.11 released
@ 2024-11-30 23:50  4% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2024-11-30 23:50 UTC (permalink / raw)
  To: announce

A new major release is available:
	https://fast.dpdk.org/rel/dpdk-24.11.tar.xz

It was a busy release cycle:
	1329 commits from 196 authors
	2557 files changed, 376587 insertions(+), 177108 deletions(-)

And it includes some API/ABI compatibility breakages.
This release won't be ABI-compatible with previous ones.
The new major ABI version is 25.
The next releases 25.03 and 25.07 will be ABI-compatible with 24.11.

The branch 24.11 should be supported for three years,
making it recommended for system integration and deployment.

Highlights of 24.11:
	- lcore variables allocation
	- bit set and atomic bit manipulation
	- AMD uncore power management
	- per-CPU power management QoS for resume latency
	- IPv6 address API
	- RSS hash key generation
	- Ethernet link lanes
	- flow table index action
	- Cisco enic VF
	- Marvell CN20K
	- Napatech ntnic flow engine
	- Realtek r8169 driver
	- ZTE gdtc / zxdh driver initialization
	- symmetric crypto SM4
	- asymmetric crypto EdDSA
	- event device pre-scheduling
	- event device independent enqueue
	- logging rework (timestamp, color, syslog, journal)

More details in the release notes:
	https://doc.dpdk.org/guides/rel_notes/release_24_11.html


There are 50 new contributors (including authors, reviewers and testers).
Welcome to Adel Belkhiri, Ahmed Zaki, Andre Muezerie, Andrzej Wilczynski,
Bartosz Jakub Rosadzinski, Bill Xiang, Chenxingyu Wang,
Danylo Vodopianov,  Dhruv Tripathi, Doug Foster, Gur Stavi, Hanxiao Li,
Howard Wang, Huaxing Zhu, Julien Hascoet, Jun Zhang, Junlong Wang,
Kiran Kumar Kokkilagadda, Luka Jankovic, Lukas Sismis, Lukasz Cieplicki,
Malcolm Bumgardner, Mateusz Polchlopek, Michal Jaron, Michal Nowak,
Midde Ajijur Rehaman, Mihai Brodschi, Niall Meade, Norbert Zulinski,
Ofer Dagan, Oleg Akhrem, Oleksandr Nahnybida, Peter Morrow,
Praveen Kaligineedi, Przemyslaw Gierszynski, Rogelio Domínguez Hernández,
Sangtani Parag Satishbhai, Slawomir Laba, Stefan Laesser,
Sudheer Mogilappagari, Thomas Wilks, Tim Martin, Tomáš Ďurovec,
Varun Lakkur Ambaji Rao, Vasuthevan Maheswaran, Vinod Krishna,
Wojciech Panfil, Xinying Yu, Yogesh Bhosale, and Yong Zhang.

Below is the number of commits per employer (with authors count):
	221     Intel (60)
	158     Marvell (22)
	137     Napatech (3)
	107     stephen@networkplumber.org (1)
	 96     NVIDIA (15)
	 93     NXP (10)
	 85     Red Hat (3)
	 70     Corigine (9)
	 70     Broadcom (15)
	 54     Huawei (6)
	 34     Arm (5)
	 30     Ericsson (1)
	        ...

A big thank you to all the courageous people who took on the unrewarding task
of reviewing others' work.
Based on Reviewed-by and Acked-by tags, the top non-PMD reviewers are:
	 73     Morten Brørup <mb@smartsharesystems.com>
	 66     Stephen Hemminger <stephen@networkplumber.org>
	 57     Chengwen Feng <fengchengwen@huawei.com>
	 41     Bruce Richardson <bruce.richardson@intel.com>
	 31     Luca Vizzarro <luca.vizzarro@arm.com>
	 26     Ferruh Yigit <ferruh.yigit@amd.com>
	 26     David Marchand <david.marchand@redhat.com>


The next version will be 25.03 in March.
The new features for 25.03 can be submitted during December:
	http://core.dpdk.org/roadmap#dates
Please share your roadmap.


Thanks everyone



^ permalink raw reply	[relevance 4%]

* [PATCH v1] doc: update release notes for 24.11
@ 2024-11-28 17:07  4% John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2024-11-28 17:07 UTC (permalink / raw)
  To: dev; +Cc: thomas, John McNamara

Fix grammar, spelling and formatting of DPDK 24.11 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_24_11.rst | 158 +++++++++++++++----------
 1 file changed, 93 insertions(+), 65 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 48b399cda7..b7e0f1224b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -57,14 +57,14 @@ New Features
 
 * **Added new bit manipulation API.**
 
-  The support for bit-level operations on single 32- and 64-bit words in
-  <rte_bitops.h> has been extended with semantically well-defined functions.
+  Extended support for bit-level operations on single 32 and 64-bit words in
+  ``<rte_bitops.h>`` with semantically well-defined functions.
 
   * ``rte_bit_[test|set|clear|assign|flip]`` functions provide excellent
     performance (by avoiding restricting the compiler and CPU), but give
-    no guarantees in regards to memory ordering or atomicity.
+    no guarantees in relation to memory ordering or atomicity.
 
-  * ``rte_bit_atomic_*`` provide atomic bit-level operations, including
+  * ``rte_bit_atomic_*`` provides atomic bit-level operations including
     the possibility to specify memory ordering constraints.
 
   The new public API elements are polymorphic, using the _Generic-based
@@ -72,15 +72,17 @@ New Features
 
 * **Added multi-word bitset API.**
 
-  A new multi-word bitset API has been introduced in the EAL.
+  Introduced a new multi-word bitset API to the EAL.
+
   The RTE bitset is optimized for scenarios where the bitset size exceeds the
   capacity of a single word (e.g., larger than 64 bits), but is not large
   enough to justify the overhead and complexity of the more scalable,
-  yet slower, <rte_bitmap.h> API.
+  yet slower, ``<rte_bitmap.h>`` API.
+
   This addition provides an efficient and straightforward alternative
-  for handling bitsets of intermediate sizes.
+  for handling bitsets of intermediate size.
 
-* **Added per-lcore static memory allocation facility.**
+* **Added a per-lcore static memory allocation facility.**
 
   Added EAL API ``<rte_lcore_var.h>`` for statically allocating small,
   frequently-accessed data structures, for which one instance should exist
@@ -89,10 +91,10 @@ New Features
   With lcore variables, data is organized spatially on a per-lcore id basis,
   rather than per library or PMD, avoiding the need for cache aligning
   (or RTE_CACHE_GUARDing) data structures, which in turn
-  reduces CPU cache internal fragmentation, improving performance.
+  reduces CPU cache internal fragmentation and improves performance.
 
   Lcore variables are similar to thread-local storage (TLS, e.g. C11 ``_Thread_local``),
-  but decoupling the values' life time from that of the threads.
+  but decouples the values' life times from those of the threads.
 
 * **Extended service cores statistics.**
 
@@ -101,7 +103,7 @@ New Features
   * ``RTE_SERVICE_ATTR_IDLE_CALL_COUNT`` tracks the number of service function
     invocations where no actual work was performed.
 
-  * ``RTE_SERVICE_ATTR_ERROR_CALL_COUNT`` tracks the number invocations
+  * ``RTE_SERVICE_ATTR_ERROR_CALL_COUNT`` tracks the number of invocations
     resulting in an error.
 
   The new statistics are useful for debugging and profiling.
@@ -110,17 +112,17 @@ New Features
 
   Added function attributes to ``rte_malloc`` and similar functions
   that can catch some obvious bugs at compile time (with GCC 11.0 or later).
-  Examples: calling ``free`` on pointer that was allocated with ``rte_malloc``
-  (and vice versa); freeing the same pointer twice in the same routine;
-  freeing an object that was not created by allocation; etc.
+  For example, calling ``free`` on a pointer that was allocated with ``rte_malloc``
+  (and vice versa); freeing the same pointer twice in the same routine or
+  freeing an object that was not created by allocation.
 
-* **Updated logging library**
+* **Updated logging library.**
 
   * The log subsystem is initialized earlier in startup so all messages go through the library.
 
   * If the application is a systemd service and the log output is being sent to standard error
     then DPDK will switch to journal native protocol.
-    This allows the more data such as severity to be sent.
+    This allows more data such as severity to be sent.
 
   * The syslog option has changed.
     By default, messages are no longer sent to syslog unless the ``--syslog`` option is specified.
@@ -136,7 +138,7 @@ New Features
 
 * **Added more ICMP message types and codes.**
 
-  New ICMP message types and codes from RFC 792 were added in ``rte_icmp.h``.
+  Added new ICMP message types and codes from RFC 792 in ``rte_icmp.h``.
 
 * **Added IPv6 address structure and related utilities.**
 
@@ -154,7 +156,7 @@ New Features
 
 * **Extended flow table index features.**
 
-  * Extended the flow table insertion type enum with
+  * Extended the flow table insertion type enum with the
     ``RTE_FLOW_TABLE_INSERTION_TYPE_INDEX_WITH_PATTERN`` type.
   * Added a function for inserting a flow rule by index with pattern:
     ``rte_flow_async_create_by_index_with_pattern()``.
@@ -171,8 +173,8 @@ New Features
 
   * Modified the PMD API that controls the LLQ header policy.
   * Replaced ``enable_llq``, ``normal_llq_hdr`` and ``large_llq_hdr`` devargs
-    with a new shared devarg ``llq_policy`` that keeps the same logic.
-  * Added validation check for Rx packet descriptor consistency.
+    with a new shared devarg ``llq_policy`` that maintains the same logic.
+  * Added a validation check for Rx packet descriptor consistency.
 
 * **Updated Cisco enic driver.**
 
@@ -187,17 +189,19 @@ New Features
 
   * Updated supported version of the FPGA to 9563.55.49.
   * Extended and fixed logging.
-  * Added NT flow filter initialization.
-  * Added NT flow backend initialization.
-  * Added initialization of FPGA modules related to flow HW offload.
-  * Added basic handling of the virtual queues.
-  * Added flow handling support.
-  * Added statistics support.
-  * Added age flow action support.
-  * Added meter flow metering and flow policy support.
-  * Added flow actions update support.
-  * Added asynchronous flow support.
-  * Added MTU update support.
+  * Added:
+
+    - NT flow filter initialization.
+    - NT flow backend initialization.
+    - Initialization of FPGA modules related to flow HW offload.
+    - Basic handling of the virtual queues.
+    - Flow handling support.
+    - Statistics support.
+    - Age flow action support.
+    - Meter flow metering and flow policy support.
+    - Flow actions update support.
+    - Asynchronous flow support.
+    - MTU update support.
 
 * **Updated NVIDIA mlx5 net driver.**
 
@@ -211,9 +215,10 @@ New Features
 
 * **Added ZTE zxdh net driver [EXPERIMENTAL].**
 
-  Added ethdev driver support for zxdh NX Series Ethernet Controller.
+  Added ethdev driver support for the zxdh NX Series Ethernet Controller.
+  This has:
 
-  * Ability to initialize the NIC.
+  * The ability to initialize the NIC.
   * No datapath support.
 
 * **Added cryptodev queue pair reset support.**
@@ -232,9 +237,9 @@ New Features
 
 * **Updated IPsec_MB crypto driver.**
 
-  * Added support for SM3 algorithm.
-  * Added support for SM3 HMAC algorithm.
-  * Added support for SM4 CBC, SM4 ECB and SM4 CTR algorithms.
+  * Added support for the SM3 algorithm.
+  * Added support for the SM3 HMAC algorithm.
+  * Added support for the SM4 CBC, SM4 ECB and SM4 CTR algorithms.
   * Bumped the minimum version requirement of Intel IPsec Multi-buffer library to v1.4.
     Affected PMDs: KASUMI, SNOW3G, ZUC, AESNI GCM, AESNI MB and CHACHAPOLY.
 
@@ -264,7 +269,7 @@ New Features
 * **Added Marvell cnxk RVU LF rawdev driver.**
 
   Added a new raw device driver for Marvell cnxk based devices
-  to allow out-of-tree driver to manage RVU LF device.
+  to allow an out-of-tree driver to manage an RVU LF device.
   It enables operations such as sending/receiving mailbox,
   register and notify the interrupts, etc.
 
@@ -286,7 +291,7 @@ New Features
 
   Added support for independent enqueue feature.
   With this feature eventdev supports enqueue in any order
-  or specifically in a different order than dequeue.
+  or specifically in a different order to dequeue.
   The feature is intended for eventdevs supporting burst mode.
   Applications should use ``RTE_EVENT_PORT_CFG_INDEPENDENT_ENQ`` to enable
   the feature if the capability ``RTE_EVENT_DEV_CAP_INDEPENDENT_ENQ`` exists.
@@ -305,8 +310,8 @@ New Features
 
 * **Added IPv4 network order lookup in the FIB library.**
 
-  A new flag field is introduced in ``rte_fib_conf`` structure.
-  This field is used to pass an extra configuration settings such as ability
+  A new flag field is introduced in the ``rte_fib_conf`` structure.
+  This field is used to pass an extra configuration settings such as the ability
   to lookup IPv4 addresses in network byte order.
 
 * **Added RSS hash key generating API.**
@@ -317,7 +322,7 @@ New Features
 * **Added per-CPU power management QoS interface.**
 
   Added per-CPU PM QoS interface to lower the resume latency
-  when wake up from idle state.
+  when waking up from idle state.
 
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
@@ -326,7 +331,7 @@ New Features
 
 * **Added node specific statistics.**
 
-  Added ability for node to advertise and update multiple xstat counters,
+  Added ability for a node to advertise and update multiple xstat counters,
   that can be retrieved using ``rte_graph_cluster_stats_get``.
 
 
@@ -342,7 +347,7 @@ Removed Items
    Also, make sure to start the actual text at the margin.
    =======================================================
 
-* ethdev: Removed the __rte_ethdev_trace_rx_burst symbol, as the corresponding
+* ethdev: Removed the ``__rte_ethdev_trace_rx_burst`` symbol, as the corresponding
   tracepoint was split into two separate ones for empty and non-empty calls.
 
 
@@ -363,8 +368,8 @@ API Changes
 
 * kvargs: reworked the process API.
 
-  * The already existing ``rte_kvargs_process`` now only handles key=value cases and
-    rejects if only a key is present in the parsed string.
+  * The already existing ``rte_kvargs_process`` now only handles ``key=value`` cases and
+    rejects input where only a key is present in the parsed string.
   * ``rte_kvargs_process_opt`` has been added to behave as ``rte_kvargs_process`` in previous
     releases: it handles key=value and only-key cases.
   * Both ``rte_kvargs_process`` and ``rte_kvargs_process_opt`` reject a NULL ``kvlist`` parameter.
@@ -381,24 +386,35 @@ API Changes
 * net: A new IPv6 address structure was introduced to replace ad-hoc ``uint8_t[16]`` arrays.
   The following libraries and symbols were modified:
 
-  cmdline
+  - cmdline:
+
     - ``cmdline_ipaddr_t``
-  ethdev
+
+  - ethdev:
+
     - ``struct rte_flow_action_set_ipv6``
     - ``struct rte_flow_item_icmp6_nd_na``
     - ``struct rte_flow_item_icmp6_nd_ns``
     - ``struct rte_flow_tunnel``
-  fib
+
+  - fib:
+
     - ``rte_fib6_add()``
     - ``rte_fib6_delete()``
     - ``rte_fib6_lookup_bulk()``
     - ``RTE_FIB6_IPV6_ADDR_SIZE`` (deprecated, replaced with ``RTE_IPV6_ADDR_SIZE``)
     - ``RTE_FIB6_MAXDEPTH`` (deprecated, replaced with ``RTE_IPV6_MAX_DEPTH``)
-  hash
+
+  - hash:
+
     - ``struct rte_ipv6_tuple``
-  ipsec
+
+  - ipsec:
+
     - ``struct rte_ipsec_sadv6_key``
-  lpm
+
+  - lpm:
+
     - ``rte_lpm6_add()``
     - ``rte_lpm6_delete()``
     - ``rte_lpm6_delete_bulk_func()``
@@ -407,20 +423,32 @@ API Changes
     - ``rte_lpm6_lookup_bulk_func()``
     - ``RTE_LPM6_IPV6_ADDR_SIZE`` (deprecated, replaced with ``RTE_IPV6_ADDR_SIZE``)
     - ``RTE_LPM6_MAX_DEPTH`` (deprecated, replaced with ``RTE_IPV6_MAX_DEPTH``)
-  net
+
+  - net:
+
     - ``struct rte_ipv6_hdr``
-  node
+
+  - node:
+
     - ``rte_node_ip6_route_add()``
-  pipeline
+
+  - pipeline:
+
     - ``struct rte_swx_ipsec_sa_encap_params``
     - ``struct rte_table_action_ipv6_header``
     - ``struct rte_table_action_nat_params``
-  security
+
+  - security:
+
     - ``struct rte_security_ipsec_tunnel_param``
-  table
+
+  - table:
+
     - ``struct rte_table_lpm_ipv6_key``
     - ``RTE_LPM_IPV6_ADDR_SIZE`` (deprecated, replaced with ``RTE_IPV6_ADDR_SIZE``)
-  rib
+
+  - rib:
+
     - ``rte_rib6_get_ip()``
     - ``rte_rib6_get_nxt()``
     - ``rte_rib6_insert()``
@@ -452,7 +480,7 @@ ABI Changes
    =======================================================
 
 * eal: The maximum number of file descriptors that can be passed to a secondary process
-  has been increased from 8 to 253 (which is the maximum possible with Unix domain socket).
+  has been increased from 8 to 253 (which is the maximum possible with Unix domain sockets).
   This allows for more queues when using software devices such as TAP and XDP.
 
 * ethdev: Added ``filter`` and ``names`` fields to ``rte_dev_reg_info`` structure
@@ -468,25 +496,25 @@ ABI Changes
 * cryptodev: The enum ``rte_crypto_asym_xform_type`` and struct ``rte_crypto_asym_op``
   are updated to include new values to support EdDSA.
 
-* cryptodev: The ``rte_crypto_rsa_xform`` struct member to hold private key
-  in either exponent or quintuple format is changed from union to struct data type.
+* cryptodev: The ``rte_crypto_rsa_xform`` struct member to hold private key data
+  in either exponent or quintuple format is changed from a union to a struct data type.
   This change is to support ASN.1 syntax (RFC 3447 Appendix A.1.2).
 
 * cryptodev: The padding struct ``rte_crypto_rsa_padding`` is moved
   from ``rte_crypto_rsa_op_param`` to ``rte_crypto_rsa_xform``
   as the padding information is part of session creation
-  instead of per packet crypto operation.
+  instead of the per packet crypto operation.
   This change is required to support virtio-crypto specifications.
 
 * bbdev: The structure ``rte_bbdev_stats`` was updated to add a new parameter
-  to optionally report the number of enqueue batch available ``enqueue_depth_avail``.
+  to optionally report the number of enqueue batches available ``enqueue_depth_avail``.
 
-* dmadev: Added ``nb_priorities`` field to ``rte_dma_info`` structure
-  and ``priority`` field to ``rte_dma_conf`` structure
+* dmadev: Added ``nb_priorities`` field to the ``rte_dma_info`` structure
+  and ``priority`` field to the ``rte_dma_conf`` structure
   to get device supported priority levels
   and configure required priority from the application.
 
-* eventdev: Added ``preschedule_type`` field to ``rte_event_dev_config`` structure.
+* eventdev: Added the ``preschedule_type`` field to the ``rte_event_dev_config`` structure.
 
 * eventdev: Removed the single-event enqueue and dequeue function pointers
   from ``rte_event_fp_fps``.
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* Re: [PATCH] doc: correct definition of Stats per queue feature
  @ 2024-11-26 23:39  0%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2024-11-26 23:39 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Shreyansh Jain, John McNamara, Andrew Rybchenko, Ferruh Yigit

11/10/2024 21:25, Ferruh Yigit:
> On 10/11/2024 2:38 AM, Stephen Hemminger wrote:
> > Change the documentation to match current usage of this feature
> > in the NIC table. Moved this sub heading to be after basic
> > stats because the queue stats reported now are in the same structure.
> > 
> > Although the "Stats per Queue" feature was originally intended
> > to be related to stats mapping, the overwhelming majority of drivers
> > report this feature with a different meaning.
> > 
> > Hopefully in later release the per-queue stats limitations
> > can be fixed, but this requires and API, ABI, and lots of driver
> > changes.
> > 
> > Fixes: dad1ec72a377 ("doc: document NIC features")
> > Cc: ferruh.yigit@intel.com
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>

Applied with spacing fixed, thanks.




^ permalink raw reply	[relevance 0%]

* [PATCH v1 0/4] Adjust wording for NUMA vs. socket ID in DPDK
@ 2024-11-26 13:14  3% Anatoly Burakov
  0 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2024-11-26 13:14 UTC (permalink / raw)
  To: dev

While DPDK initially used the term "socket ID" to refer to physical package
ID, the last time DPDK read "physical_package_id" for socket ID was ~9 years
ago, so it's been a while since we've actually switched over to using the term
"socket" to mean "NUMA node".

This wasn't a problem before, as most systems had one NUMA node per physical
socket. However, in the last few years, more and more systems have multiple NUMA
nodes per physical CPU socket. Since DPDK used NUMA nodes already, the
transition was pretty seamless, however now we're faced with a situation when
most of our documentation still uses outdated terms, and our API is ripe with
references to "sockets" when in actuality we mean "NUMA nodes". This could be a
source of confusion.

While completely renaming all of our API's would be a huge effort, will take a
long time and arguably wouldn't even be worth the API breakages (given that this
mismatch between terminology and reality is implicitly understood by most people
working on DPDK, and so this isn't so much of a problem in practice), we can do
some tweaks around the edges and at least document this unfortunate reality.

This patchset suggests the following changes:

- Update rte_socket/rte_lcore documentation to refer to NUMA nodes rather than
sockets
- Rename internal structures' fields to better reflect this intention
- Rename --socket-mem/--socket-limit flags to refer to NUMA rather than sockets

The documentation is updated to refer to new EAL flags, but is otherwise left
untouched, and instead the entry in "glossary" is amended to indicate that when
DPDK documentation refers to "sockets", it actually means "NUMA ID's". As next
steps, we could rename all API parameters to refer to NUMA ID rather than socket
ID - this would not break neither API nor ABI, and instead would be a
documentation change in practice.

RFCv1 -> v1:
- Dropped patch 5
- Updated error messages in patch 4 to refer to old flags as well

Anatoly Burakov (4):
  eal: update socket ID API documentation
  lcore: rename socket ID to NUMA ID
  eal: rename socket ID to NUMA ID in internal config
  eal: rename --socket-mem/--socket-limit

 doc/guides/faq/faq.rst                        |  4 +--
 doc/guides/howto/lm_bond_virtio_sriov.rst     |  2 +-
 doc/guides/howto/lm_virtio_vhost_user.rst     |  2 +-
 doc/guides/howto/pvp_reference_benchmark.rst  |  4 +--
 .../virtio_user_for_container_networking.rst  |  2 +-
 doc/guides/linux_gsg/build_sample_apps.rst    | 20 +++++------
 doc/guides/linux_gsg/linux_eal_parameters.rst | 16 ++++-----
 doc/guides/nics/mlx4.rst                      |  2 +-
 doc/guides/nics/mlx5.rst                      |  2 +-
 .../prog_guide/env_abstraction_layer.rst      | 12 +++----
 doc/guides/prog_guide/glossary.rst            |  5 ++-
 doc/guides/prog_guide/multi_proc_support.rst  |  2 +-
 doc/guides/sample_app_ug/bbdev_app.rst        |  6 ++--
 doc/guides/sample_app_ug/ipsec_secgw.rst      |  6 ++--
 doc/guides/sample_app_ug/vdpa.rst             |  2 +-
 doc/guides/sample_app_ug/vhost.rst            |  4 +--
 lib/eal/common/eal_common_dynmem.c            | 14 ++++----
 lib/eal/common/eal_common_lcore.c             | 10 +++---
 lib/eal/common/eal_common_options.c           | 33 ++++++++++---------
 lib/eal/common/eal_common_thread.c            | 12 +++----
 lib/eal/common/eal_internal_cfg.h             | 10 +++---
 lib/eal/common/eal_options.h                  |  8 +++--
 lib/eal/common/eal_private.h                  |  2 +-
 lib/eal/common/malloc_heap.c                  |  2 +-
 lib/eal/freebsd/eal.c                         |  2 +-
 lib/eal/include/rte_lcore.h                   | 25 +++++++-------
 lib/eal/linux/eal.c                           | 28 +++++++++-------
 lib/eal/linux/eal_memory.c                    | 22 ++++++-------
 lib/eal/windows/eal.c                         |  2 +-
 29 files changed, 137 insertions(+), 124 deletions(-)

-- 
2.43.5


^ permalink raw reply	[relevance 3%]

* Tech Board Meeting Minutes - 2024-Nov-13
@ 2024-11-20 22:24  3% Honnappa Nagarahalli
  0 siblings, 0 replies; 200+ results
From: Honnappa Nagarahalli @ 2024-11-20 22:24 UTC (permalink / raw)
  To: techboard, dev; +Cc: nd


Members Attending
-------------------------
Aaron Conole
Bruce Richardson
Hemant Agrawal
Jerin Jacob
Kevin Traynor
Konstantin Ananyev
Maxime Coquelin
Morten Brørup
Stephen Hemminger
Thomas Monjalon

NOTE: The technical board meetings are on every second Wednesday at 3pm
UTC.  Meetings are public, and DPDK community members are welcome to
attend.  Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

Next meeting will be on Wednesday 2024-Nov-27 @ 3pm UTC, and will be chaired by Hemant Agrawal

Agenda Items
============
1) A DPDK summit in Prague was approved. The tentative dates are May 7th to 8th or 21st to 22nd, 2025.
 
2) Did we write down all ideas from the Montreal brainstorm (public + techboard)?
Honnappa has summarized and sent this to Techboard.
Action Item: Honnappa to add the list to the slides in Google Docs (https://docs.google.com/presentation/d/1TDmz1_xvWFWxrMtXgKA03e_4yPUUTWdJ7aWV9_BBUhE/edit?usp=sharing)
 
3) Did we summarize ideas coming from email and Slack?
Stephen has summarized these.
Action Item: Stephen to add these to the slides mentioned above. Create an Excel sheet and send it to the Techboard for voting.
 
4) Next Steps
   a) Need sizing for these challenges and Mentors
   b) Govboard has approved $10,000 for coding challenge prize. The coding challenge could happen in conjunction with that.
   c) Stephen, Ben and possibly Nathan would lead this effort. Once the mentors are decided, mentors will manage the individual coding challenges. The prize money will be decided once the priority is decided. Tentatively: publish 10 challenges and select the 3 best.
   d) Can we send a teaser in advance?
 
5) PVS Studio has posted a blog on issues found in static analysis, worth a look - https://pvs-studio.com/en/blog/posts/cpp/1183/
 
6) Status of 24.11 release – RC2 released. Lots of features introduced. so_ring will be merged after this release. The lcore variable feature from Mathias is an interesting feature. One concern is the amount of memory it uses. The memory is not coming from hugepages; it allocates 128KB per lcore. It is merged but not yet marked as Experimental; marking it Experimental will allow us to make changes even if they break ABI.
Action Item: Patrick to check if it is possible to add some test cases in the CI pipeline to warn about changes to memory usage caused by a patch.
Action Item: Thomas to mark this feature as Experimental.
 
7) The documented process for merging is to have at least 2 reviews. However, the reality seems different: things get merged before the release. Thomas and David review the patches during the release process if there were few reviews. We could have a list of patches that need review at some location (dpdk.org?).
 

^ permalink raw reply	[relevance 3%]

* Re: rte_fib network order bug
  2024-11-15 16:20  0%                   ` Stephen Hemminger
@ 2024-11-17 15:04  3%                     ` Vladimir Medvedkin
  0 siblings, 0 replies; 200+ results
From: Vladimir Medvedkin @ 2024-11-17 15:04 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Robin Jarry, Morten Brørup, Medvedkin, Vladimir, dev

[-- Attachment #1: Type: text/plain, Size: 4983 bytes --]

Hi all,

[Robin] > I had not understood that it was *only* the lookups that were
network order
[Morten] >When I saw the byte order flag the first time, it was not clear
to me either that it only affected lookups - I too thought it covered the
entire API of the library. This needs to be emphasized in the description
of the flag. And the flag's name should contain LOOKUP
[Morten] > And/or rename RTE_FIB_F_NETWORK_ORDER to
RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar.

There is a clear comment on this flag stating that it affects lookup.
Repeating the statement with an exclamation mark seems too much. Moreover,
at first this flag was named "RTE_FIB_FLAG_LOOKUP_BE" and it was suggested
for renaming here:
https://inbox.dpdk.org/dev/D4SWPKOPRD5Z.87YIET3Y4AW@redhat.com/

[Morten] >Control plane API should use CPU byte order ... adding it
(support for network byte order) to the RIB library would be nice too.
I'm not sure if I understood you correctly here, RIB is a control plane
library.

[Robin] > an IPv4 address is *not* an integer. It should be treated as an
opaque value.
I don't agree here. IPv4 is 32 bits of information. CPUs usually can treat
32 bits of information as an integer, which is really useful.

[Morten] > Treating IPv4 addresses as byte arrays would allow simple
memcmp() for range comparison
How is it possible for a general case? For example, I need to test IP
addresses against range 1.1.1.7 - 10.20.30.37.

[Robin] >Also for consistency with IPv6, I really think that *all*
addresses should be dealt in their network form.
There is no such problem as a byte order mismatch for IPv6 addresses, since
they cannot be treated by modern CPUs as a native integer type.

[Robin] >But it (RTE_IPV4) will always generate addresses in *host order*.
Which means they cannot be used in IPv4 headers without passing them
through htonl().
RTE_IPV4 is not limited to setting IPv4 header values.

[Robin] >Maybe we could revert that patch and defer a complete change of
the rib/fib APIs to only expose network order addresses?
I don't agree with that. Don't limit yourself to just manipulating network
headers.

[Robin] >Thinking about it some more. Having a flag for such a drastic
change in behaviour does not seem right.
This flag is optional. I don't see any problems with that.

In general, here we just have different perspectives on the problem. I can
see and understand your point.
My considerations are:
- The vast majority of longest prefix match algorithms work with
addresses in host byte order (binary trees, multibit tries, DXR; the
exception being hash based lookup)
- If you run a byteswap two or more times, you are probably doing something
wrong in terms of computation

So, feel free to submit patches adding this feature to the control plane
API, but let's consider:
- default behaviour should remain the same. Why? At least because for my
use cases I'd like to keep "data representation" (the byte swap) outside of the
library. Not to mention the ABI/API breakage
- IPv4 should stay as uint32_t. C doesn't know such a thing as byte order;
it knows about size and signedness. rte_be32_t is just a hint for us
humans :)


пт, 15 нояб. 2024 г. в 17:00, Stephen Hemminger <stephen@networkplumber.org
>:

> On Fri, 15 Nov 2024 15:28:33 +0100
> "Robin Jarry" <rjarry@redhat.com> wrote:
>
> > Morten Brørup, Nov 15, 2024 at 14:52:
> > > Robin, you've totally won me over on this endian discussion. :-)
> > > Especially the IPv6 comparison make it clear why IPv4 should also be
> > > network byte order.
> > >
> > > API/ABI stability is a pain... we're stuck with host endian IPv4
> > > addresses; e.g. for the RTE_IPV4() macro, which I now agree produces
> > > the wrong endian value (on little endian CPUs).
> >
> > At least for 24.11 it is too late. But maybe we could make it right for
> > the next LTS?
> >
> > >> Vladimir, could we at least consider adding a real network order mode
> > >> for the rib and fib libraries? So that we can have consistent APIs
> > >> between IPv4 and IPv6?
> > >
> > > And/or rename RTE_FIB_F_NETWORK_ORDER to
> > > RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar. This is important if real
> > > network order mode is added (now or later)!
> >
> > Maybe we could revert that patch and defer a complete change of the
> > rib/fib APIs to only expose network order addresses? It would be an ABI
> > breakage but if properly announced in advance, it should be possible.
> >
> > Thinking about it some more. Having a flag for such a drastic change in
> > behaviour does not seem right.
>
> It was a mistake for DPDK to define its own data structures for IP
> addresses.
> Would have been much better to stick with the legacy what BSD, Linux (and
> Windows)
> uses in API. 'struct in_addr' and 'struct in6_addr'
>
> Reinvention did not help users.
>


-- 
Regards,
Vladimir

[-- Attachment #2: Type: text/html, Size: 11432 bytes --]

^ permalink raw reply	[relevance 3%]

* Re: rte_fib network order bug
  2024-11-15 14:28  3%                 ` Robin Jarry
@ 2024-11-15 16:20  0%                   ` Stephen Hemminger
  2024-11-17 15:04  3%                     ` Vladimir Medvedkin
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2024-11-15 16:20 UTC (permalink / raw)
  To: Robin Jarry; +Cc: Morten Brørup, Medvedkin, Vladimir, dev

On Fri, 15 Nov 2024 15:28:33 +0100
"Robin Jarry" <rjarry@redhat.com> wrote:

> Morten Brørup, Nov 15, 2024 at 14:52:
> > Robin, you've totally won me over on this endian discussion. :-)
> > Especially the IPv6 comparison makes it clear why IPv4 should also be
> > network byte order.
> >
> > API/ABI stability is a pain... we're stuck with host endian IPv4 
> > addresses; e.g. for the RTE_IPV4() macro, which I now agree produces 
> > the wrong endian value (on little endian CPUs).  
> 
> At least for 24.11 it is too late. But maybe we could make it right for 
> the next LTS?
> 
> >> Vladimir, could we at least consider adding a real network order mode 
> >> for the rib and fib libraries? So that we can have consistent APIs 
> >> between IPv4 and IPv6?  
> >
> > And/or rename RTE_FIB_F_NETWORK_ORDER to 
> > RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar. This is important if real 
> > network order mode is added (now or later)!  
> 
> Maybe we could revert that patch and defer a complete change of the 
> rib/fib APIs to only expose network order addresses? It would be an ABI 
> breakage but if properly announced in advance, it should be possible.
> 
> Thinking about it some more. Having a flag for such a drastic change in 
> behaviour does not seem right.

It was a mistake for DPDK to define its own data structures for IP addresses.
Would have been much better to stick with the legacy what BSD, Linux (and Windows)
uses in API. 'struct in_addr' and 'struct in6_addr'

Reinvention did not help users.

^ permalink raw reply	[relevance 0%]

* RE: [EXTERNAL] Re: [PATCH v5 1/1] graph: improve node layout
  2024-11-15 14:23  0%           ` Thomas Monjalon
@ 2024-11-15 15:57  0%             ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2024-11-15 15:57 UTC (permalink / raw)
  To: Thomas Monjalon, Nithin Kumar Dabilpuram
  Cc: Kiran Kumar Kokkilagadda, yanzhirun_163, dev, Huichao Cai



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, November 15, 2024 7:54 PM
> To: Jerin Jacob <jerinj@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>;
> yanzhirun_163@163.com; dev@dpdk.org; Huichao Cai <chcchc88@163.com>
> Subject: [EXTERNAL] Re: [PATCH v5 1/1] graph: improve node layout
> 
> Is it good to go?
> 
> 
> 15/11/2024 02:55, Huichao Cai:
> > The members "dispatch" and "xstat_off" of the structure "rte_node"
> > can be min cache aligned to make room for future expansion and to make
> > sure have better performance. Add corresponding comments.
> >
> > Signed-off-by: Huichao Cai <chcchc88@163.com>]


Acked-by: Jerin Jacob <jerinj@marvell.com>


> > ---
> >  doc/guides/rel_notes/release_24_11.rst |  2 ++
> >  lib/graph/rte_graph_worker_common.h    | 10 +++++++---
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_24_11.rst
> > b/doc/guides/rel_notes/release_24_11.rst
> > index 5063badf39..32800e8cb0 100644
> > --- a/doc/guides/rel_notes/release_24_11.rst
> > +++ b/doc/guides/rel_notes/release_24_11.rst
> > @@ -491,6 +491,8 @@ ABI Changes
> >    added new structure ``rte_node_xstats`` to ``rte_node_register`` and
> >    added ``xstat_off`` to ``rte_node``.
> >
> > +* graph: The members ``dispatch`` and ``xstat_off`` of the structure
> > +``rte_node`` have been
> > +  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
> >
> >  Known Issues
> >  ------------
> > diff --git a/lib/graph/rte_graph_worker_common.h
> > b/lib/graph/rte_graph_worker_common.h
> > index a518af2b2a..d3ec88519d 100644
> > --- a/lib/graph/rte_graph_worker_common.h
> > +++ b/lib/graph/rte_graph_worker_common.h
> > @@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node {
> >  	/** Original process function when pcap is enabled. */
> >  	rte_node_process_t original_process;
> >
> > +	/** Fast schedule area for mcore dispatch model. */
> >  	union {
> > -		/* Fast schedule area for mcore dispatch model */
> > -		struct {
> > +		alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
> >  			unsigned int lcore_id;  /**< Node running lcore. */
> >  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
> >  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> >  		} dispatch;
> >  	};
> > +
> > +	/** Fast path area cache line 1. */
> > +	alignas(RTE_CACHE_LINE_MIN_SIZE)
> >  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> > -	/* Fast path area  */
> > +
> > +	/** Fast path area cache line 2. */
> >  	__extension__ struct __rte_cache_aligned {  #define RTE_NODE_CTX_SZ
> > 16
> >  		union {
> >
> 
> 
> 
> 


^ permalink raw reply	[relevance 0%]

* Re: rte_fib network order bug
  2024-11-15 13:52  3%               ` Morten Brørup
@ 2024-11-15 14:28  3%                 ` Robin Jarry
  2024-11-15 16:20  0%                   ` Stephen Hemminger
  0 siblings, 1 reply; 200+ results
From: Robin Jarry @ 2024-11-15 14:28 UTC (permalink / raw)
  To: Morten Brørup, Medvedkin, Vladimir, dev

Morten Brørup, Nov 15, 2024 at 14:52:
> Robin, you've totally won me over on this endian discussion. :-)
> Especially the IPv6 comparison makes it clear why IPv4 should also be 
> network byte order.
>
> API/ABI stability is a pain... we're stuck with host endian IPv4 
> addresses; e.g. for the RTE_IPV4() macro, which I now agree produces 
> the wrong endian value (on little endian CPUs).

At least for 24.11 it is too late. But maybe we could make it right for 
the next LTS?

>> Vladimir, could we at least consider adding a real network order mode 
>> for the rib and fib libraries? So that we can have consistent APIs 
>> between IPv4 and IPv6?
>
> And/or rename RTE_FIB_F_NETWORK_ORDER to 
> RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar. This is important if real 
> network order mode is added (now or later)!

Maybe we could revert that patch and defer a complete change of the 
rib/fib APIs to only expose network order addresses? It would be an ABI 
breakage but if properly announced in advance, it should be possible.

Thinking about it some more. Having a flag for such a drastic change in 
behaviour does not seem right.

>> On that same topic, I wonder if it would make sense to change the API 
>> parameters to use an opaque rte_ipv4_addr_t type instead of a native 
>> uint32_t to avoid any confusion.
>
> It could be considered an IPv4 address type (like the IPv6 address 
> type) (which should be in network endian), which it is not, so I don't 
> like this idea.
>
> What the API really should offer is a choice (or a union) of uint32_t 
> and rte_be32_t, but that's not possible, so also using uint32_t for 
> big endian values seems like a viable compromise.
>
> Another alternative, using void* for the IPv4 address array, seems 
> overkill to me, since compilers don't warn about mixing uint32_t with 
> rte_be32_t values (like mixing signed and unsigned emits warnings).

If what I proposed above is possible, then all these APIs could be using 
rte_be32_t values (or even better, an rte_ipv4_addr_t alias for 
consistency with IPv6). That would make everything much simpler.

Thoughts?


^ permalink raw reply	[relevance 3%]

* Re: [PATCH v5 1/1] graph: improve node layout
  2024-11-15  1:55  5%         ` [PATCH v5 1/1] graph: improve node layout Huichao Cai
@ 2024-11-15 14:23  0%           ` Thomas Monjalon
  2024-11-15 15:57  0%             ` [EXTERNAL] " Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2024-11-15 14:23 UTC (permalink / raw)
  To: jerinj, ndabilpuram; +Cc: kirankumark, yanzhirun_163, dev, Huichao Cai

Is it good to go?


15/11/2024 02:55, Huichao Cai:
> The members "dispatch" and "xstat_off" of the structure "rte_node"
> can be min cache aligned to make room for future expansion and to
> make sure have better performance. Add corresponding comments.
> 
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
>  doc/guides/rel_notes/release_24_11.rst |  2 ++
>  lib/graph/rte_graph_worker_common.h    | 10 +++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
> index 5063badf39..32800e8cb0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -491,6 +491,8 @@ ABI Changes
>    added new structure ``rte_node_xstats`` to ``rte_node_register`` and
>    added ``xstat_off`` to ``rte_node``.
>  
> +* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
> +  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
>  
>  Known Issues
>  ------------
> diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
> index a518af2b2a..d3ec88519d 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node {
>  	/** Original process function when pcap is enabled. */
>  	rte_node_process_t original_process;
>  
> +	/** Fast schedule area for mcore dispatch model. */
>  	union {
> -		/* Fast schedule area for mcore dispatch model */
> -		struct {
> +		alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled failure. */
>  		} dispatch;
>  	};
> +
> +	/** Fast path area cache line 1. */
> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> -	/* Fast path area  */
> +
> +	/** Fast path area cache line 2. */
>  	__extension__ struct __rte_cache_aligned {
>  #define RTE_NODE_CTX_SZ 16
>  		union {
> 






^ permalink raw reply	[relevance 0%]

* RE: rte_fib network order bug
  @ 2024-11-15 13:52  3%               ` Morten Brørup
  2024-11-15 14:28  3%                 ` Robin Jarry
  0 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2024-11-15 13:52 UTC (permalink / raw)
  To: Robin Jarry, Medvedkin, Vladimir, dev

> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Friday, 15 November 2024 14.02
> 
> Morten Brørup, Nov 14, 2024 at 15:35:
> >> RTE_IPV4 is only useful to define addresses in unit tests.
> >
> > There are plenty of special IP addresses and subnets, where a
> shortcut
> > macro makes the address easier readable in the code.
> 
> OK, let me reformulate. I didn't mean to say that RTE_IPV4 is useless.
> But it will always generate addresses in *host order*. Which means they
> cannot be used in IPv4 headers without passing them through htonl().
> This is weird in my opinion.

Robin, you've totally won me over on this endian discussion. :-)
Especially the IPv6 comparison makes it clear why IPv4 should also be network byte order.

API/ABI stability is a pain... we're stuck with host endian IPv4 addresses; e.g. for the RTE_IPV4() macro, which I now agree produces the wrong endian value (on little endian CPUs).

> 
> >> Why would control plane use a different representation of addresses
> >> compared to data plane?
> >
> > Excellent question.
> > Old habit? Growing up using big endian CPUs, we have come to think of
> > IPv4 addresses as 32 bit numbers, so we keep treating them as such.
> > With this old way of thinking, the only reason to use network endian
> > in the fast path with little endian CPUs is for performance reasons
> > (to save the byte swap) - if not, we would still prefer using host
> > endian in the fast path too.
> 
> I understand the implementation reasons why you would prefer working
> with host order integers. But the APIs that deal with IPv4 addresses
> should not reflect implementation details.

They were probably designed based on the same way of thinking I was used to (until you convinced me I was wrong).

> 
> >> Also for consistency with IPv6, I really think
> >> that *all* addresses should be dealt in their network form.
> >
> > Food for thought!
> 
> Vladimir, could we at least consider adding a real network order mode
> for the rib and fib libraries? So that we can have consistent APIs
> between IPv4 and IPv6?

And/or rename RTE_FIB_F_NETWORK_ORDER to RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar. This is important if real network order mode is added (now or later)!

> 
> On that same topic, I wonder if it would make sense to change the API
> parameters to use an opaque rte_ipv4_addr_t type instead of a native
> uint32_t to avoid any confusion.

An opaque type would suggest a real IPv4 address type (like the IPv6 address type), which should be in network endian, but it is not, so I don't like this idea.
What the API really should offer is a choice (or a union) of uint32_t and rte_be32_t, but that's not possible, so also using uint32_t for big-endian values seems like a viable compromise.
Another alternative, using void * for the IPv4 address array, seems like overkill to me, since compilers don't warn about mixing uint32_t with rte_be32_t values (unlike mixing signed and unsigned, which emits warnings).

> 
> Thanks!


^ permalink raw reply	[relevance 3%]

* RE: [EXTERNAL] Re: [PATCH v15 4/4] eal: add PMU support to tracing library
  2024-11-12 23:09  3%     ` Stephen Hemminger
@ 2024-11-15 10:24  0%       ` Tomasz Duszynski
  0 siblings, 0 replies; 200+ results
From: Tomasz Duszynski @ 2024-11-15 10:24 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jerin Jacob, Sunil Kumar Kori, Tyler Retzlaff, Ruifeng.Wang,
	bruce.richardson, david.marchand, dev, konstantin.v.ananyev,
	mattias.ronnblom, mb, thomas, zhoumin

>-----Original Message-----
>From: Stephen Hemminger <stephen@networkplumber.org>
>Sent: Wednesday, November 13, 2024 12:10 AM
>To: Tomasz Duszynski <tduszynski@marvell.com>
>Cc: Jerin Jacob <jerinj@marvell.com>; Sunil Kumar Kori <skori@marvell.com>; Tyler Retzlaff
><roretzla@linux.microsoft.com>; Ruifeng.Wang@arm.com; bruce.richardson@intel.com;
>david.marchand@redhat.com; dev@dpdk.org; konstantin.v.ananyev@yandex.ru;
>mattias.ronnblom@ericsson.com; mb@smartsharesystems.com; thomas@monjalon.net; zhoumin@loongson.cn
>Subject: [EXTERNAL] Re: [PATCH v15 4/4] eal: add PMU support to tracing library
>
>On Fri, 25 Oct 2024 10:54:14 +0200
>Tomasz Duszynski <tduszynski@marvell.com> wrote:
>
>> In order to profile app one needs to store significant amount of
>> samples somewhere for an analysis later on. Since trace library
>> supports storing data in a CTF format lets take advantage of that and
>> add a dedicated PMU tracepoint.
>>
>> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
>> ---
>>  app/test/test_trace_perf.c               | 10 ++++
>>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>>  lib/eal/common/eal_common_trace.c        |  5 +-
>>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>>  lib/eal/common/eal_common_trace_points.c |  5 ++
>>  lib/eal/common/eal_trace.h               |  4 ++
>>  lib/eal/common/meson.build               |  1 +
>>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>>  lib/eal/version.map                      |  1 +
>>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>>  lib/pmu/version.map                      |  1 +
>>  13 files changed, 198 insertions(+), 6 deletions(-)  create mode
>> 100644 lib/eal/common/eal_common_trace_pmu.c
>
>
>There is an issue with calling a rte_experimental function.
>
>-------------------------------BEGIN LOGS----------------------------
>####################################################################################
>#### [Begin job log] "ubuntu-22.04-gcc-debug+doc+examples+tests" at step Build and test
>####################################################################################
>[3384/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
>FAILED: buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
>ccache gcc -Ibuildtools/chkincs/chkincs.p -Ibuildtools/chkincs -I../buildtools/chkincs -
>Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -
>I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -
>I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -
>I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -
>Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/pmu -I../lib/pmu -
>Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -
>Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -
>I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -
>I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -
>Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -
>I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -
>Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -
>Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -
>I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -
>I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -
>I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats
>-Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -
>Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev
>-I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder
>-Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost
>-I../lib/vhost -Ilib/ipsec -I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -
>Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -
>I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -
>pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -g -include rte_config.h -
>Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -
>Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -
>Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -
>Wno-missing-field-initializers -D_GNU_SOURCE -march=corei7 -mrtm -MD -MQ
>buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -MF buildtools/chkincs/chkincs.p/meson-
>generated_rte_pmu.c.o.d -o buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -c
>buildtools/chkincs/chkincs.p/rte_pmu.c
>In file included from buildtools/chkincs/chkincs.p/rte_pmu.c:1:
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h: In function ‘rte_pmu_read’:
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:214:17: error: ‘__rte_pmu_enable_group’ is
>deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
>  214 |                 ret = __rte_pmu_enable_group(group);
>      |                 ^~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:132:1: note: declared here
>  132 | __rte_pmu_enable_group(struct rte_pmu_event_group *group);
>      | ^~~~~~~~~~~~~~~~~~~~~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:222:9: error: ‘__rte_pmu_read_userpage’ is
>deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
>  222 |         return __rte_pmu_read_userpage(group->mmap_pages[index]);
>      |         ^~~~~~
>/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:86:1: note: declared here
>   86 | __rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
>      | ^~~~~~~~~~~~~~~~~~~~~~~
>cc1: all warnings being treated as errors [3385/6468] Compiling C object
>buildtools/chkincs/chkincs.p/meson-generated_rte_byteorder.c.o
>[3386/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_atomic.c.o
>[3387/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_rtm.c.o
>[3388/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_memcpy.c.o
>[3389/6468] Compiling C object app/dpdk-test.p/test_test_memcpy_perf.c.o
>ninja: build stopped: subcommand failed.
>##[error]Process completed with exit code 1.

Right, this indeed pops up with -Dcheck_includes=true. Will fix this in v16. 

Thanks.

^ permalink raw reply	[relevance 0%]

* [PATCH v5 1/1] graph: improve node layout
  2024-11-14  8:45  5%       ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
  2024-11-14 10:05  0%         ` [EXTERNAL] " Jerin Jacob
@ 2024-11-15  1:55  5%         ` Huichao Cai
  2024-11-15 14:23  0%           ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Huichao Cai @ 2024-11-15  1:55 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The members "dispatch" and "xstat_off" of the structure "rte_node"
can be min cache aligned to make room for future expansion and to
ensure better performance. Add corresponding comments.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_24_11.rst |  2 ++
 lib/graph/rte_graph_worker_common.h    | 10 +++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 5063badf39..32800e8cb0 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -491,6 +491,8 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
+  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..d3ec88519d 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,16 +104,20 @@ struct __rte_cache_aligned rte_node {
 	/** Original process function when pcap is enabled. */
 	rte_node_process_t original_process;
 
+	/** Fast schedule area for mcore dispatch model. */
 	union {
-		/* Fast schedule area for mcore dispatch model */
-		struct {
+		alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
 		} dispatch;
 	};
+
+	/** Fast path area cache line 1. */
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
-	/* Fast path area  */
+
+	/** Fast path area cache line 2. */
 	__extension__ struct __rte_cache_aligned {
 #define RTE_NODE_CTX_SZ 16
 		union {
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* RE: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of rte_node
  2024-11-14  8:45  5%       ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14 10:05  0%         ` Jerin Jacob
  2024-11-15  1:55  5%         ` [PATCH v5 1/1] graph: improve node layout Huichao Cai
  1 sibling, 0 replies; 200+ results
From: Jerin Jacob @ 2024-11-14 10:05 UTC (permalink / raw)
  To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, david.marchand
  Cc: dev



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Thursday, November 14, 2024 2:15 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH v4 2/2] graph: add alignment to the member of
> rte_node
> 
> The members dispatch and xstat_off of the structure rte_node can be min cache
> aligned to make room for future expansion and to make sure have better
> performance. Add corresponding comments.
> 

Please change subject to graph: improve node layout


> Due to the modification of the alignment of some members of the rte_node
> structure, update file release_24_11.rst.

The above section is not needed.


> 
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
>  doc/guides/rel_notes/release_24_11.rst | 3 +++
>  lib/graph/rte_graph_worker_common.h    | 7 ++++++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index 592116b979..6903b1d0f0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -425,6 +425,9 @@ ABI Changes
> 
>  * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node``
> structure.
> 
> +* graph: The members ``dispatch`` and ``xstat_off`` of the structure
> +``rte_node`` have been
> +  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
> +
>  Known Issues
>  ------------
> 
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index 4c2432b47f..d36abec08b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node {
>  	/** Original process function when pcap is enabled. */
>  	rte_node_process_t original_process;
> 
> +	/** Fast path area cache line 1. */


Use "Fast schedule area for mcore dispatch model" here instead.

>  	union {
>  		/* Fast schedule area for mcore dispatch model */

Above comment you can remove it

> -		struct {
> +		alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
>  			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */
>  		} dispatch;
>  	};
> +
> +	/** Fast path area cache line 2. */
> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> +
>  	/* Fast path area  */


Fast path area cache line 1

>  	__extension__ struct __rte_cache_aligned {  #define RTE_NODE_CTX_SZ
> 16

With above: Acked-by: Jerin Jacob <jerinj@marvell.com>

Looks like we cannot merge a new feature in rc3. I would suggest skipping 1/2 and sending only this patch so that 1/2 can be merged in the next release.

Please add @david.marchand@redhat.com in Cc.


^ permalink raw reply	[relevance 0%]

* [PATCH v4 2/2] graph: add alignment to the member of rte_node
  2024-11-14  8:45  5%     ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
@ 2024-11-14  8:45  5%       ` Huichao Cai
  2024-11-14 10:05  0%         ` [EXTERNAL] " Jerin Jacob
  2024-11-15  1:55  5%         ` [PATCH v5 1/1] graph: improve node layout Huichao Cai
  2024-12-13  2:21 10%       ` [PATCH v5] graph: mcore: optimize graph search Huichao Cai
  1 sibling, 2 replies; 200+ results
From: Huichao Cai @ 2024-11-14  8:45 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The members dispatch and xstat_off of the structure rte_node
can be min cache aligned to make room for future expansion and to
ensure better performance. Add corresponding comments.

Due to the modification of the alignment of some members of the
rte_node structure, update file release_24_11.rst.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_24_11.rst | 3 +++
 lib/graph/rte_graph_worker_common.h    | 7 ++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 592116b979..6903b1d0f0 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -425,6 +425,9 @@ ABI Changes
 
 * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
+* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
+  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
+
 Known Issues
 ------------
 
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4c2432b47f..d36abec08b 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,16 +104,21 @@ struct __rte_cache_aligned rte_node {
 	/** Original process function when pcap is enabled. */
 	rte_node_process_t original_process;
 
+	/** Fast path area cache line 1. */
 	union {
 		/* Fast schedule area for mcore dispatch model */
-		struct {
+		alignas(RTE_CACHE_LINE_MIN_SIZE) struct {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
 			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
+
+	/** Fast path area cache line 2. */
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
+
 	/* Fast path area  */
 	__extension__ struct __rte_cache_aligned {
 #define RTE_NODE_CTX_SZ 16
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* [PATCH v4 1/2] graph: mcore: optimize graph search
  2024-11-13  7:35  5%   ` [PATCH v3 1/2] " Huichao Cai
  2024-11-13  7:35  5%     ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14  8:45  5%     ` Huichao Cai
  2024-11-14  8:45  5%       ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
  2024-12-13  2:21 10%       ` [PATCH v5] graph: mcore: optimize graph search Huichao Cai
  1 sibling, 2 replies; 200+ results
From: Huichao Cai @ 2024-11-14  8:45 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue
searches for the graph with a slow list walk. Modify the search
logic to record the result of the first search and reuse that
record for subsequent searches to improve search speed.

Due to the addition of a "graph" field in the "rte_node" structure,
update file release_24_11.rst.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_24_11.rst     |  1 +
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 9dc739c4cb..592116b979 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -423,6 +423,7 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..4c2432b47f 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* RE: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of rte_node
  2024-11-13  7:35  5%     ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
@ 2024-11-14  7:14  0%       ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2024-11-14  7:14 UTC (permalink / raw)
  To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163
  Cc: dev



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Wednesday, November 13, 2024 1:06 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH v3 2/2] graph: add alignment to the member of
> rte_node
> 
> 
> The members "dispatch" and "xstat_off" of the structure "rte_node"
> can be min cache aligned to make room for future expansion and to make sure
> have better performance.
> 
> Due to the modification of the alignment of some members of the "rte_node"
> structure, update file release_24_11.rst.
> 
> Signed-off-by: Huichao Cai <chcchc88@163.com>
> ---
>  doc/guides/rel_notes/release_24_11.rst | 3 +++
>  lib/graph/rte_graph_worker_common.h    | 5 ++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index 592116b979..6903b1d0f0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -425,6 +425,9 @@ ABI Changes
> 
>  * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node``
> structure.
> 
> +* graph: The members ``dispatch`` and ``xstat_off`` of the structure
> +``rte_node`` have been
> +  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
> +
>  Known Issues
>  ------------
> 
> diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index 4c2432b47f..9e99278a0a 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
>  	/** Original process function when pcap is enabled. */
>  	rte_node_process_t original_process;
> 
> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
>  	union {
>  		/* Fast schedule area for mcore dispatch model */
>  		struct {
> @@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node {
>  			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */
>  		} dispatch;
>  	};
> -	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> +
>  	/* Fast path area  */

Make it two separate comments: "Fast path area cache line 1" and "Fast path area cache line 2".

> +	alignas(RTE_CACHE_LINE_MIN_SIZE)
> +	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
>  	__extension__ struct __rte_cache_aligned {  #define RTE_NODE_CTX_SZ
> 16
>  		union {
> --
> 2.27.0


^ permalink raw reply	[relevance 0%]

* [PATCH v3 2/2] graph: add alignment to the member of rte_node
  2024-11-13  7:35  5%   ` [PATCH v3 1/2] " Huichao Cai
@ 2024-11-13  7:35  5%     ` Huichao Cai
  2024-11-14  7:14  0%       ` [EXTERNAL] " Jerin Jacob
  2024-11-14  8:45  5%     ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
  1 sibling, 1 reply; 200+ results
From: Huichao Cai @ 2024-11-13  7:35 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The members "dispatch" and "xstat_off" of the structure "rte_node"
can be min cache aligned to make room for future expansion and to
ensure better performance.

Due to the modification of the alignment of some members of the
"rte_node" structure, update file release_24_11.rst.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_24_11.rst | 3 +++
 lib/graph/rte_graph_worker_common.h    | 5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 592116b979..6903b1d0f0 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -425,6 +425,9 @@ ABI Changes
 
 * graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
+* graph: The members ``dispatch`` and ``xstat_off`` of the structure ``rte_node`` have been
+  marked as RTE_CACHE_LINE_MIN_SIZE bytes aligned.
+
 Known Issues
 ------------
 
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index 4c2432b47f..9e99278a0a 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
 	/** Original process function when pcap is enabled. */
 	rte_node_process_t original_process;
 
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
 	union {
 		/* Fast schedule area for mcore dispatch model */
 		struct {
@@ -113,8 +114,10 @@ struct __rte_cache_aligned rte_node {
 			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
-	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
+
 	/* Fast path area  */
+	alignas(RTE_CACHE_LINE_MIN_SIZE)
+	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
 	__extension__ struct __rte_cache_aligned {
 #define RTE_NODE_CTX_SZ 16
 		union {
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* [PATCH v3 1/2] graph: mcore: optimize graph search
    2024-11-11  5:46  3%   ` [EXTERNAL] " Jerin Jacob
@ 2024-11-13  7:35  5%   ` Huichao Cai
  2024-11-13  7:35  5%     ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
  2024-11-14  8:45  5%     ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
  1 sibling, 2 replies; 200+ results
From: Huichao Cai @ 2024-11-13  7:35 UTC (permalink / raw)
  To: jerinj, kirankumark, ndabilpuram, yanzhirun_163; +Cc: dev

The function __rte_graph_mcore_dispatch_sched_node_enqueue
searches for the graph with a slow list walk. Modify the search
logic to record the result of the first search and reuse that
record for subsequent searches to improve search speed.

Due to the addition of a "graph" field in the "rte_node" structure,
update file release_24_11.rst.

Signed-off-by: Huichao Cai <chcchc88@163.com>
---
 doc/guides/rel_notes/release_24_11.rst     |  1 +
 lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
 lib/graph/rte_graph_worker_common.h        |  1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 9dc739c4cb..592116b979 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -423,6 +423,7 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* graph: added ``graph`` field to the ``dispatch`` structure in the ``rte_node`` structure.
 
 Known Issues
 ------------
diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c b/lib/graph/rte_graph_model_mcore_dispatch.c
index a590fc9497..a81d338227 100644
--- a/lib/graph/rte_graph_model_mcore_dispatch.c
+++ b/lib/graph/rte_graph_model_mcore_dispatch.c
@@ -118,11 +118,14 @@ __rte_graph_mcore_dispatch_sched_node_enqueue(struct rte_node *node,
 					      struct rte_graph_rq_head *rq)
 {
 	const unsigned int lcore_id = node->dispatch.lcore_id;
-	struct rte_graph *graph;
+	struct rte_graph *graph = node->dispatch.graph;
 
-	SLIST_FOREACH(graph, rq, next)
-		if (graph->dispatch.lcore_id == lcore_id)
-			break;
+	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
+		SLIST_FOREACH(graph, rq, next)
+			if (graph->dispatch.lcore_id == lcore_id)
+				break;
+		node->dispatch.graph = graph;
+	}
 
 	return graph != NULL ? __graph_sched_node_enqueue(node, graph) : false;
 }
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..4c2432b47f 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
 			unsigned int lcore_id;  /**< Node running lcore. */
 			uint64_t total_sched_objs; /**< Number of objects scheduled. */
 			uint64_t total_sched_fail; /**< Number of scheduled failure. */
+			struct rte_graph *graph;  /**< Graph corresponding to lcore_id. */
 		} dispatch;
 	};
 	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
-- 
2.27.0


^ permalink raw reply	[relevance 5%]

* [PATCH v3] power: fix a typo in the PM QoS guide
  2024-11-11 12:52  5% [PATCH] power: fix a typo in the PM QoS guide Huisong Li
  2024-11-12  8:35  5% ` [PATCH v2] " Huisong Li
@ 2024-11-13  0:59  5% ` Huisong Li
  1 sibling, 0 replies; 200+ results
From: Huisong Li @ 2024-11-13  0:59 UTC (permalink / raw)
  To: dev
  Cc: thomas, ferruh.yigit, david.hunt, sivaprasad.tummala,
	konstantin.ananyev, fengchengwen, liuyonglong, lihuisong

The typo in the guide makes it hard to understand, so fix it.

Fixes: dd6fd75bf662 ("power: introduce PM QoS API on CPU wide")

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 doc/guides/prog_guide/power_man.rst | 2 +-
 lib/power/rte_power_qos.h           | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 22e6e4fe1d..74039e5786 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -118,7 +118,7 @@ based on this CPU resume latency in their idle task.
 
 The deeper the idle state, the lower the power consumption,
 but the longer the resume time.
-Some services are latency sensitive and very except the low resume time,
+Some services are latency sensitive and request a low resume time,
 like interrupt packet receiving mode.
 
 Applications can set and get the CPU resume latency with
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
index 7a8dab9272..05a3f51ae2 100644
--- a/lib/power/rte_power_qos.h
+++ b/lib/power/rte_power_qos.h
@@ -24,8 +24,8 @@ extern "C" {
  * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
  *
  * The deeper the idle state, the lower the power consumption, but the
- * longer the resume time. Some service are delay sensitive and very except the
- * low resume time, like interrupt packet receiving mode.
+ * longer the resume time. Some services are latency sensitive and request
+ * a low resume time, like interrupt packet receiving mode.
  *
  * In these case, per-CPU PM QoS API can be used to control this CPU's idle
  * state selection and limit just enter the shallowest idle state to low the
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* Re: [PATCH v15 4/4] eal: add PMU support to tracing library
  @ 2024-11-12 23:09  3%     ` Stephen Hemminger
  2024-11-15 10:24  0%       ` [EXTERNAL] " Tomasz Duszynski
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2024-11-12 23:09 UTC (permalink / raw)
  To: Tomasz Duszynski
  Cc: Jerin Jacob, Sunil Kumar Kori, Tyler Retzlaff, Ruifeng.Wang,
	bruce.richardson, david.marchand, dev, konstantin.v.ananyev,
	mattias.ronnblom, mb, thomas, zhoumin

On Fri, 25 Oct 2024 10:54:14 +0200
Tomasz Duszynski <tduszynski@marvell.com> wrote:

> In order to profile app one needs to store significant amount of samples
> somewhere for an analysis later on. Since trace library supports
> storing data in a CTF format lets take advantage of that and add a
> dedicated PMU tracepoint.
> 
> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
> ---
>  app/test/test_trace_perf.c               | 10 ++++
>  doc/guides/prog_guide/profile_app.rst    |  5 ++
>  doc/guides/prog_guide/trace_lib.rst      | 32 +++++++++++
>  lib/eal/common/eal_common_trace.c        |  5 +-
>  lib/eal/common/eal_common_trace_pmu.c    | 38 ++++++++++++++
>  lib/eal/common/eal_common_trace_points.c |  5 ++
>  lib/eal/common/eal_trace.h               |  4 ++
>  lib/eal/common/meson.build               |  1 +
>  lib/eal/include/rte_eal_trace.h          | 11 ++++
>  lib/eal/version.map                      |  1 +
>  lib/pmu/rte_pmu.c                        | 67 +++++++++++++++++++++++-
>  lib/pmu/rte_pmu.h                        | 24 +++++++--
>  lib/pmu/version.map                      |  1 +
>  13 files changed, 198 insertions(+), 6 deletions(-)
>  create mode 100644 lib/eal/common/eal_common_trace_pmu.c


There is an issue with calling an rte_experimental function.

-------------------------------BEGIN LOGS----------------------------
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-debug+doc+examples+tests" at step Build and test
####################################################################################
[3384/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o
FAILED: buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o 
ccache gcc -Ibuildtools/chkincs/chkincs.p -Ibuildtools/chkincs -I../buildtools/chkincs -Iexamples/l3fwd -I../examples/l3fwd -I../examples/common -Idrivers/bus/vdev -I../drivers/bus/vdev -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/pmu -I../lib/pmu -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -Ilib/argparse -I../lib/argparse -Ilib/ptr_compress -I../lib/ptr_compress -Ilib/ring -I../lib/ring -Ilib/rcu -I../lib/rcu -Ilib/mempool -I../lib/mempool -Ilib/mbuf -I../lib/mbuf -Ilib/net -I../lib/net -Ilib/meter -I../lib/meter -Ilib/ethdev -I../lib/ethdev -Ilib/cmdline -I../lib/cmdline -Ilib/hash -I../lib/hash -Ilib/timer -I../lib/timer -Ilib/acl -I../lib/acl -Ilib/bbdev -I../lib/bbdev -Ilib/bitratestats -I../lib/bitratestats -Ilib/bpf -I../lib/bpf -Ilib/cfgfile -I../lib/cfgfile -Ilib/compressdev -I../lib/compressdev -Ilib/cryptodev -I../lib/cryptodev -Ilib/distributor -I../lib/distributor -Ilib/dmadev -I../lib/dmadev -Ilib/efd -I../lib/efd -Ilib/eventdev -I../lib/eventdev -Ilib/dispatcher -I../lib/dispatcher -Ilib/gpudev -I../lib/gpudev -Ilib/gro -I../lib/gro -Ilib/gso -I../lib/gso -Ilib/ip_frag -I../lib/ip_frag -Ilib/jobstats -I../lib/jobstats -Ilib/latencystats -I../lib/latencystats -Ilib/lpm -I../lib/lpm -Ilib/member -I../lib/member -Ilib/pcapng -I../lib/pcapng -Ilib/power -I../lib/power -Ilib/rawdev -I../lib/rawdev -Ilib/regexdev -I../lib/regexdev -Ilib/mldev -I../lib/mldev -Ilib/rib -I../lib/rib -Ilib/reorder -I../lib/reorder -Ilib/sched -I../lib/sched -Ilib/security -I../lib/security -Ilib/stack -I../lib/stack -Ilib/vhost -I../lib/vhost -Ilib/ipsec 
-I../lib/ipsec -Ilib/pdcp -I../lib/pdcp -Ilib/fib -I../lib/fib -Ilib/port -I../lib/port -Ilib/pdump -I../lib/pdump -Ilib/table -I../lib/table -Ilib/pipeline -I../lib/pipeline -Ilib/graph -I../lib/graph -Ilib/node -I../lib/node -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -march=corei7 -mrtm -MD -MQ buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -MF buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o.d -o buildtools/chkincs/chkincs.p/meson-generated_rte_pmu.c.o -c buildtools/chkincs/chkincs.p/rte_pmu.c
In file included from buildtools/chkincs/chkincs.p/rte_pmu.c:1:
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h: In function ‘rte_pmu_read’:
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:214:17: error: ‘__rte_pmu_enable_group’ is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
  214 |                 ret = __rte_pmu_enable_group(group);
      |                 ^~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:132:1: note: declared here
  132 | __rte_pmu_enable_group(struct rte_pmu_event_group *group);
      | ^~~~~~~~~~~~~~~~~~~~~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:222:9: error: ‘__rte_pmu_read_userpage’ is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
  222 |         return __rte_pmu_read_userpage(group->mmap_pages[index]);
      |         ^~~~~~
/home/runner/work/dpdk/dpdk/lib/pmu/rte_pmu.h:86:1: note: declared here
   86 | __rte_pmu_read_userpage(struct perf_event_mmap_page *pc)
      | ^~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
[3385/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_byteorder.c.o
[3386/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_atomic.c.o
[3387/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_rtm.c.o
[3388/6468] Compiling C object buildtools/chkincs/chkincs.p/meson-generated_rte_memcpy.c.o
[3389/6468] Compiling C object app/dpdk-test.p/test_test_memcpy_perf.c.o
ninja: build stopped: subcommand failed.
##[error]Process completed with exit code 1.

^ permalink raw reply	[relevance 3%]

* Re:RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-12  9:35  3%             ` Jerin Jacob
@ 2024-11-12 12:57  0%               ` Huichao Cai
  0 siblings, 0 replies; 200+ results
From: Huichao Cai @ 2024-11-12 12:57 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: David Marchand, Kiran Kumar Kokkilagadda,
	Nithin Kumar Dabilpuram, yanzhirun_163, dev, Thomas Monjalon,
	Robin Jarry

[-- Attachment #1: Type: text/plain, Size: 205 bytes --]

>OK. @Huichao Cai Please send two patches (a) new proposal and (b) your improvement as series.
>Update ABI Changes section in  doc/guides/rel_notes/release_24_11.rst  Ok.I will send these two patches soon.

[-- Attachment #2: Type: text/html, Size: 380 bytes --]

^ permalink raw reply	[relevance 0%]

* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-12  8:51  0%           ` David Marchand
@ 2024-11-12  9:35  3%             ` Jerin Jacob
  2024-11-12 12:57  0%               ` Huichao Cai
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2024-11-12  9:35 UTC (permalink / raw)
  To: David Marchand, Huichao Cai
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
	dev, Thomas Monjalon, Robin Jarry



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, November 12, 2024 2:21 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
> 
> 
> On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Friday, November 8, 2024 7:08 PM
> > > To: Jerin Jacob <jerinj@marvell.com>
> > > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry
> > > <rjarry@redhat.com>
> > > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search
> > > when scheduling nodes
> > >
> > > Hello Jerin,
> >
> > Hello David,
> >
> > >
> > > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > > Is n't breaking the ABI?
> > > > >
> > > > > So can't we modify the ABI, or is there any special operation
> > > > > required to modify the ABI?
> > > >
> > > > Only LTS release (xx.11) can change the ABI after sending deprecation
> notice.
> > > > Looking at the pahole output, one option will be making dispatch
> > > > and new semi fastpath Additions like  xstat_off can be min cache
> > > > aligned to make room for future expansion and to make sure have
> > > > better
> > > performance.
> > >
> > > Adding holes may be a short term solution, but in my opinion, the
> > > slow path part should be entirely hidden and we only expose the fp part.
> >
> > The new cache-line-aligned items proposed are fastpath items only.
> 
> I had only noticed the second comment:
> 
> +       alignas(RTE_CACHE_LINE_MIN_SIZE)
>         rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
>         /* Fast path area  */
>         ^^^^^^^^^^^^
> 
> And I assumed the part in the struct before was slow path.
> (it may be worth enhancing these comments, with a single limit of slow/fast
> path areas)

Yes. xstat_off was a new addition as a fastpath item in this release and there was no space
in the original fastpath area. And yes, the comment needs to be updated.


> 
> 
> >
> > > Reminder, those holes must be in a "known state" as we release
> > > v24.11 so that the presence of future additions can be safely detected.
> 
> If the rte_node objects are allocated by the graph library and zero'd, then we
> are good.
> It seems to be the case in graph_nodes_populate(), and the rte_node objects
> are embedded in the rte_graph object.
> 
> Is there another location in the graph library where a rte_node object is
> allocated?

No

> 
> If not, and an application can not create a rte_node object, your proposal looks
> good to me.

OK. @Huichao Cai Please send two patches, (a) the new proposal and (b) your improvement, as a series.
Update the ABI Changes section in doc/guides/rel_notes/release_24_11.rst.


> 
> 
> --
> David Marchand


^ permalink raw reply	[relevance 3%]

* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-11  5:38  0%         ` Jerin Jacob
@ 2024-11-12  8:51  0%           ` David Marchand
  2024-11-12  9:35  3%             ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-11-12  8:51 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

On Mon, Nov 11, 2024 at 6:39 AM Jerin Jacob <jerinj@marvell.com> wrote:
>
>
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Friday, November 8, 2024 7:08 PM
> > To: Jerin Jacob <jerinj@marvell.com>
> > Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> > <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> > <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> > Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> > Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> > scheduling nodes
> >
> > Hello Jerin,
>
> Hello David,
>
> >
> > On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > > Is n't breaking the ABI?
> > > >
> > > > So can't we modify the ABI, or is there any special operation
> > > > required to modify the ABI?
> > >
> > > Only LTS release (xx.11) can change the ABI after sending deprecation notice.
> > > Looking at the pahole output, one option will be making dispatch and
> > > new semi fastpath Additions like  xstat_off can be min cache aligned
> > > to make room for future expansion and to make sure have better
> > performance.
> >
> > Adding holes may be a short term solution, but in my opinion, the slow path
> > part should be entirely hidden and we only expose the fp part.
>
> The new cache-line-aligned items proposed are fastpath items only.

I had only noticed the second comment:

+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
        /* Fast path area  */
        ^^^^^^^^^^^^

And I assumed the part in the struct before was slow path.
(it may be worth enhancing these comments, with a single limit of
slow/fast path areas)


>
> > Reminder, those holes must be in a "known state" as we release v24.11 so that
> > the presence of future additions can be safely detected.

If the rte_node objects are allocated by the graph library and zero'd,
then we are good.
It seems to be the case in graph_nodes_populate(), and the rte_node
objects are embedded in the rte_graph object.

Is there another location in the graph library where a rte_node object
is allocated?

If not, and an application can not create a rte_node object, your
proposal looks good to me.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* [PATCH v2] power: fix a typo in the PM QoS guide
  2024-11-11 12:52  5% [PATCH] power: fix a typo in the PM QoS guide Huisong Li
@ 2024-11-12  8:35  5% ` Huisong Li
  2024-11-13  0:59  5% ` [PATCH v3] " Huisong Li
  1 sibling, 0 replies; 200+ results
From: Huisong Li @ 2024-11-12  8:35 UTC (permalink / raw)
  To: dev
  Cc: thomas, ferruh.yigit, david.hunt, sivaprasad.tummala,
	konstantin.ananyev, fengchengwen, liuyonglong, lihuisong

The typo in the guide makes it hard to understand, so it needs to be fixed.

Fixes: dd6fd75bf662 ("power: introduce PM QoS API on CPU wide")

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 doc/guides/prog_guide/power_man.rst | 2 +-
 lib/power/rte_power_qos.h           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 22e6e4fe1d..74039e5786 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -118,7 +118,7 @@ based on this CPU resume latency in their idle task.
 
 The deeper the idle state, the lower the power consumption,
 but the longer the resume time.
-Some services are latency sensitive and very except the low resume time,
+Some services are latency sensitive and request a low resume time,
 like interrupt packet receiving mode.
 
 Applications can set and get the CPU resume latency with
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
index 7a8dab9272..ce0c6eda15 100644
--- a/lib/power/rte_power_qos.h
+++ b/lib/power/rte_power_qos.h
@@ -24,7 +24,7 @@ extern "C" {
  * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
  *
  * The deeper the idle state, the lower the power consumption, but the
- * longer the resume time. Some service are delay sensitive and very except the
+ * longer the resume time. Some service are delay sensitive and request a
  * low resume time, like interrupt packet receiving mode.
  *
  * In these case, per-CPU PM QoS API can be used to control this CPU's idle
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* [PATCH] power: fix a typo in the PM QoS guide
@ 2024-11-11 12:52  5% Huisong Li
  2024-11-12  8:35  5% ` [PATCH v2] " Huisong Li
  2024-11-13  0:59  5% ` [PATCH v3] " Huisong Li
  0 siblings, 2 replies; 200+ results
From: Huisong Li @ 2024-11-11 12:52 UTC (permalink / raw)
  To: dev
  Cc: thomas, ferruh.yigit, david.hunt, sivaprasad.tummala,
	konstantin.ananyev, fengchengwen, liuyonglong, lihuisong

The typo in the guide makes it hard to understand, so it needs to be fixed.

Fixes: dd6fd75bf662 ("power: introduce PM QoS API on CPU wide")

Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
 doc/guides/prog_guide/power_man.rst | 2 +-
 lib/power/rte_power_qos.h           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 22e6e4fe1d..024670a9b4 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -118,7 +118,7 @@ based on this CPU resume latency in their idle task.
 
 The deeper the idle state, the lower the power consumption,
 but the longer the resume time.
-Some services are latency sensitive and very except the low resume time,
+Some services are latency sensitive and request the low resume time,
 like interrupt packet receiving mode.
 
 Applications can set and get the CPU resume latency with
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
index 7a8dab9272..a6d3677409 100644
--- a/lib/power/rte_power_qos.h
+++ b/lib/power/rte_power_qos.h
@@ -24,7 +24,7 @@ extern "C" {
  * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
  *
  * The deeper the idle state, the lower the power consumption, but the
- * longer the resume time. Some service are delay sensitive and very except the
+ * longer the resume time. Some service are delay sensitive and request the
  * low resume time, like interrupt packet receiving mode.
  *
  * In these case, per-CPU PM QoS API can be used to control this CPU's idle
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* Re: [PATCH v15 0/3] power: introduce PM QoS interface
  2024-11-11  2:25  4% ` [PATCH v15 " Huisong Li
  2024-11-11  2:25  5%   ` [PATCH v15 1/3] power: introduce PM QoS API on CPU wide Huisong Li
@ 2024-11-11 10:29  0%   ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2024-11-11 10:29 UTC (permalink / raw)
  To: Huisong Li
  Cc: dev, mb, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong

11/11/2024 03:25, Huisong Li:
> The deeper the idle state, the lower the power consumption, but the longer
> the resume time. Some service are delay sensitive and very except the low
> resume time, like interrupt packet receiving mode.
> 
> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> interface is used to set and get the resume latency limit on the cpuX for
> userspace. Please see the description in kernel document[1].
> Each cpuidle governor in Linux select which idle state to enter based on
> this CPU resume latency in their idle task.
> 
> The per-CPU PM QoS API can be used to control this CPU's idle state
> selection and limit just enter the shallowest idle state to low the delay
> when wake up from idle state by setting strict resume latency (zero value).
> 
> [1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us

Applied, thanks.




^ permalink raw reply	[relevance 0%]

* [RESEND PATCH v15 1/3] power: introduce PM QoS API on CPU wide
  2024-11-11  9:14  4% ` [RESEND PATCH " Huisong Li
@ 2024-11-11  9:14  5%   ` Huisong Li
  0 siblings, 0 replies; 200+ results
From: Huisong Li @ 2024-11-11  9:14 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some service are delay sensitive and very except the low
resume time, like interrupt packet receiving mode.

And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Each cpuidle governor in Linux select which idle state to enter
based on this CPU resume latency in their idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and limit just enter the shallowest idle state to low the delay
when wake up from by setting strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 1ebab77ee9..ecae6b46ef 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used to set and get the resume latency limit on the cpuX for
+userspace. Each cpuidle governor in Linux select which idle state to enter
+based on this CPU resume latency in their idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some service are latency sensitive and very except the low
+resume time, like interrupt packet receiving mode.
+
+Applications can set and get the CPU resume latency by the
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) by
+the ``rte_power_qos_set_cpu_resume_latency()`` to low the resume latency and
+get better performance (instead, the power consumption of platform may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 543becba28..187e6823d7 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -276,6 +276,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add per-CPU PM QoS interface to low the resume latency when wake up from
+    idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 4f4dc19687..313aaa6701 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -16,6 +16,7 @@ sources = files(
         'rte_power_cpufreq.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+        'rte_power_qos.c',
 )
 headers = files(
         'power_cpufreq.h',
@@ -24,6 +25,7 @@ headers = files(
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+        'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..4dd0532b36
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than and equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different input string.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different output string.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> the resume latency is the actual value in used for other string.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..7a8dab9272
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the PM QoS on CPU wide in the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some service are delay sensitive and very except the
+ * low resume time, like interrupt packet receiving mode.
+ *
+ * In these case, per-CPU PM QoS API can be used to control this CPU's idle
+ * state selection and limit just enter the shallowest idle state to low the
+ * delay after sleep by setting strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE		0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT	INT32_MAX
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than and equal to zero in microseconds unit.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if don't set it.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index f442329bbc..920c8e79b3 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,6 +51,10 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
 
 INTERNAL {
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* [RESEND PATCH v15 0/3] power: introduce PM QoS interface
                     ` (4 preceding siblings ...)
  2024-11-11  2:25  4% ` [PATCH v15 " Huisong Li
@ 2024-11-11  9:14  4% ` Huisong Li
  2024-11-11  9:14  5%   ` [RESEND PATCH v15 1/3] power: introduce PM QoS API on CPU wide Huisong Li
  5 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-11-11  9:14 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and really expect a low
resume time, such as interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Please see the description in the kernel documentation[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and restrict it to the shallowest idle state, lowering the delay
when waking up from an idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
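The userspace contract described above can be illustrated with a small, hedged C sketch. It uses a scratch file in place of the real sysfs node (which requires root and a cpuidle-enabled kernel); the path and helper name are illustrative only. Writing "n/a" requests a strict (zero) resume latency, "0" removes the constraint, and any other number is a microsecond limit.

```c
/* Minimal sketch, NOT DPDK code: poke a stand-in for
 * /sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us
 * the way a userspace tool would. */
#include <stdio.h>
#include <string.h>

static int
write_then_read(const char *path, const char *val, char *out, size_t len)
{
	FILE *f = fopen(path, "w");

	if (f == NULL)
		return -1;
	fputs(val, f);	/* "n/a", "0" or a microsecond value */
	fclose(f);

	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(out, (int)len, f) == NULL)
		out[0] = '\0';
	fclose(f);
	return 0;
}
```

Against the real sysfs node the same open/write/read sequence applies, only the path differs.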

---
 v15:
  - fix conflicts due to the newly merged patches that rework the power lib.
  - add Acked-by: Konstantin Ananyev for patch 3/3.
 v14:
  - use parse_uint to parse --cpu-resume-latency instead of adding a new
    parse_int()
 v13:
  - not allow negative value for --cpu-resume-latency.
  - restore to the original value as Konstantin suggested.
 v12:
  - add Acked-by Chengwen and Konstantin
  - fix overflow issue in l3fwd-power when parse command line
  - add a command parameter to set CPU resume latency
 v11:
  - operate the cpu id the lcore mapped by the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove a dead code rte_lcore_is_enabled in patch[2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments based on Stephen's feedback
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide

Huisong Li (3):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: fix data overflow when parse command line
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst           |  19 +++
 doc/guides/rel_notes/release_24_11.rst        |   5 +
 .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
 examples/l3fwd-power/main.c                   |  96 +++++++++++---
 lib/power/meson.build                         |   2 +
 lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
 lib/power/rte_power_qos.h                     |  73 +++++++++++
 lib/power/version.map                         |   4 +
 8 files changed, 306 insertions(+), 21 deletions(-)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* RE: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search
  @ 2024-11-11  5:46  3%   ` Jerin Jacob
  2024-11-13  7:35  5%   ` [PATCH v3 1/2] " Huichao Cai
  1 sibling, 0 replies; 200+ results
From: Jerin Jacob @ 2024-11-11  5:46 UTC (permalink / raw)
  To: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, david.marchand, Thomas Monjalon
  Cc: dev



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Monday, November 11, 2024 9:33 AM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org; Huichao cai <chcchc88@163.com>
> Subject: [EXTERNAL] [PATCH v2] graph: mcore: optimize graph search
> 
> From: Huichao cai <chcchc88@163.com>
> 
> In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
> a slow loop is used to search for the graph. Modify the search logic to
> record the result of the first search, and use this record for subsequent
> searches to improve search speed.
> 
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
>  lib/graph/rte_graph_model_mcore_dispatch.c | 11 +++++++----
>  lib/graph/rte_graph_worker_common.h        |  1 +
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/graph/rte_graph_model_mcore_dispatch.c
> b/lib/graph/rte_graph_model_mcore_dispatch.c
> index a590fc9..a81d338 100644
> --- a/lib/graph/rte_graph_model_mcore_dispatch.c
> +++ b/lib/graph/rte_graph_model_mcore_dispatch.c
> @@ -118,11 +118,14 @@
>  					      struct rte_graph_rq_head *rq)  {
>  	const unsigned int lcore_id = node->dispatch.lcore_id;
> -	struct rte_graph *graph;
> +	struct rte_graph *graph = node->dispatch.graph;
> 
> -	SLIST_FOREACH(graph, rq, next)
> -		if (graph->dispatch.lcore_id == lcore_id)
> -			break;
> +	if (unlikely((!graph) || (graph->dispatch.lcore_id != lcore_id))) {
> +		SLIST_FOREACH(graph, rq, next)
> +			if (graph->dispatch.lcore_id == lcore_id)
> +				break;
> +		node->dispatch.graph = graph;
> +	}
> 
>  	return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false;  } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> +			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */

We need to conclude the ABI-related discussion here before making this change
 https://patches.dpdk.org/project/dpdk/patch/1730966682-2632-1-git-send-email-chcchc88@163.com/

>  		} dispatch;
>  	};
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1


^ permalink raw reply	[relevance 3%]
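The caching idea discussed in this patch can be sketched in a self-contained way: remember the graph found for a given lcore on the first SLIST walk and reuse it while the cached entry still matches. The names below are illustrative, not the real `rte_graph` types.

```c
/* Hypothetical sketch of memoizing an SLIST search, mirroring the
 * change to __rte_graph_mcore_dispatch_sched_node_enqueue. */
#include <stddef.h>
#include <sys/queue.h>

struct graph {
	unsigned int lcore_id;
	SLIST_ENTRY(graph) next;
};
SLIST_HEAD(graph_head, graph);

struct node_cache {
	struct graph *cached;	/* result of the first successful search */
	unsigned int lookups;	/* how many times the slow walk ran */
};

static struct graph *
find_graph(struct node_cache *c, struct graph_head *rq, unsigned int lcore_id)
{
	struct graph *g = c->cached;

	if (g == NULL || g->lcore_id != lcore_id) {
		c->lookups++;	/* slow path: walk the whole list */
		SLIST_FOREACH(g, rq, next)
			if (g->lcore_id == lcore_id)
				break;
		c->cached = g;	/* remember (possibly NULL) for next time */
	}
	return g;
}
```

The fast path then costs one pointer compare instead of a list traversal whenever consecutive enqueues target the same lcore.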

* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08 13:38  0%       ` David Marchand
@ 2024-11-11  5:38  0%         ` Jerin Jacob
  2024-11-12  8:51  0%           ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2024-11-11  5:38 UTC (permalink / raw)
  To: David Marchand
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 8, 2024 7:08 PM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Huichao Cai <chcchc88@163.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com; dev@dpdk.org;
> Thomas Monjalon <thomas@monjalon.net>; Robin Jarry <rjarry@redhat.com>
> Subject: Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
> 
> Hello Jerin,

Hello David,

> 
> On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > > Is n't breaking the ABI?
> > >
> > > So can't we modify the ABI, or is there any special operation
> > > required to modify the ABI?
> >
> > Only an LTS release (xx.11) can change the ABI, after a deprecation notice
> > has been sent. Looking at the pahole output, one option would be to make
> > dispatch and new semi-fastpath additions like xstat_off min-cache-aligned,
> > to make room for future expansion and to ensure better performance.
> 
> Adding holes may be a short term solution, but in my opinion, the slow path
> part should be entirely hidden and we only expose the fp part.

The newly proposed cache-line-aligned items are fastpath items only.

> Reminder, those holes must be in a "known state" as we release v24.11 so that
> the presence of future additions can be safely detected.
> 
> 
> --
> David Marchand


^ permalink raw reply	[relevance 0%]
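Hiding the slow-path state entirely, as suggested in this exchange, is commonly done with an opaque pointer: only the fast-path fields live in the public (ABI-frozen) struct, and everything else sits behind a private definition that can grow freely. A minimal sketch with illustrative names (not the actual `rte_node` layout):

```c
/* Sketch: public fast-path struct + opaque private slow-path state. */
#include <stdint.h>
#include <stdlib.h>

/* Public header: ABI frozen, fast path only. */
struct fp_node {
	uint32_t idx;
	void *priv;	/* opaque: internal layout free to change */
};

/* Library-internal definition: can gain fields without ABI breakage. */
struct fp_node_priv {
	uint64_t total_sched_objs;
	uint64_t total_sched_fail;
};

static int
fp_node_init(struct fp_node *n, uint32_t idx)
{
	n->priv = calloc(1, sizeof(struct fp_node_priv));
	if (n->priv == NULL)
		return -1;
	n->idx = idx;
	return 0;
}
```

The trade-off is one extra pointer dereference when slow-path state is touched, which is acceptable precisely because it is not on the fast path.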

* [PATCH v15 1/3] power: introduce PM QoS API on CPU wide
  2024-11-11  2:25  4% ` [PATCH v15 " Huisong Li
@ 2024-11-11  2:25  5%   ` Huisong Li
  2024-11-11 10:29  0%   ` [PATCH v15 0/3] power: introduce PM QoS interface Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Huisong Li @ 2024-11-11  2:25 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and really expect a low
resume time, such as interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Each cpuidle governor in Linux selects which idle state to enter
based on this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and restrict it to the shallowest idle state, lowering the delay
when waking up from idle, by setting a strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 1ebab77ee9..ecae6b46ef 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used to set and get the resume latency limit on the cpuX for
+userspace. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency sensitive and really expect a low
+resume time, such as interrupt packet receiving mode.
+
+Applications can set and get the CPU resume latency with
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) with
+``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
+get better performance (in return, the power consumption of the platform may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 543becba28..187e6823d7 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -276,6 +276,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add per-CPU PM QoS interface to lower the resume latency when waking up
+    from idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 4f4dc19687..313aaa6701 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -16,6 +16,7 @@ sources = files(
         'rte_power_cpufreq.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+        'rte_power_qos.c',
 )
 headers = files(
         'power_cpufreq.h',
@@ -24,6 +25,7 @@ headers = files(
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+        'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..4dd0532b36
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in the kernel, the meaning
+	 * of the different input strings is as follows.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in the kernel, the meaning
+	 * of the different output strings is as follows.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> for any other string, the resume latency is the actual value in use.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..7a8dab9272
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the PM QoS on CPU wide in the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay sensitive and really expect
+ * a low resume time, such as interrupt packet receiving mode.
+ *
+ * In this case, the per-CPU PM QoS API can be used to control this CPU's idle
+ * state selection and restrict it to the shallowest idle state, lowering the
+ * delay after sleep, by setting a strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE		0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT	INT32_MAX
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero, in microseconds.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in the kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if it has not been set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index f442329bbc..920c8e79b3 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,6 +51,10 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
 
 INTERNAL {
-- 
2.22.0


^ permalink raw reply	[relevance 5%]
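The string-to-value mapping that the patch's get function implements can be shown as a small self-contained sketch (not the DPDK code itself; the constant names only mirror the patch): "n/a" means strict latency (0), "0" means no constraint (INT32_MAX), and anything else is the literal microsecond value.

```c
/* Standalone sketch of the sysfs-string decoding done in
 * rte_power_qos_get_cpu_resume_latency(). */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define QOS_STRICT_LATENCY	0
#define QOS_NO_CONSTRAINT	INT32_MAX

static int
parse_resume_latency(const char *buf)
{
	int latency;

	if (strcmp(buf, "n/a") == 0)
		return QOS_STRICT_LATENCY;	/* shallowest idle state only */
	latency = (int)strtoul(buf, NULL, 10);
	/* "0" in sysfs means "no constraint" at the API level */
	return latency == 0 ? QOS_NO_CONSTRAINT : latency;
}
```

The set path applies the same table in reverse, which is why both functions carry the three-case comment.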

* [PATCH v15 0/3] power: introduce PM QoS interface
                     ` (3 preceding siblings ...)
  2024-10-29 13:28  4% ` [PATCH v14 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-11-11  2:25  4% ` Huisong Li
  2024-11-11  2:25  5%   ` [PATCH v15 1/3] power: introduce PM QoS API on CPU wide Huisong Li
  2024-11-11 10:29  0%   ` [PATCH v15 0/3] power: introduce PM QoS interface Thomas Monjalon
  2024-11-11  9:14  4% ` [RESEND PATCH " Huisong Li
  5 siblings, 2 replies; 200+ results
From: Huisong Li @ 2024-11-11  2:25 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and really expect a low
resume time, such as interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Please see the description in the kernel documentation[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and restrict it to the shallowest idle state, lowering the delay
when waking up from an idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us

---
 v15:
  - fix conflicts due to the newly merged patches that rework the power lib.
  - add Acked-by: Konstantin Ananyev for patch 3/3.
 v14:
  - use parse_uint to parse --cpu-resume-latency instead of adding a new
    parse_int()
 v13:
  - not allow negative value for --cpu-resume-latency.
  - restore to the original value as Konstantin suggested.
 v12:
  - add Acked-by Chengwen and Konstantin
  - fix overflow issue in l3fwd-power when parse command line
  - add a command parameter to set CPU resume latency
 v11:
  - operate the cpu id the lcore mapped by the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove a dead code rte_lcore_is_enabled in patch[2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments based on Stephen's feedback
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide

Huisong Li (3):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: fix data overflow when parse command line
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst           |  19 +++
 doc/guides/rel_notes/release_24_11.rst        |   5 +
 .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
 examples/l3fwd-power/main.c                   |  96 +++++++++++---
 lib/power/meson.build                         |   2 +
 lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
 lib/power/rte_power_qos.h                     |  73 +++++++++++
 lib/power/version.map                         |   4 +
 8 files changed, 306 insertions(+), 21 deletions(-)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* Re: [PATCH 0/2] gpudev: annotate memory allocation
  @ 2024-11-09  0:22  3% ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-11-09  0:22 UTC (permalink / raw)
  To: Elena Agostini; +Cc: dev

On Thu, 17 Oct 2024 15:58:02 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> Use function attributes to catch misuse of GPU memory
> at compile time.
> 
> Stephen Hemminger (2):
>   test-gpudev: avoid use-after-free and free-non-heap warnings
>   gpudev: add malloc annotations to rte_gpu_mem_alloc
> 
>  app/test-gpudev/main.c  | 10 ++++++++-
>  lib/gpudev/rte_gpudev.h | 46 +++++++++++++++++++++--------------------
>  2 files changed, 33 insertions(+), 23 deletions(-)
> 


The problem is that the include checker can't handle this.
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-debug+doc+examples+tests" at step Build and test
####################################################################################
                 from buildtools/chkincs/chkincs.p/gpudev_driver.c:1:
/home/runner/work/dpdk/dpdk/lib/gpudev/rte_gpudev.h:411:9: error: ‘rte_gpu_mem_free’ is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]
  411 |         __rte_malloc __rte_dealloc(rte_gpu_mem_free, 2);
      |         ^~~~~~~~~~~~
/home/runner/work/dpdk/dpdk/lib/gpudev/rte_gpudev.h:380:5: note: declared here
  380 | int rte_gpu_mem_free(int16_t dev_id, void *ptr);
      |     ^~~~~~~~~~~~~~~~

Either the include checker needs to be able to handle experimental symbols,
or maybe it is time for gpudev to be moved out of experimental status for 25.03?


^ permalink raw reply	[relevance 3%]

* RE: [PATCH] config: limit lcore variable maximum size to 4k
  @ 2024-11-08 22:49  3%       ` Morten Brørup
  0 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2024-11-08 22:49 UTC (permalink / raw)
  To: Thomas Monjalon, David Marchand, Mattias Rönnblom
  Cc: dev, Bruce Richardson, Stephen Hemminger, Chengwen Feng,
	Konstantin Ananyev

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, 8 November 2024 23.13

> Let's consider based on the need.
> The lcore variables are new and we don't want it to degrade the DPDK
> footprint,
> at least not in this first version.
> 4 KB is a memory page on common systems,
> it looks reasonable and big enough for a "variable".
> 
> Applied, thanks.

Changing this breaks the API/ABI.

I consider the 4 KB patch a temporary fix, only to make progress on the release candidate, but not the value to go into the final LTS release.
In other words: I formally NAK this patch for the LTS release, but ACK it for the release candidate.

Let's postpone the discussion until after the release candidate.


^ permalink raw reply	[relevance 3%]

* Re: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08 12:22  3%     ` Jerin Jacob
@ 2024-11-08 13:38  0%       ` David Marchand
  2024-11-11  5:38  0%         ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2024-11-08 13:38 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Huichao Cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163, dev, Thomas Monjalon, Robin Jarry

Hello Jerin,

On Fri, Nov 8, 2024 at 1:22 PM Jerin Jacob <jerinj@marvell.com> wrote:
> > > Is n't breaking the ABI?
> >
> > So can't we modify the ABI, or is there any special operation required to modify
> > the ABI?
>
> Only an LTS release (xx.11) can change the ABI, after a deprecation notice has been sent.
> Looking at the pahole output, one option would be to make dispatch and new semi-fastpath
> additions like xstat_off min-cache-aligned, to make room for future expansion and
> to ensure better performance.

Adding holes may be a short term solution, but in my opinion, the slow
path part should be entirely hidden and we only expose the fp part.
Reminder, those holes must be in a "known state" as we release v24.11
so that the presence of future additions can be safely detected.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* RE: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-08  1:39  4%   ` Huichao Cai
@ 2024-11-08 12:22  3%     ` Jerin Jacob
  2024-11-08 13:38  0%       ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2024-11-08 12:22 UTC (permalink / raw)
  To: Huichao Cai
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163,
	dev, Thomas Monjalon, david.marchand, Robin Jarry



> -----Original Message-----
> From: Huichao Cai <chcchc88@163.com>
> Sent: Friday, November 8, 2024 7:10 AM
> To: Jerin Jacob <jerinj@marvell.com>
> Cc: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; Nithin Kumar
> Dabilpuram <ndabilpuram@marvell.com>; yanzhirun_163@163.com;
> dev@dpdk.org
> Subject: Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when
> scheduling nodes
> 
> > Isn't this breaking the ABI?
> 
> So can't we modify the ABI, or is there any special operation required to modify
> the ABI?

Only an LTS release (xx.11) can change the ABI, after a deprecation notice has been sent.
Looking at the pahole output, one option would be to make dispatch and new semi-fastpath
additions like xstat_off min-cache-aligned, to make room for future expansion and
to ensure better performance.

For the xstat_off addition, there was a deprecation notice to update rte_node.
If there are no objections, maybe we can try the following in this release, so that
Huichao does not have to wait one more year.


[main] [dpdk.org] $ git diff
diff --git a/lib/graph/rte_graph_worker_common.h b/lib/graph/rte_graph_worker_common.h
index a518af2b2a..ec9a82186d 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -104,6 +104,7 @@ struct __rte_cache_aligned rte_node {
        /** Original process function when pcap is enabled. */
        rte_node_process_t original_process;

+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        union {
                /* Fast schedule area for mcore dispatch model */
                struct {
@@ -112,6 +113,7 @@ struct __rte_cache_aligned rte_node {
                        uint64_t total_sched_fail; /**< Number of scheduled failure. */
                } dispatch;
        };
+       alignas(RTE_CACHE_LINE_MIN_SIZE)
        rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
        /* Fast path area  */
        __extension__ struct __rte_cache_aligned {

^ permalink raw reply	[relevance 3%]
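The effect of the `alignas(RTE_CACHE_LINE_MIN_SIZE)` additions proposed above can be checked mechanically: forcing the dispatch area and `xstat_off` onto their own min-cache-line boundaries leaves padded, known-state room behind each of them for future fields without moving existing offsets. The struct below is an illustrative stand-in, not the real `rte_node`.

```c
/* Sketch: per-member cache-line alignment reserving ABI headroom. */
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_MIN 64

struct demo_node {
	uint64_t slowpath_stuff;	/* existing slow-path fields */

	alignas(CACHE_LINE_MIN) struct {
		unsigned int lcore_id;
		uint64_t total_sched_objs;
		uint64_t total_sched_fail;
		/* padding up to the next line is room for future fields */
	} dispatch;

	alignas(CACHE_LINE_MIN) uint32_t xstat_off;
};

/* The alignment guarantees hold at compile time. */
static_assert(offsetof(struct demo_node, dispatch) % CACHE_LINE_MIN == 0,
	      "dispatch starts on a min cache line");
static_assert(offsetof(struct demo_node, xstat_off) % CACHE_LINE_MIN == 0,
	      "xstat_off starts on a min cache line");
```

Because later fields can only land in the padding, existing member offsets stay fixed, which is what makes the addition ABI-safe once the padding is in a known state.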

* Re:RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  2024-11-07  9:37  3% ` [EXTERNAL] " Jerin Jacob
@ 2024-11-08  1:39  4%   ` Huichao Cai
  2024-11-08 12:22  3%     ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Huichao Cai @ 2024-11-08  1:39 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram, yanzhirun_163, dev

[-- Attachment #1: Type: text/plain, Size: 117 bytes --]

> Isn't this breaking the ABI?

So can't we modify the ABI, or is there any special operation required to modify the ABI?

[-- Attachment #2: Type: text/html, Size: 490 bytes --]

^ permalink raw reply	[relevance 4%]

* RE: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling nodes
  @ 2024-11-07  9:37  3% ` Jerin Jacob
  2024-11-08  1:39  4%   ` Huichao Cai
    1 sibling, 1 reply; 200+ results
From: Jerin Jacob @ 2024-11-07  9:37 UTC (permalink / raw)
  To: Huichao cai, Kiran Kumar Kokkilagadda, Nithin Kumar Dabilpuram,
	yanzhirun_163
  Cc: dev


> -----Original Message-----
> From: Huichao cai <chcchc88@163.com>
> Sent: Thursday, November 7, 2024 1:35 PM
> To: Jerin Jacob <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; yanzhirun_163@163.com
> Cc: dev@dpdk.org
> Subject: [EXTERNAL] [PATCH] graph: optimize graph search when scheduling
> nodes
> 
> In the function __rte_graph_mcore_dispatch_sched_node_enqueue,
> a slow loop is used to search for the graph. Modify the search logic to record the
> result of the first search, and use this record for subsequent searches to
> improve search speed.
> 
> Signed-off-by: Huichao cai <chcchc88@163.com>
> ---
>  	return graph != NULL ? __graph_sched_node_enqueue(node, graph) :
> false;  } diff --git a/lib/graph/rte_graph_worker_common.h
> b/lib/graph/rte_graph_worker_common.h
> index a518af2..4c2432b 100644
> --- a/lib/graph/rte_graph_worker_common.h
> +++ b/lib/graph/rte_graph_worker_common.h
> @@ -110,6 +110,7 @@ struct __rte_cache_aligned rte_node {
>  			unsigned int lcore_id;  /**< Node running lcore. */
>  			uint64_t total_sched_objs; /**< Number of objects
> scheduled. */
>  			uint64_t total_sched_fail; /**< Number of scheduled
> failure. */
> +			struct rte_graph *graph;  /**< Graph corresponding to
> lcore_id. */

Isn't this breaking the ABI?

Also, please change the commit title as follows for mcore-specific changes:

graph: mcore: ...

>  		} dispatch;
>  	};
>  	rte_graph_off_t xstat_off; /**< Offset to xstat counters. */
> --
> 1.8.3.1
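The optimisation described in the quoted commit message (cache the graph found by the first linear search and reuse it on later calls) can be sketched in plain C. The structure and function names below are illustrative stand-ins, not the DPDK graph API:

```c
#include <assert.h>
#include <stddef.h>

struct graph { unsigned int lcore_id; };

struct node {
	unsigned int lcore_id;
	struct graph *cached_graph; /* result of the first search, reused later */
};

/* Slow path: linear scan for the graph bound to the node's lcore. */
static struct graph *
graph_find_slow(const struct node *n, struct graph *graphs, size_t nb)
{
	for (size_t i = 0; i < nb; i++)
		if (graphs[i].lcore_id == n->lcore_id)
			return &graphs[i];
	return NULL;
}

/* Fast path: serve from the cached pointer when it is still valid,
 * fall back to the scan (and refresh the cache) otherwise. */
static struct graph *
graph_find(struct node *n, struct graph *graphs, size_t nb)
{
	if (n->cached_graph != NULL &&
			n->cached_graph->lcore_id == n->lcore_id)
		return n->cached_graph;
	n->cached_graph = graph_find_slow(n, graphs, nb);
	return n->cached_graph;
}
```

The new `struct rte_graph *graph` member added to `rte_node` in the patch plays the role of `cached_graph` here, which is why the review focuses on whether the added field changes the structure layout and hence the ABI.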


^ permalink raw reply	[relevance 3%]

* RE: [EXTERNAL] [PATCH v8 1/3] cryptodev: add ec points to sm2 op
  2024-11-06 10:08  0%   ` [EXTERNAL] " Akhil Goyal
@ 2024-11-06 15:17  0%     ` Kusztal, ArkadiuszX
  0 siblings, 0 replies; 200+ results
From: Kusztal, ArkadiuszX @ 2024-11-06 15:17 UTC (permalink / raw)
  To: Akhil Goyal, dev; +Cc: Dooley, Brian

Hi Akhil,

> -----Original Message-----
> From: Akhil Goyal <gakhil@marvell.com>
> Sent: Wednesday, November 6, 2024 11:09 AM
> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; dev@dpdk.org
> Cc: Dooley, Brian <brian.dooley@intel.com>
> Subject: RE: [EXTERNAL] [PATCH v8 1/3] cryptodev: add ec points to sm2 op
> 
> > In the case when PMD cannot support the full process of the SM2, but
> > elliptic curve computation only, additional fields are needed to
> > handle such a case.
> >
> > Points C1, kP therefore were added to the SM2 crypto operation struct.
> >
> > Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> > ---
> 
> Please rebase. CI failed to apply patch.
> Please be proactive to fix CI issues if reported.

I have deferred the whole patchset, no further action is necessary.

> 
> >  doc/guides/rel_notes/release_24_11.rst |  3 ++
> >  lib/cryptodev/rte_crypto_asym.h        | 56 +++++++++++++++++++-------
> >  2 files changed, 45 insertions(+), 14 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_24_11.rst
> > b/doc/guides/rel_notes/release_24_11.rst
> > index 53a5ffebe5..ee9e2cea3c 100644
> > --- a/doc/guides/rel_notes/release_24_11.rst
> > +++ b/doc/guides/rel_notes/release_24_11.rst
> > @@ -413,6 +413,9 @@ ABI Changes
> >    added new structure ``rte_node_xstats`` to ``rte_node_register`` and
> >    added ``xstat_off`` to ``rte_node``.
> >
> > +* cryptodev: The ``rte_crypto_sm2_op_param`` struct member to hold
> > ciphertext
> > +  is changed to union data type. This change is to support partial SM2
> calculation.
> > +
> >
> >  Known Issues
> >  ------------
> > diff --git a/lib/cryptodev/rte_crypto_asym.h
> > b/lib/cryptodev/rte_crypto_asym.h index aeb46e688e..f095cebcd0 100644
> > --- a/lib/cryptodev/rte_crypto_asym.h
> > +++ b/lib/cryptodev/rte_crypto_asym.h
> > @@ -646,6 +646,8 @@ enum rte_crypto_sm2_op_capa {
> >  	/**< Random number generator supported in SM2 ops. */
> >  	RTE_CRYPTO_SM2_PH,
> >  	/**< Prehash message before crypto op. */
> > +	RTE_CRYPTO_SM2_PARTIAL,
> > +	/**< Calculate elliptic curve points only. */
> >  };
> >
> >  /**
> > @@ -673,20 +675,46 @@ struct rte_crypto_sm2_op_param {
> >  	 * will be overwritten by the PMD with the decrypted length.
> >  	 */
> >
> > -	rte_crypto_param cipher;
> > -	/**<
> > -	 * Pointer to input data
> > -	 * - to be decrypted for SM2 private decrypt.
> > -	 *
> > -	 * Pointer to output data
> > -	 * - for SM2 public encrypt.
> > -	 * In this case the underlying array should have been allocated
> > -	 * with enough memory to hold ciphertext output (at least X bytes
> > -	 * for prime field curve of N bytes and for message M bytes,
> > -	 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
> > -	 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
> > -	 * be overwritten by the PMD with the encrypted length.
> > -	 */
> > +	union {
> > +		rte_crypto_param cipher;
> > +		/**<
> > +		 * Pointer to input data
> > +		 * - to be decrypted for SM2 private decrypt.
> > +		 *
> > +		 * Pointer to output data
> > +		 * - for SM2 public encrypt.
> > +		 * In this case the underlying array should have been allocated
> > +		 * with enough memory to hold ciphertext output (at least X
> > bytes
> > +		 * for prime field curve of N bytes and for message M bytes,
> > +		 * where X = (C1 || C2 || C3) and computed based on SM2 RFC
> > as
> > +		 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
> > +		 * be overwritten by the PMD with the encrypted length.
> > +		 */
> > +		struct {
> > +			struct rte_crypto_ec_point c1;
> > +			/**<
> > +			 * This field is used only when PMD does not support
> the
> > full
> > +			 * process of the SM2 encryption/decryption, but the
> > elliptic
> > +			 * curve part only.
> > +			 *
> > +			 * In the case of encryption, it is an output - point C1 =
> > (x1,y1).
> > +			 * In the case of decryption, it is an input - point C1 =
> > (x1,y1).
> > +			 *
> > +			 * Must be used along with the
> > RTE_CRYPTO_SM2_PARTIAL flag.
> > +			 */
> > +			struct rte_crypto_ec_point kp;
> > +			/**<
> > +			 * This field is used only when PMD does not support
> the
> > full
> > +			 * process of the SM2 encryption/decryption, but the
> > elliptic
> > +			 * curve part only.
> > +			 *
> > +			 * It is an output in the encryption case, it is a point
> > +			 * [k]P = (x2,y2).
> > +			 *
> > +			 * Must be used along with the
> > RTE_CRYPTO_SM2_PARTIAL flag.
> > +			 */
> > +		};
> > +	};
> >
> >  	rte_crypto_uint id;
> >  	/**< The SM2 id used by signer and verifier. */
> > --
> > 2.34.1


^ permalink raw reply	[relevance 0%]

* RE: [EXTERNAL] [PATCH v8 1/3] cryptodev: add ec points to sm2 op
  2024-11-04  9:36  4% ` [PATCH v8 1/3] cryptodev: " Arkadiusz Kusztal
@ 2024-11-06 10:08  0%   ` Akhil Goyal
  2024-11-06 15:17  0%     ` Kusztal, ArkadiuszX
  0 siblings, 1 reply; 200+ results
From: Akhil Goyal @ 2024-11-06 10:08 UTC (permalink / raw)
  To: Arkadiusz Kusztal, dev; +Cc: brian.dooley

> In the case when PMD cannot support the full process of the SM2,
> but elliptic curve computation only, additional fields
> are needed to handle such a case.
> 
> Points C1, kP therefore were added to the SM2 crypto operation struct.
> 
> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> ---

Please rebase. CI failed to apply patch.
Please be proactive to fix CI issues if reported.

>  doc/guides/rel_notes/release_24_11.rst |  3 ++
>  lib/cryptodev/rte_crypto_asym.h        | 56 +++++++++++++++++++-------
>  2 files changed, 45 insertions(+), 14 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index 53a5ffebe5..ee9e2cea3c 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -413,6 +413,9 @@ ABI Changes
>    added new structure ``rte_node_xstats`` to ``rte_node_register`` and
>    added ``xstat_off`` to ``rte_node``.
> 
> +* cryptodev: The ``rte_crypto_sm2_op_param`` struct member to hold
> ciphertext
> +  is changed to union data type. This change is to support partial SM2 calculation.
> +
> 
>  Known Issues
>  ------------
> diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
> index aeb46e688e..f095cebcd0 100644
> --- a/lib/cryptodev/rte_crypto_asym.h
> +++ b/lib/cryptodev/rte_crypto_asym.h
> @@ -646,6 +646,8 @@ enum rte_crypto_sm2_op_capa {
>  	/**< Random number generator supported in SM2 ops. */
>  	RTE_CRYPTO_SM2_PH,
>  	/**< Prehash message before crypto op. */
> +	RTE_CRYPTO_SM2_PARTIAL,
> +	/**< Calculate elliptic curve points only. */
>  };
> 
>  /**
> @@ -673,20 +675,46 @@ struct rte_crypto_sm2_op_param {
>  	 * will be overwritten by the PMD with the decrypted length.
>  	 */
> 
> -	rte_crypto_param cipher;
> -	/**<
> -	 * Pointer to input data
> -	 * - to be decrypted for SM2 private decrypt.
> -	 *
> -	 * Pointer to output data
> -	 * - for SM2 public encrypt.
> -	 * In this case the underlying array should have been allocated
> -	 * with enough memory to hold ciphertext output (at least X bytes
> -	 * for prime field curve of N bytes and for message M bytes,
> -	 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
> -	 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
> -	 * be overwritten by the PMD with the encrypted length.
> -	 */
> +	union {
> +		rte_crypto_param cipher;
> +		/**<
> +		 * Pointer to input data
> +		 * - to be decrypted for SM2 private decrypt.
> +		 *
> +		 * Pointer to output data
> +		 * - for SM2 public encrypt.
> +		 * In this case the underlying array should have been allocated
> +		 * with enough memory to hold ciphertext output (at least X
> bytes
> +		 * for prime field curve of N bytes and for message M bytes,
> +		 * where X = (C1 || C2 || C3) and computed based on SM2 RFC
> as
> +		 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
> +		 * be overwritten by the PMD with the encrypted length.
> +		 */
> +		struct {
> +			struct rte_crypto_ec_point c1;
> +			/**<
> +			 * This field is used only when PMD does not support the
> full
> +			 * process of the SM2 encryption/decryption, but the
> elliptic
> +			 * curve part only.
> +			 *
> +			 * In the case of encryption, it is an output - point C1 =
> (x1,y1).
> +			 * In the case of decryption, it is an input - point C1 =
> (x1,y1).
> +			 *
> +			 * Must be used along with the
> RTE_CRYPTO_SM2_PARTIAL flag.
> +			 */
> +			struct rte_crypto_ec_point kp;
> +			/**<
> +			 * This field is used only when PMD does not support the
> full
> +			 * process of the SM2 encryption/decryption, but the
> elliptic
> +			 * curve part only.
> +			 *
> +			 * It is an output in the encryption case, it is a point
> +			 * [k]P = (x2,y2).
> +			 *
> +			 * Must be used along with the
> RTE_CRYPTO_SM2_PARTIAL flag.
> +			 */
> +		};
> +	};
> 
>  	rte_crypto_uint id;
>  	/**< The SM2 id used by signer and verifier. */
> --
> 2.34.1


^ permalink raw reply	[relevance 0%]

* [PATCH v8 1/3] cryptodev: add ec points to sm2 op
  2024-11-04  9:36  3% [PATCH v8 0/3] add ec points to sm2 op Arkadiusz Kusztal
@ 2024-11-04  9:36  4% ` Arkadiusz Kusztal
  2024-11-06 10:08  0%   ` [EXTERNAL] " Akhil Goyal
  0 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2024-11-04  9:36 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when a PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.

Points C1, kP therefore were added to the SM2 crypto operation struct.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
 doc/guides/rel_notes/release_24_11.rst |  3 ++
 lib/cryptodev/rte_crypto_asym.h        | 56 +++++++++++++++++++-------
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 53a5ffebe5..ee9e2cea3c 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -413,6 +413,9 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* cryptodev: The ``rte_crypto_sm2_op_param`` struct member to hold ciphertext
+  is changed to union data type. This change is to support partial SM2 calculation.
+
 
 Known Issues
 ------------
diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
index aeb46e688e..f095cebcd0 100644
--- a/lib/cryptodev/rte_crypto_asym.h
+++ b/lib/cryptodev/rte_crypto_asym.h
@@ -646,6 +646,8 @@ enum rte_crypto_sm2_op_capa {
 	/**< Random number generator supported in SM2 ops. */
 	RTE_CRYPTO_SM2_PH,
 	/**< Prehash message before crypto op. */
+	RTE_CRYPTO_SM2_PARTIAL,
+	/**< Calculate elliptic curve points only. */
 };
 
 /**
@@ -673,20 +675,46 @@ struct rte_crypto_sm2_op_param {
 	 * will be overwritten by the PMD with the decrypted length.
 	 */
 
-	rte_crypto_param cipher;
-	/**<
-	 * Pointer to input data
-	 * - to be decrypted for SM2 private decrypt.
-	 *
-	 * Pointer to output data
-	 * - for SM2 public encrypt.
-	 * In this case the underlying array should have been allocated
-	 * with enough memory to hold ciphertext output (at least X bytes
-	 * for prime field curve of N bytes and for message M bytes,
-	 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
-	 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
-	 * be overwritten by the PMD with the encrypted length.
-	 */
+	union {
+		rte_crypto_param cipher;
+		/**<
+		 * Pointer to input data
+		 * - to be decrypted for SM2 private decrypt.
+		 *
+		 * Pointer to output data
+		 * - for SM2 public encrypt.
+		 * In this case the underlying array should have been allocated
+		 * with enough memory to hold ciphertext output (at least X bytes
+		 * for prime field curve of N bytes and for message M bytes,
+		 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
+		 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
+		 * be overwritten by the PMD with the encrypted length.
+		 */
+		struct {
+			struct rte_crypto_ec_point c1;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * In the case of encryption, it is an output - point C1 = (x1,y1).
+			 * In the case of decryption, it is an input - point C1 = (x1,y1).
+			 *
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+			struct rte_crypto_ec_point kp;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * It is an output in the encryption case, it is a point
+			 * [k]P = (x2,y2).
+			 *
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+		};
+	};
 
 	rte_crypto_uint id;
 	/**< The SM2 id used by signer and verifier. */
-- 
2.34.1


^ permalink raw reply	[relevance 4%]

* [PATCH v8 0/3] add ec points to sm2 op
@ 2024-11-04  9:36  3% Arkadiusz Kusztal
  2024-11-04  9:36  4% ` [PATCH v8 1/3] cryptodev: " Arkadiusz Kusztal
  0 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2024-11-04  9:36 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when a PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.
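The layout change this series introduces (the existing `cipher` parameter overlaid with the new C1/kP point pair in a union, so the operation struct is reused rather than grown) can be illustrated with simplified stand-in types. These are not the real `rte_crypto_asym.h` definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for rte_crypto_param and rte_crypto_ec_point. */
struct param { uint8_t *data; size_t length; };
struct ec_point { struct param x; struct param y; };

/* Mirrors the shape of the new rte_crypto_sm2_op_param member: a PMD doing
 * the full SM2 process uses 'cipher'; a PMD advertising the partial
 * (EC-only) capability uses the c1/kp pair instead. */
struct sm2_op {
	union {
		struct param cipher;        /* full encrypt/decrypt */
		struct {
			struct ec_point c1; /* C1 = (x1, y1) */
			struct ec_point kp; /* [k]P = (x2, y2) */
		};
	};
};
```

Because it is a union, `cipher` and `c1` share the same offset; which member is meaningful is decided by the `RTE_CRYPTO_SM2_PARTIAL` capability flag.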

v2:
- rebased against the 24.11 code
v3:
- added feature flag
- added QAT patches
- added test patches
v4:
- replaced feature flag with capability
- split API patches
v5:
- rebased
- clarified usage of the partial flag
v6:
- removed already applied patch 1
- added ABI relase notes comment
- removed camel case
- added flag reference
v7:
- removed SM2 from auth features, in asym it was added in SM2 ECDSA patch
v8:
- fixed an openssl test issue
- added the partial_flag to QAT capabilities

Arkadiusz Kusztal (3):
  cryptodev: add ec points to sm2 op
  crypto/qat: add sm2 encryption/decryption function
  app/test: add test sm2 C1/Kp test cases

 app/test/test_cryptodev_asym.c                | 134 +++++++++++++++++
 app/test/test_cryptodev_sm2_test_vectors.h    | 112 +++++++++++++-
 doc/guides/rel_notes/release_24_11.rst        |   7 +
 .../common/qat/qat_adf/icp_qat_fw_mmp_ids.h   |   3 +
 drivers/common/qat/qat_adf/qat_pke.h          |  20 +++
 drivers/crypto/qat/dev/qat_crypto_pmd_gen4.c  |  72 ++++++++-
 drivers/crypto/qat/qat_asym.c                 | 140 +++++++++++++++++-
 lib/cryptodev/rte_crypto_asym.h               |  56 +++++--
 8 files changed, 520 insertions(+), 24 deletions(-)

-- 
2.34.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH v14 0/3] power: introduce PM QoS interface
  2024-10-29 13:28  4% ` [PATCH v14 0/3] power: introduce PM QoS interface Huisong Li
  2024-10-29 13:28  5%   ` [PATCH v14 1/3] power: introduce PM QoS API on CPU wide Huisong Li
@ 2024-11-04  9:13  0%   ` lihuisong (C)
  1 sibling, 0 replies; 200+ results
From: lihuisong (C) @ 2024-11-04  9:13 UTC (permalink / raw)
  To: dev, ferruh.yigit, thomas
  Cc: mb, anatoly.burakov, david.hunt, sivaprasad.tummala, stephen,
	konstantin.ananyev, david.marchand, fengchengwen, liuyonglong

Hi Ferruh and Thomas,

Kindly ping for merge.


在 2024/10/29 21:28, Huisong Li 写道:
> The deeper the idle state, the lower the power consumption, but the longer
> the resume time. Some services are delay sensitive and expect a low resume
> time, such as the interrupt packet receiving mode.
>
> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> interface is used to set and get the resume latency limit on the cpuX for
> userspace. Please see the description in kernel document[1].
> Each cpuidle governor in Linux select which idle state to enter based on
> this CPU resume latency in their idle task.
>
> The per-CPU PM QoS API can be used to control this CPU's idle state
> selection and limit just enter the shallowest idle state to low the delay
> when wake up from idle state by setting strict resume latency (zero value).
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
>
> ---
>   v14:
>    - use parse_uint to parse --cpu-resume-latency instead of adding a new
>      parse_int()
>   v13:
>    - not allow negative value for --cpu-resume-latency.
>    - restore to the original value as Konstantin suggested.
>   v12:
>    - add Acked-by Chengwen and Konstantin
>    - fix overflow issue in l3fwd-power when parse command line
>    - add a command parameter to set CPU resume latency
>   v11:
>    - operate the cpu id the lcore mapped by the new function
>      power_get_lcore_mapped_cpu_id().
>   v10:
>    - replace LINE_MAX with a custom macro and fix two typos.
>   v9:
>    - move new feature description from release_24_07.rst to release_24_11.rst.
>   v8:
>    - update the latest code to resolve CI warning
>   v7:
>    - remove a dead code rte_lcore_is_enabled in patch[2/2]
>   v6:
>    - update release_24_07.rst based on dpdk repo to resolve CI warning.
>   v5:
>    - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
>   v4:
>    - fix some comments basd on Stephen
>    - add stdint.h include
>    - add Acked-by Morten Brørup <mb@smartsharesystems.com>
>   v3:
>    - add RTE_POWER_xxx prefix for some macro in header
>    - add the check for lcore_id with rte_lcore_is_enabled
>   v2:
>    - use PM QoS on CPU wide to replace the one on system wide
>
>
> Huisong Li (3):
>    power: introduce PM QoS API on CPU wide
>    examples/l3fwd-power: fix data overflow when parse command line
>    examples/l3fwd-power: add PM QoS configuration
>
>   doc/guides/prog_guide/power_man.rst           |  19 +++
>   doc/guides/rel_notes/release_24_11.rst        |   5 +
>   .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
>   examples/l3fwd-power/main.c                   |  96 +++++++++++---
>   lib/power/meson.build                         |   2 +
>   lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
>   lib/power/rte_power_qos.h                     |  73 +++++++++++
>   lib/power/version.map                         |   4 +
>   8 files changed, 306 insertions(+), 21 deletions(-)
>   create mode 100644 lib/power/rte_power_qos.c
>   create mode 100644 lib/power/rte_power_qos.h
>
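The kernel interface the cover letter describes is a plain per-CPU sysfs file; writing 0 to it requests the strict (shallowest-idle-state) behaviour. A minimal sketch of building that path, assuming the layout quoted above (the helper name is hypothetical; the rte_power_qos library added by this series performs the actual open/write on this file):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A strict resume latency of 0 tells cpuidle to pick the shallowest state. */
#define RESUME_LATENCY_STRICT 0

/* Build the per-CPU PM QoS sysfs path described in the cover letter.
 * Hypothetical helper: returns the number of characters written. */
static int
pm_qos_resume_latency_path(char *buf, size_t len, unsigned int cpu)
{
	return snprintf(buf, len,
		"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us",
		cpu);
}
```

Note that the library must map an lcore to its CPU id first (the cover letter's `power_get_lcore_mapped_cpu_id()`), since lcore and CPU numbering can differ.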

^ permalink raw reply	[relevance 0%]

* [PATCH v4 4/4] net/nfp: add LED support
  @ 2024-11-04  1:34  6%       ` Chaoyong He
  0 siblings, 0 replies; 200+ results
From: Chaoyong He @ 2024-11-04  1:34 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Chaoyong He, James Hershaw

Implement the necessary functions to allow user to visually identify a
physical port associated with a netdev by blinking an LED on that port.

Signed-off-by: James Hershaw <james.hershaw@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
---
 doc/guides/nics/features/nfp.ini              |  1 +
 .../net/nfp/flower/nfp_flower_representor.c   | 30 ++++++++++++++++
 drivers/net/nfp/nfp_ethdev.c                  |  2 ++
 drivers/net/nfp/nfp_net_common.c              | 32 +++++++++++++++++
 drivers/net/nfp/nfp_net_common.h              |  2 ++
 drivers/net/nfp/nfpcore/nfp_nsp.h             |  1 +
 drivers/net/nfp/nfpcore/nfp_nsp_eth.c         | 36 +++++++++++++++++++
 7 files changed, 104 insertions(+)

diff --git a/doc/guides/nics/features/nfp.ini b/doc/guides/nics/features/nfp.ini
index 5303b3abf5..124663ae00 100644
--- a/doc/guides/nics/features/nfp.ini
+++ b/doc/guides/nics/features/nfp.ini
@@ -27,6 +27,7 @@ Basic stats          = Y
 Stats per queue      = Y
 EEPROM dump          = Y
 Module EEPROM dump   = Y
+LED                  = Y
 Linux                = Y
 Multiprocess aware   = Y
 x86-64               = Y
diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
index 04536ce15f..4017f602a2 100644
--- a/drivers/net/nfp/flower/nfp_flower_representor.c
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -88,6 +88,30 @@ nfp_repr_get_module_eeprom(struct rte_eth_dev *dev,
 	return nfp_net_get_module_eeprom(dev, info);
 }
 
+static int
+nfp_flower_repr_led_on(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_on(dev);
+}
+
+static int
+nfp_flower_repr_led_off(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_off(dev);
+}
+
 static int
 nfp_flower_repr_link_update(struct rte_eth_dev *dev,
 		__rte_unused int wait_to_complete)
@@ -623,6 +647,9 @@ static const struct eth_dev_ops nfp_flower_multiple_pf_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
@@ -661,6 +688,9 @@ static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static uint32_t
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 2ee76d309c..f54483822f 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -983,6 +983,8 @@ static const struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.set_eeprom             = nfp_net_set_eeprom,
 	.get_module_info        = nfp_net_get_module_info,
 	.get_module_eeprom      = nfp_net_get_module_eeprom,
+	.dev_led_on             = nfp_net_led_on,
+	.dev_led_off            = nfp_net_led_off,
 };
 
 static inline void
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index a45837353a..e68ce68229 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -3181,3 +3181,35 @@ nfp_net_get_module_eeprom(struct rte_eth_dev *dev,
 	nfp_nsp_close(nsp);
 	return ret;
 }
+
+static int
+nfp_net_led_control(struct rte_eth_dev *dev,
+		bool is_on)
+{
+	int ret;
+	uint32_t nfp_idx;
+	struct nfp_net_hw_priv *hw_priv;
+
+	hw_priv = dev->process_private;
+	nfp_idx = nfp_net_get_nfp_index(dev);
+
+	ret = nfp_eth_set_idmode(hw_priv->pf_dev->cpp, nfp_idx, is_on);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Set nfp idmode failed.");
+		return ret;
+	}
+
+	return 0;
+}
+
+int
+nfp_net_led_on(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, true);
+}
+
+int
+nfp_net_led_off(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, false);
+}
diff --git a/drivers/net/nfp/nfp_net_common.h b/drivers/net/nfp/nfp_net_common.h
index 5ad698cad2..d85a00a75e 100644
--- a/drivers/net/nfp/nfp_net_common.h
+++ b/drivers/net/nfp/nfp_net_common.h
@@ -399,6 +399,8 @@ int nfp_net_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eepr
 int nfp_net_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eeprom);
 int nfp_net_get_module_info(struct rte_eth_dev *dev, struct rte_eth_dev_module_info *info);
 int nfp_net_get_module_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info);
+int nfp_net_led_on(struct rte_eth_dev *dev);
+int nfp_net_led_off(struct rte_eth_dev *dev);
 
 #define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
 	((struct nfp_app_fw_nic *)app_fw_priv)
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp.h b/drivers/net/nfp/nfpcore/nfp_nsp.h
index 0ae10dabfb..6230a84e34 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp.h
+++ b/drivers/net/nfp/nfpcore/nfp_nsp.h
@@ -216,6 +216,7 @@ int nfp_eth_set_speed(struct nfp_nsp *nsp, uint32_t speed);
 int nfp_eth_set_split(struct nfp_nsp *nsp, uint32_t lanes);
 int nfp_eth_set_tx_pause(struct nfp_nsp *nsp, bool tx_pause);
 int nfp_eth_set_rx_pause(struct nfp_nsp *nsp, bool rx_pause);
+int nfp_eth_set_idmode(struct nfp_cpp *cpp, uint32_t idx, bool is_on);
 
 /* NSP static information */
 struct nfp_nsp_identify {
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
index 1fcd54656a..404690d05f 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
+++ b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
@@ -44,6 +44,7 @@
 #define NSP_ETH_CTRL_SET_LANES          RTE_BIT64(5)
 #define NSP_ETH_CTRL_SET_ANEG           RTE_BIT64(6)
 #define NSP_ETH_CTRL_SET_FEC            RTE_BIT64(7)
+#define NSP_ETH_CTRL_SET_IDMODE         RTE_BIT64(8)
 #define NSP_ETH_CTRL_SET_TX_PAUSE       RTE_BIT64(10)
 #define NSP_ETH_CTRL_SET_RX_PAUSE       RTE_BIT64(11)
 
@@ -736,3 +737,38 @@ nfp_eth_set_rx_pause(struct nfp_nsp *nsp,
 	return NFP_ETH_SET_BIT_CONFIG(nsp, NSP_ETH_RAW_STATE,
 			NSP_ETH_STATE_RX_PAUSE, rx_pause, NSP_ETH_CTRL_SET_RX_PAUSE);
 }
+
+int
+nfp_eth_set_idmode(struct nfp_cpp *cpp,
+		uint32_t idx,
+		bool is_on)
+{
+	uint64_t reg;
+	struct nfp_nsp *nsp;
+	union eth_table_entry *entries;
+
+	nsp = nfp_eth_config_start(cpp, idx);
+	if (nsp == NULL)
+		return -EIO;
+
+	/*
+	 * Older ABI versions did support this feature, however this has only
+	 * been reliable since ABI 32.
+	 */
+	if (nfp_nsp_get_abi_ver_minor(nsp) < 32) {
+		PMD_DRV_LOG(ERR, "Operation only supported on ABI 32 or newer.");
+		nfp_eth_config_cleanup_end(nsp);
+		return -ENOTSUP;
+	}
+
+	entries = nfp_nsp_config_entries(nsp);
+
+	reg = rte_le_to_cpu_64(entries[idx].control);
+	reg &= ~NSP_ETH_CTRL_SET_IDMODE;
+	reg |= FIELD_PREP(NSP_ETH_CTRL_SET_IDMODE, is_on);
+	entries[idx].control = rte_cpu_to_le_64(reg);
+
+	nfp_nsp_config_set_modified(nsp, 1);
+
+	return nfp_eth_config_commit_end(nsp);
+}
-- 
2.43.5
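From an application's point of view, the two callbacks this patch registers are reached through the generic ethdev LED API (`rte_eth_led_on()` / `rte_eth_led_off()`), which dispatches through the device's ops table. A minimal stand-in of that dispatch (names simplified, not the real ethdev structures):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the ethdev ops table and device. */
struct eth_dev_ops {
	int (*dev_led_on)(void *dev);
	int (*dev_led_off)(void *dev);
};
struct eth_dev {
	const struct eth_dev_ops *ops;
	int led; /* demo state: 1 = identifying (blinking), 0 = off */
};

static int demo_led_on(void *dev)  { ((struct eth_dev *)dev)->led = 1; return 0; }
static int demo_led_off(void *dev) { ((struct eth_dev *)dev)->led = 0; return 0; }

/* Dispatch as the ethdev layer does: report -ENOTSUP when the driver
 * leaves the callback unset. */
static int
eth_led_on(struct eth_dev *dev)
{
	if (dev->ops->dev_led_on == NULL)
		return -ENOTSUP;
	return dev->ops->dev_led_on(dev);
}
```

`nfp_net_led_on()`/`nfp_net_led_off()` above slot into `dev_led_on`/`dev_led_off`, and the representor variants first check `nfp_flower_repr_is_phy()` so only physical-port representors accept the request.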


^ permalink raw reply	[relevance 6%]

* [PATCH v3 4/4] net/nfp: add support for port identify
  @ 2024-11-01  2:57  6%     ` Chaoyong He
    1 sibling, 0 replies; 200+ results
From: Chaoyong He @ 2024-11-01  2:57 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Chaoyong He, James Hershaw

Implement the necessary functions to allow user to visually identify a
physical port associated with a netdev by blinking an LED on that port.

Signed-off-by: James Hershaw <james.hershaw@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
---
 .../net/nfp/flower/nfp_flower_representor.c   | 30 ++++++++++++++++
 drivers/net/nfp/nfp_ethdev.c                  |  2 ++
 drivers/net/nfp/nfp_net_common.c              | 32 +++++++++++++++++
 drivers/net/nfp/nfp_net_common.h              |  2 ++
 drivers/net/nfp/nfpcore/nfp_nsp.h             |  1 +
 drivers/net/nfp/nfpcore/nfp_nsp_eth.c         | 36 +++++++++++++++++++
 6 files changed, 103 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
index 04536ce15f..4017f602a2 100644
--- a/drivers/net/nfp/flower/nfp_flower_representor.c
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -88,6 +88,30 @@ nfp_repr_get_module_eeprom(struct rte_eth_dev *dev,
 	return nfp_net_get_module_eeprom(dev, info);
 }
 
+static int
+nfp_flower_repr_led_on(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_on(dev);
+}
+
+static int
+nfp_flower_repr_led_off(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_off(dev);
+}
+
 static int
 nfp_flower_repr_link_update(struct rte_eth_dev *dev,
 		__rte_unused int wait_to_complete)
@@ -623,6 +647,9 @@ static const struct eth_dev_ops nfp_flower_multiple_pf_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
@@ -661,6 +688,9 @@ static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static uint32_t
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 2ee76d309c..f54483822f 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -983,6 +983,8 @@ static const struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.set_eeprom             = nfp_net_set_eeprom,
 	.get_module_info        = nfp_net_get_module_info,
 	.get_module_eeprom      = nfp_net_get_module_eeprom,
+	.dev_led_on             = nfp_net_led_on,
+	.dev_led_off            = nfp_net_led_off,
 };
 
 static inline void
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index a45837353a..e68ce68229 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -3181,3 +3181,35 @@ nfp_net_get_module_eeprom(struct rte_eth_dev *dev,
 	nfp_nsp_close(nsp);
 	return ret;
 }
+
+static int
+nfp_net_led_control(struct rte_eth_dev *dev,
+		bool is_on)
+{
+	int ret;
+	uint32_t nfp_idx;
+	struct nfp_net_hw_priv *hw_priv;
+
+	hw_priv = dev->process_private;
+	nfp_idx = nfp_net_get_nfp_index(dev);
+
+	ret = nfp_eth_set_idmode(hw_priv->pf_dev->cpp, nfp_idx, is_on);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Set nfp idmode failed.");
+		return ret;
+	}
+
+	return 0;
+}
+
+int
+nfp_net_led_on(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, true);
+}
+
+int
+nfp_net_led_off(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, false);
+}
diff --git a/drivers/net/nfp/nfp_net_common.h b/drivers/net/nfp/nfp_net_common.h
index 5ad698cad2..d85a00a75e 100644
--- a/drivers/net/nfp/nfp_net_common.h
+++ b/drivers/net/nfp/nfp_net_common.h
@@ -399,6 +399,8 @@ int nfp_net_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eepr
 int nfp_net_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eeprom);
 int nfp_net_get_module_info(struct rte_eth_dev *dev, struct rte_eth_dev_module_info *info);
 int nfp_net_get_module_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info);
+int nfp_net_led_on(struct rte_eth_dev *dev);
+int nfp_net_led_off(struct rte_eth_dev *dev);
 
 #define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
 	((struct nfp_app_fw_nic *)app_fw_priv)
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp.h b/drivers/net/nfp/nfpcore/nfp_nsp.h
index 0ae10dabfb..6230a84e34 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp.h
+++ b/drivers/net/nfp/nfpcore/nfp_nsp.h
@@ -216,6 +216,7 @@ int nfp_eth_set_speed(struct nfp_nsp *nsp, uint32_t speed);
 int nfp_eth_set_split(struct nfp_nsp *nsp, uint32_t lanes);
 int nfp_eth_set_tx_pause(struct nfp_nsp *nsp, bool tx_pause);
 int nfp_eth_set_rx_pause(struct nfp_nsp *nsp, bool rx_pause);
+int nfp_eth_set_idmode(struct nfp_cpp *cpp, uint32_t idx, bool is_on);
 
 /* NSP static information */
 struct nfp_nsp_identify {
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
index 1fcd54656a..404690d05f 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
+++ b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
@@ -44,6 +44,7 @@
 #define NSP_ETH_CTRL_SET_LANES          RTE_BIT64(5)
 #define NSP_ETH_CTRL_SET_ANEG           RTE_BIT64(6)
 #define NSP_ETH_CTRL_SET_FEC            RTE_BIT64(7)
+#define NSP_ETH_CTRL_SET_IDMODE         RTE_BIT64(8)
 #define NSP_ETH_CTRL_SET_TX_PAUSE       RTE_BIT64(10)
 #define NSP_ETH_CTRL_SET_RX_PAUSE       RTE_BIT64(11)
 
@@ -736,3 +737,38 @@ nfp_eth_set_rx_pause(struct nfp_nsp *nsp,
 	return NFP_ETH_SET_BIT_CONFIG(nsp, NSP_ETH_RAW_STATE,
 			NSP_ETH_STATE_RX_PAUSE, rx_pause, NSP_ETH_CTRL_SET_RX_PAUSE);
 }
+
+int
+nfp_eth_set_idmode(struct nfp_cpp *cpp,
+		uint32_t idx,
+		bool is_on)
+{
+	uint64_t reg;
+	struct nfp_nsp *nsp;
+	union eth_table_entry *entries;
+
+	nsp = nfp_eth_config_start(cpp, idx);
+	if (nsp == NULL)
+		return -EIO;
+
+	/*
+	 * Older ABI versions did support this feature, however this has only
+	 * been reliable since ABI 32.
+	 */
+	if (nfp_nsp_get_abi_ver_minor(nsp) < 32) {
+		PMD_DRV_LOG(ERR, "Operation only supported on ABI 32 or newer.");
+		nfp_eth_config_cleanup_end(nsp);
+		return -ENOTSUP;
+	}
+
+	entries = nfp_nsp_config_entries(nsp);
+
+	reg = rte_le_to_cpu_64(entries[idx].control);
+	reg &= ~NSP_ETH_CTRL_SET_IDMODE;
+	reg |= FIELD_PREP(NSP_ETH_CTRL_SET_IDMODE, is_on);
+	entries[idx].control = rte_cpu_to_le_64(reg);
+
+	nfp_nsp_config_set_modified(nsp, 1);
+
+	return nfp_eth_config_commit_end(nsp);
+}
-- 
2.43.5


^ permalink raw reply	[relevance 6%]

* [PATCH 4/4] net/nfp: add support for port identify
  @ 2024-10-30  8:27  6%   ` Chaoyong He
    1 sibling, 0 replies; 200+ results
From: Chaoyong He @ 2024-10-30  8:27 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Chaoyong He, James Hershaw

Implement the necessary functions to allow the user to visually identify a
physical port associated with a netdev by blinking an LED on that port.

Signed-off-by: James Hershaw <james.hershaw@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
---
 .../net/nfp/flower/nfp_flower_representor.c   | 30 ++++++++++++++++
 drivers/net/nfp/nfp_ethdev.c                  |  2 ++
 drivers/net/nfp/nfp_net_common.c              | 32 +++++++++++++++++
 drivers/net/nfp/nfp_net_common.h              |  2 ++
 drivers/net/nfp/nfpcore/nfp_nsp.h             |  1 +
 drivers/net/nfp/nfpcore/nfp_nsp_eth.c         | 36 +++++++++++++++++++
 6 files changed, 103 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
index 3d043e052a..01ca8a6768 100644
--- a/drivers/net/nfp/flower/nfp_flower_representor.c
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -88,6 +88,30 @@ nfp_repr_get_module_eeprom(struct rte_eth_dev *dev,
 	return nfp_net_get_module_eeprom(dev, info);
 }
 
+static int
+nfp_flower_repr_led_on(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_on(dev);
+}
+
+static int
+nfp_flower_repr_led_off(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_off(dev);
+}
+
 static int
 nfp_flower_repr_link_update(struct rte_eth_dev *dev,
 		__rte_unused int wait_to_complete)
@@ -623,6 +647,9 @@ static const struct eth_dev_ops nfp_flower_multiple_pf_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
@@ -661,6 +688,9 @@ static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static uint32_t
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 2ee76d309c..f54483822f 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -983,6 +983,8 @@ static const struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.set_eeprom             = nfp_net_set_eeprom,
 	.get_module_info        = nfp_net_get_module_info,
 	.get_module_eeprom      = nfp_net_get_module_eeprom,
+	.dev_led_on             = nfp_net_led_on,
+	.dev_led_off            = nfp_net_led_off,
 };
 
 static inline void
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index a45837353a..e68ce68229 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -3181,3 +3181,35 @@ nfp_net_get_module_eeprom(struct rte_eth_dev *dev,
 	nfp_nsp_close(nsp);
 	return ret;
 }
+
+static int
+nfp_net_led_control(struct rte_eth_dev *dev,
+		bool is_on)
+{
+	int ret;
+	uint32_t nfp_idx;
+	struct nfp_net_hw_priv *hw_priv;
+
+	hw_priv = dev->process_private;
+	nfp_idx = nfp_net_get_nfp_index(dev);
+
+	ret = nfp_eth_set_idmode(hw_priv->pf_dev->cpp, nfp_idx, is_on);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Set nfp idmode failed.");
+		return ret;
+	}
+
+	return 0;
+}
+
+int
+nfp_net_led_on(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, true);
+}
+
+int
+nfp_net_led_off(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, false);
+}
diff --git a/drivers/net/nfp/nfp_net_common.h b/drivers/net/nfp/nfp_net_common.h
index 5ad698cad2..d85a00a75e 100644
--- a/drivers/net/nfp/nfp_net_common.h
+++ b/drivers/net/nfp/nfp_net_common.h
@@ -399,6 +399,8 @@ int nfp_net_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eepr
 int nfp_net_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eeprom);
 int nfp_net_get_module_info(struct rte_eth_dev *dev, struct rte_eth_dev_module_info *info);
 int nfp_net_get_module_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info);
+int nfp_net_led_on(struct rte_eth_dev *dev);
+int nfp_net_led_off(struct rte_eth_dev *dev);
 
 #define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
 	((struct nfp_app_fw_nic *)app_fw_priv)
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp.h b/drivers/net/nfp/nfpcore/nfp_nsp.h
index 0ae10dabfb..6230a84e34 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp.h
+++ b/drivers/net/nfp/nfpcore/nfp_nsp.h
@@ -216,6 +216,7 @@ int nfp_eth_set_speed(struct nfp_nsp *nsp, uint32_t speed);
 int nfp_eth_set_split(struct nfp_nsp *nsp, uint32_t lanes);
 int nfp_eth_set_tx_pause(struct nfp_nsp *nsp, bool tx_pause);
 int nfp_eth_set_rx_pause(struct nfp_nsp *nsp, bool rx_pause);
+int nfp_eth_set_idmode(struct nfp_cpp *cpp, uint32_t idx, bool is_on);
 
 /* NSP static information */
 struct nfp_nsp_identify {
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
index 1fcd54656a..404690d05f 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
+++ b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
@@ -44,6 +44,7 @@
 #define NSP_ETH_CTRL_SET_LANES          RTE_BIT64(5)
 #define NSP_ETH_CTRL_SET_ANEG           RTE_BIT64(6)
 #define NSP_ETH_CTRL_SET_FEC            RTE_BIT64(7)
+#define NSP_ETH_CTRL_SET_IDMODE         RTE_BIT64(8)
 #define NSP_ETH_CTRL_SET_TX_PAUSE       RTE_BIT64(10)
 #define NSP_ETH_CTRL_SET_RX_PAUSE       RTE_BIT64(11)
 
@@ -736,3 +737,38 @@ nfp_eth_set_rx_pause(struct nfp_nsp *nsp,
 	return NFP_ETH_SET_BIT_CONFIG(nsp, NSP_ETH_RAW_STATE,
 			NSP_ETH_STATE_RX_PAUSE, rx_pause, NSP_ETH_CTRL_SET_RX_PAUSE);
 }
+
+int
+nfp_eth_set_idmode(struct nfp_cpp *cpp,
+		uint32_t idx,
+		bool is_on)
+{
+	uint64_t reg;
+	struct nfp_nsp *nsp;
+	union eth_table_entry *entries;
+
+	nsp = nfp_eth_config_start(cpp, idx);
+	if (nsp == NULL)
+		return -EIO;
+
+	/*
+	 * Older ABI versions did support this feature, however this has only
+	 * been reliable since ABI 32.
+	 */
+	if (nfp_nsp_get_abi_ver_minor(nsp) < 32) {
+		PMD_DRV_LOG(ERR, "Operation only supported on ABI 32 or newer.");
+		nfp_eth_config_cleanup_end(nsp);
+		return -ENOTSUP;
+	}
+
+	entries = nfp_nsp_config_entries(nsp);
+
+	reg = rte_le_to_cpu_64(entries[idx].control);
+	reg &= ~NSP_ETH_CTRL_SET_IDMODE;
+	reg |= FIELD_PREP(NSP_ETH_CTRL_SET_IDMODE, is_on);
+	entries[idx].control = rte_cpu_to_le_64(reg);
+
+	nfp_nsp_config_set_modified(nsp, 1);
+
+	return nfp_eth_config_commit_end(nsp);
+}
-- 
2.43.5


^ permalink raw reply	[relevance 6%]

* Re: [PATCH V3 7/7] mlx5: add backward compatibility for RDMA monitor
  2024-10-29 16:26  3%     ` Stephen Hemminger
@ 2024-10-30  8:25  0%       ` Minggang(Gavin) Li
  0 siblings, 0 replies; 200+ results
From: Minggang(Gavin) Li @ 2024-10-30  8:25 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: viacheslavo, matan, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou, dev, rasland


On 10/30/2024 12:26 AM, Stephen Hemminger wrote:
> On Tue, 29 Oct 2024 15:42:56 +0200
> "Minggang Li(Gavin)" <gavinl@nvidia.com> wrote:
>
>>   
>> +* **Updated NVIDIA mlx5 driver.**
>> +
>> +  Optimized port probe in large scale.
>> +  This feature enhances the efficiency of probing VF/SFs on a large scale
>> +  by significantly reducing the probing time. To activate this feature,
>> +  set ``probe_opt_en`` to a non-zero value during device probing. It
>> +  leverages a capability from the RDMA driver, expected to be released in
>> +  the upcoming kernel version 6.13 or its equivalent in OFED 24.10,
>> +  specifically the RDMA monitor. For additional details on the limitations
>> +  of devargs, refer to "doc/guides/nics/mlx5.rst".
>> +
>> +  If there are lots of VFs/SFs to be probed by the application, eg, 300
>> +  VFs/SFs, the option should be enabled to save probing time.
> IMHO the kernel parts have to be available in a released kernel version.
> Otherwise the kernel API/ABI is not stable and there is a possibility of user confusion.
>
> This needs to stay in "awaiting upstream" state until kernel is released
Sorry, it's a typo. The dependent kernel is 6.12, which is in RC. Do you
think we should wait for it to be released before pushing the patch to
DPDK upstream?

^ permalink raw reply	[relevance 0%]

* [PATCH 3/3] net/nfp: add support for port identify
  @ 2024-10-30  8:19  6% ` Chaoyong He
    1 sibling, 0 replies; 200+ results
From: Chaoyong He @ 2024-10-30  8:19 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, Chaoyong He, James Hershaw

Implement the necessary functions to allow the user to visually identify a
physical port associated with a netdev by blinking an LED on that port.

Signed-off-by: James Hershaw <james.hershaw@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
---
 .../net/nfp/flower/nfp_flower_representor.c   | 30 ++++++++++++++++
 drivers/net/nfp/nfp_ethdev.c                  |  2 ++
 drivers/net/nfp/nfp_net_common.c              | 32 +++++++++++++++++
 drivers/net/nfp/nfp_net_common.h              |  2 ++
 drivers/net/nfp/nfpcore/nfp_nsp.h             |  1 +
 drivers/net/nfp/nfpcore/nfp_nsp_eth.c         | 36 +++++++++++++++++++
 6 files changed, 103 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
index 3d043e052a..01ca8a6768 100644
--- a/drivers/net/nfp/flower/nfp_flower_representor.c
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -88,6 +88,30 @@ nfp_repr_get_module_eeprom(struct rte_eth_dev *dev,
 	return nfp_net_get_module_eeprom(dev, info);
 }
 
+static int
+nfp_flower_repr_led_on(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_on(dev);
+}
+
+static int
+nfp_flower_repr_led_off(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = dev->data->dev_private;
+	if (!nfp_flower_repr_is_phy(repr))
+		return -EOPNOTSUPP;
+
+	return nfp_net_led_off(dev);
+}
+
 static int
 nfp_flower_repr_link_update(struct rte_eth_dev *dev,
 		__rte_unused int wait_to_complete)
@@ -623,6 +647,9 @@ static const struct eth_dev_ops nfp_flower_multiple_pf_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
@@ -661,6 +688,9 @@ static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
 	.set_eeprom           = nfp_repr_set_eeprom,
 	.get_module_info      = nfp_repr_get_module_info,
 	.get_module_eeprom    = nfp_repr_get_module_eeprom,
+
+	.dev_led_on           = nfp_flower_repr_led_on,
+	.dev_led_off          = nfp_flower_repr_led_off,
 };
 
 static uint32_t
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 2ee76d309c..f54483822f 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -983,6 +983,8 @@ static const struct eth_dev_ops nfp_net_eth_dev_ops = {
 	.set_eeprom             = nfp_net_set_eeprom,
 	.get_module_info        = nfp_net_get_module_info,
 	.get_module_eeprom      = nfp_net_get_module_eeprom,
+	.dev_led_on             = nfp_net_led_on,
+	.dev_led_off            = nfp_net_led_off,
 };
 
 static inline void
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index a45837353a..e68ce68229 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -3181,3 +3181,35 @@ nfp_net_get_module_eeprom(struct rte_eth_dev *dev,
 	nfp_nsp_close(nsp);
 	return ret;
 }
+
+static int
+nfp_net_led_control(struct rte_eth_dev *dev,
+		bool is_on)
+{
+	int ret;
+	uint32_t nfp_idx;
+	struct nfp_net_hw_priv *hw_priv;
+
+	hw_priv = dev->process_private;
+	nfp_idx = nfp_net_get_nfp_index(dev);
+
+	ret = nfp_eth_set_idmode(hw_priv->pf_dev->cpp, nfp_idx, is_on);
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "Set nfp idmode failed.");
+		return ret;
+	}
+
+	return 0;
+}
+
+int
+nfp_net_led_on(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, true);
+}
+
+int
+nfp_net_led_off(struct rte_eth_dev *dev)
+{
+	return nfp_net_led_control(dev, false);
+}
diff --git a/drivers/net/nfp/nfp_net_common.h b/drivers/net/nfp/nfp_net_common.h
index 5ad698cad2..d85a00a75e 100644
--- a/drivers/net/nfp/nfp_net_common.h
+++ b/drivers/net/nfp/nfp_net_common.h
@@ -399,6 +399,8 @@ int nfp_net_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eepr
 int nfp_net_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *eeprom);
 int nfp_net_get_module_info(struct rte_eth_dev *dev, struct rte_eth_dev_module_info *info);
 int nfp_net_get_module_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info);
+int nfp_net_led_on(struct rte_eth_dev *dev);
+int nfp_net_led_off(struct rte_eth_dev *dev);
 
 #define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
 	((struct nfp_app_fw_nic *)app_fw_priv)
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp.h b/drivers/net/nfp/nfpcore/nfp_nsp.h
index 0ae10dabfb..6230a84e34 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp.h
+++ b/drivers/net/nfp/nfpcore/nfp_nsp.h
@@ -216,6 +216,7 @@ int nfp_eth_set_speed(struct nfp_nsp *nsp, uint32_t speed);
 int nfp_eth_set_split(struct nfp_nsp *nsp, uint32_t lanes);
 int nfp_eth_set_tx_pause(struct nfp_nsp *nsp, bool tx_pause);
 int nfp_eth_set_rx_pause(struct nfp_nsp *nsp, bool rx_pause);
+int nfp_eth_set_idmode(struct nfp_cpp *cpp, uint32_t idx, bool is_on);
 
 /* NSP static information */
 struct nfp_nsp_identify {
diff --git a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
index 1fcd54656a..404690d05f 100644
--- a/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
+++ b/drivers/net/nfp/nfpcore/nfp_nsp_eth.c
@@ -44,6 +44,7 @@
 #define NSP_ETH_CTRL_SET_LANES          RTE_BIT64(5)
 #define NSP_ETH_CTRL_SET_ANEG           RTE_BIT64(6)
 #define NSP_ETH_CTRL_SET_FEC            RTE_BIT64(7)
+#define NSP_ETH_CTRL_SET_IDMODE         RTE_BIT64(8)
 #define NSP_ETH_CTRL_SET_TX_PAUSE       RTE_BIT64(10)
 #define NSP_ETH_CTRL_SET_RX_PAUSE       RTE_BIT64(11)
 
@@ -736,3 +737,38 @@ nfp_eth_set_rx_pause(struct nfp_nsp *nsp,
 	return NFP_ETH_SET_BIT_CONFIG(nsp, NSP_ETH_RAW_STATE,
 			NSP_ETH_STATE_RX_PAUSE, rx_pause, NSP_ETH_CTRL_SET_RX_PAUSE);
 }
+
+int
+nfp_eth_set_idmode(struct nfp_cpp *cpp,
+		uint32_t idx,
+		bool is_on)
+{
+	uint64_t reg;
+	struct nfp_nsp *nsp;
+	union eth_table_entry *entries;
+
+	nsp = nfp_eth_config_start(cpp, idx);
+	if (nsp == NULL)
+		return -EIO;
+
+	/*
+	 * Older ABI versions did support this feature, however this has only
+	 * been reliable since ABI 32.
+	 */
+	if (nfp_nsp_get_abi_ver_minor(nsp) < 32) {
+		PMD_DRV_LOG(ERR, "Operation only supported on ABI 32 or newer.");
+		nfp_eth_config_cleanup_end(nsp);
+		return -ENOTSUP;
+	}
+
+	entries = nfp_nsp_config_entries(nsp);
+
+	reg = rte_le_to_cpu_64(entries[idx].control);
+	reg &= ~NSP_ETH_CTRL_SET_IDMODE;
+	reg |= FIELD_PREP(NSP_ETH_CTRL_SET_IDMODE, is_on);
+	entries[idx].control = rte_cpu_to_le_64(reg);
+
+	nfp_nsp_config_set_modified(nsp, 1);
+
+	return nfp_eth_config_commit_end(nsp);
+}
-- 
2.43.5


^ permalink raw reply	[relevance 6%]

* Re: [PATCH RESEND v7 0/5] app/testpmd: support multiple process attach and detach port
  2024-10-29 22:12  0%         ` Ferruh Yigit
@ 2024-10-30  4:06  0%           ` lihuisong (C)
  0 siblings, 0 replies; 200+ results
From: lihuisong (C) @ 2024-10-30  4:06 UTC (permalink / raw)
  To: Ferruh Yigit, thomas, andrew.rybchenko, Stephen Hemminger
  Cc: dev, fengchengwen, liuyonglong


On 2024/10/30 6:12, Ferruh Yigit wrote:
> On 10/18/2024 3:48 AM, lihuisong (C) wrote:
>> Hi Ferruh,
>>
>> Thanks for your considering again. please see reply inline.
>>
>> 在 2024/10/18 9:04, Ferruh Yigit 写道:
>>> On 10/8/2024 3:32 AM, lihuisong (C) wrote:
>>>> Hi Thomas and Ferruh,
>>>>
>>>> We've discussed it on and off a few times, and we've reached some
>>>> consensus.
>>>> They've been going through more than 2 years😅
>>>> Can you have a look at this series again?
>>>> If we really don't need it, I will drop it from my upstreaming list.
>>>>
>>> Hi Huisong,
>>>
>>> I was not really convinced with the patch series, but did not want to
>>> block it outright, sorry that this caused patch series stay around.
>>>
>>> As checked again, still feels like adding unnecessary complexity, and I
>>> am for rejecting this series.
>>>
>>> Overall target is to be able to support hotplug with primary/secondary
>>> process, and uses event handlers for this but this requires adding a new
>>> ethdev state to be able iterate over devices etc...
>>> Perhaps better way to support this without relying on event handlers.
>> Ignoring the modification of tesptmd is ok to me.
>> But we need to restrict testpmd not to support attach and detach port in
>> multiple process case.
>> Otherwise. these issues this series solved will be encountered.
>>
>> BTW, I want to say the patch [2/5] which introduced
>> RTE_ETH_DEV_ALLOCATED should be thought again.
>> Because it is an real issue in ethdev layer. This is also the fruit that
>> Thomas, you and I discussed before.
>> Please look at this patch again.
>>
> RTE_ETH_DEV_ALLOCATED is added to run RTE_ETH_FOREACH_DEV in the event
> handler, more specifically on the 'RTE_ETH_EVENT_NEW' event handler, right?
Yes
> Without testpmd event handler update, what is the reason/usecase for
> above ethdev change?
Without the testpmd event handler modification, other applications may
also encounter it.
Please take a look at the commit message of patch 2/5 and the
modification in patch 3/5.

>
> Thomas, Andrew, Stephen, please feel free to chime in.
>
>
>> /Huisong
>>>
>>>> /Huisong
>>>>
>>>>
>>>> On 2024/9/29 13:52, Huisong Li wrote:
>>>>> This patchset fix some bugs and support attaching and detaching port
>>>>> in primary and secondary.
>>>>>
>>>>> ---
>>>>>     -v7: fix conflicts
>>>>>     -v6: adjust rte_eth_dev_is_used position based on alphabetical order
>>>>>          in version.map
>>>>>     -v5: move 'ALLOCATED' state to the back of 'REMOVED' to avoid abi
>>>>> break.
>>>>>     -v4: fix a misspelling.
>>>>>     -v3:
>>>>>       #1 merge patch 1/6 and patch 2/6 into patch 1/5, and add
>>>>> modification
>>>>>          for other bus type.
>>>>>       #2 add a RTE_ETH_DEV_ALLOCATED state in rte_eth_dev_state to
>>>>> resolve
>>>>>          the probelm in patch 2/5.
>>>>>     -v2: resend due to CI unexplained failure.
>>>>>
>>>>> Huisong Li (5):
>>>>>      drivers/bus: restore driver assignment at front of probing
>>>>>      ethdev: fix skip valid port in probing callback
>>>>>      app/testpmd: check the validity of the port
>>>>>      app/testpmd: add attach and detach port for multiple process
>>>>>      app/testpmd: stop forwarding in new or destroy event
>>>>>
>>>>>     app/test-pmd/testpmd.c                   | 47 ++++++++++++++
>>>>> +---------
>>>>>     app/test-pmd/testpmd.h                   |  1 -
>>>>>     drivers/bus/auxiliary/auxiliary_common.c |  9 ++++-
>>>>>     drivers/bus/dpaa/dpaa_bus.c              |  9 ++++-
>>>>>     drivers/bus/fslmc/fslmc_bus.c            |  8 +++-
>>>>>     drivers/bus/ifpga/ifpga_bus.c            | 12 ++++--
>>>>>     drivers/bus/pci/pci_common.c             |  9 ++++-
>>>>>     drivers/bus/vdev/vdev.c                  | 10 ++++-
>>>>>     drivers/bus/vmbus/vmbus_common.c         |  9 ++++-
>>>>>     drivers/net/bnxt/bnxt_ethdev.c           |  3 +-
>>>>>     drivers/net/bonding/bonding_testpmd.c    |  1 -
>>>>>     drivers/net/mlx5/mlx5.c                  |  2 +-
>>>>>     lib/ethdev/ethdev_driver.c               | 13 +++++--
>>>>>     lib/ethdev/ethdev_driver.h               | 12 ++++++
>>>>>     lib/ethdev/ethdev_pci.h                  |  2 +-
>>>>>     lib/ethdev/rte_class_eth.c               |  2 +-
>>>>>     lib/ethdev/rte_ethdev.c                  |  4 +-
>>>>>     lib/ethdev/rte_ethdev.h                  |  4 +-
>>>>>     lib/ethdev/version.map                   |  1 +
>>>>>     19 files changed, 114 insertions(+), 44 deletions(-)
>>>>>
>>> .
> .

^ permalink raw reply	[relevance 0%]

* Re: [PATCH RESEND v7 0/5] app/testpmd: support multiple process attach and detach port
    2024-10-26  4:11  0%         ` lihuisong (C)
@ 2024-10-29 22:12  0%         ` Ferruh Yigit
  2024-10-30  4:06  0%           ` lihuisong (C)
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2024-10-29 22:12 UTC (permalink / raw)
  To: lihuisong (C), thomas, andrew.rybchenko, Stephen Hemminger
  Cc: dev, fengchengwen, liuyonglong

On 10/18/2024 3:48 AM, lihuisong (C) wrote:
> Hi Ferruh,
> 
> Thanks for your considering again. please see reply inline.
> 
On 2024/10/18 9:04, Ferruh Yigit wrote:
>> On 10/8/2024 3:32 AM, lihuisong (C) wrote:
>>> Hi Thomas and Ferruh,
>>>
>>> We've discussed it on and off a few times, and we've reached some
>>> consensus.
>>> They've been going through more than 2 years😅
>>> Can you have a look at this series again?
>>> If we really don't need it, I will drop it from my upstreaming list.
>>>
>> Hi Huisong,
>>
>> I was not really convinced with the patch series, but did not want to
>> block it outright, sorry that this caused patch series stay around.
>>
>> As checked again, still feels like adding unnecessary complexity, and I
>> am for rejecting this series.
>>
>> Overall target is to be able to support hotplug with primary/secondary
>> process, and uses event handlers for this but this requires adding a new
>> ethdev state to be able iterate over devices etc...
>> Perhaps better way to support this without relying on event handlers.
> Ignoring the modification of tesptmd is ok to me.
> But we need to restrict testpmd not to support attach and detach port in
> multiple process case.
> Otherwise. these issues this series solved will be encountered.
> 
> BTW, I want to say the patch [2/5] which introduced
> RTE_ETH_DEV_ALLOCATED should be thought again.
> Because it is an real issue in ethdev layer. This is also the fruit that
> Thomas, you and I discussed before.
> Please look at this patch again.
> 

RTE_ETH_DEV_ALLOCATED is added to run RTE_ETH_FOREACH_DEV in the event
handler, more specifically on the 'RTE_ETH_EVENT_NEW' event handler, right?
Without the testpmd event handler update, what is the reason/usecase for
the above ethdev change?

Thomas, Andrew, Stephen, please feel free to chime in.


> /Huisong
>>
>>
>>> /Huisong
>>>
>>>
>>> On 2024/9/29 13:52, Huisong Li wrote:
>>>> This patchset fix some bugs and support attaching and detaching port
>>>> in primary and secondary.
>>>>
>>>> ---
>>>>    -v7: fix conflicts
>>>>    -v6: adjust rte_eth_dev_is_used position based on alphabetical order
>>>>         in version.map
>>>>    -v5: move 'ALLOCATED' state to the back of 'REMOVED' to avoid abi
>>>> break.
>>>>    -v4: fix a misspelling.
>>>>    -v3:
>>>>      #1 merge patch 1/6 and patch 2/6 into patch 1/5, and add
>>>> modification
>>>>         for other bus type.
>>>>      #2 add a RTE_ETH_DEV_ALLOCATED state in rte_eth_dev_state to
>>>> resolve
>>>>         the probelm in patch 2/5.
>>>>    -v2: resend due to CI unexplained failure.
>>>>
>>>> Huisong Li (5):
>>>>     drivers/bus: restore driver assignment at front of probing
>>>>     ethdev: fix skip valid port in probing callback
>>>>     app/testpmd: check the validity of the port
>>>>     app/testpmd: add attach and detach port for multiple process
>>>>     app/testpmd: stop forwarding in new or destroy event
>>>>
>>>>    app/test-pmd/testpmd.c                   | 47 ++++++++++++++
>>>> +---------
>>>>    app/test-pmd/testpmd.h                   |  1 -
>>>>    drivers/bus/auxiliary/auxiliary_common.c |  9 ++++-
>>>>    drivers/bus/dpaa/dpaa_bus.c              |  9 ++++-
>>>>    drivers/bus/fslmc/fslmc_bus.c            |  8 +++-
>>>>    drivers/bus/ifpga/ifpga_bus.c            | 12 ++++--
>>>>    drivers/bus/pci/pci_common.c             |  9 ++++-
>>>>    drivers/bus/vdev/vdev.c                  | 10 ++++-
>>>>    drivers/bus/vmbus/vmbus_common.c         |  9 ++++-
>>>>    drivers/net/bnxt/bnxt_ethdev.c           |  3 +-
>>>>    drivers/net/bonding/bonding_testpmd.c    |  1 -
>>>>    drivers/net/mlx5/mlx5.c                  |  2 +-
>>>>    lib/ethdev/ethdev_driver.c               | 13 +++++--
>>>>    lib/ethdev/ethdev_driver.h               | 12 ++++++
>>>>    lib/ethdev/ethdev_pci.h                  |  2 +-
>>>>    lib/ethdev/rte_class_eth.c               |  2 +-
>>>>    lib/ethdev/rte_ethdev.c                  |  4 +-
>>>>    lib/ethdev/rte_ethdev.h                  |  4 +-
>>>>    lib/ethdev/version.map                   |  1 +
>>>>    19 files changed, 114 insertions(+), 44 deletions(-)
>>>>
>> .


^ permalink raw reply	[relevance 0%]

* Re: release candidate 24.11-rc1
  2024-10-18 21:47  4% release candidate 24.11-rc1 Thomas Monjalon
  2024-10-29 10:19  0% ` Xu, HailinX
@ 2024-10-29 19:31  0% ` Thinh Tran
  1 sibling, 0 replies; 200+ results
From: Thinh Tran @ 2024-10-29 19:31 UTC (permalink / raw)
  To: Thomas Monjalon, dpdk-dev

IBM - Power Systems
DPDK v24.11-rc1-6-g90cb8ff819


* Build CI on Fedora 38,39,40 container images for ppc64le
* Basic PF on Mellanox: No issue found
* Performance: not tested.
* OS: RHEL 9.4  kernel: 5.14.0-427.40.1.el9_4.ppc64le
         with gcc version 11.4.1 20231218 (Red Hat 11.4.1-3) (GCC)
       SLES15 SP5  kernel: 5.14.21-150500.55.49-default
         with gcc version 13.2.1 20230912 (SUSE Linux)

Systems tested:
  - LPARs on IBM Power10 CHRP IBM,9105-22A
     NICs:
     - Mellanox Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
     - firmware version: 26.42.1000
     - MLNX_OFED_LINUX-24.07-0.6.1.5

Thinh Tran

On 10/18/2024 4:47 PM, Thomas Monjalon wrote:
> A new DPDK release candidate is ready for testing:
> 	https://git.dpdk.org/dpdk/tag/?id=v24.11-rc1
> 
> There are 630 new patches in this snapshot,
> including many API/ABI compatibility breakages.
> This release won't be ABI-compatible with previous ones.
> 
> Release notes:
> 	https://doc.dpdk.org/guides/rel_notes/release_24_11.html
> 
> Highlights of 24.11-rc1:
> 	- bit set and atomic bit manipulation
> 	- IPv6 address API
> 	- Ethernet link lanes
> 	- flow table index action
> 	- Cisco enic VF
> 	- Marvell CN20K
> 	- symmetric crypto SM4
> 	- asymmetric crypto EdDSA
> 	- event device pre-scheduling
> 	- event device independent enqueue
> 
> Please test and report issues on bugs.dpdk.org.
> 
> Few more new APIs may be added in -rc2.
> DPDK 24.11-rc2 is expected in more than two weeks (early November).
> 
> Thank you everyone
> 
> 


^ permalink raw reply	[relevance 0%]

* Re: [PATCH V3 7/7] mlx5: add backward compatibility for RDMA monitor
  @ 2024-10-29 16:26  3%     ` Stephen Hemminger
  2024-10-30  8:25  0%       ` Minggang(Gavin) Li
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2024-10-29 16:26 UTC (permalink / raw)
  To: Minggang Li(Gavin)
  Cc: viacheslavo, matan, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou, dev, rasland

On Tue, 29 Oct 2024 15:42:56 +0200
"Minggang Li(Gavin)" <gavinl@nvidia.com> wrote:

>  
> +* **Updated NVIDIA mlx5 driver.**
> +
> +  Optimized port probe in large scale.
> +  This feature enhances the efficiency of probing VF/SFs on a large scale
> +  by significantly reducing the probing time. To activate this feature,
> +  set ``probe_opt_en`` to a non-zero value during device probing. It
> +  leverages a capability from the RDMA driver, expected to be released in
> +  the upcoming kernel version 6.13 or its equivalent in OFED 24.10,
> +  specifically the RDMA monitor. For additional details on the limitations
> +  of devargs, refer to "doc/guides/nics/mlx5.rst".
> +
> +  If there are lots of VFs/SFs to be probed by the application, eg, 300
> +  VFs/SFs, the option should be enabled to save probing time.

IMHO the kernel parts have to be available in a released kernel version.
Otherwise the kernel API/ABI is not stable and there is a possibility of user confusion.

This needs to stay in "awaiting upstream" state until kernel is released

^ permalink raw reply	[relevance 3%]

* [PATCH v14 1/3] power: introduce PM QoS API on CPU wide
  2024-10-29 13:28  4% ` [PATCH v14 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-10-29 13:28  5%   ` Huisong Li
  2024-11-04  9:13  0%   ` [PATCH v14 0/3] power: introduce PM QoS interface lihuisong (C)
  1 sibling, 0 replies; 200+ results
From: Huisong Li @ 2024-10-29 13:28 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some service are delay sensitive and very except the low
resume time, like interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used by userspace to set and get the resume latency limit on
cpuX. Each cpuidle governor in Linux selects which idle state to enter
based on this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection
and restrict it to entering only the shallowest idle state, lowering the
wakeup delay, by setting a strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..91358b04f3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used by userspace to set and get the resume latency limit on
+cpuX. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency sensitive and require a low
+resume time, such as the interrupt packet-receiving mode.
+
+Applications can set and get the CPU resume latency with
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) with
+``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
+get better performance (though the platform's power consumption may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..d9e268274b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -237,6 +237,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add per-CPU PM QoS interface to lower the resume latency when waking up
+    from idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9b5d3e8315 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+	'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+	'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..4dd0532b36
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different input string.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different output string.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> the resume latency is the actual value in used for other string.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..7a8dab9272
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the PM QoS on CPU wide in the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay sensitive and require
+ * a low resume time, such as the interrupt packet-receiving mode.
+ *
+ * In these cases, the per-CPU PM QoS API can be used to control a CPU's
+ * idle state selection and restrict it to the shallowest idle state to
+ * lower the wakeup delay, by setting a strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE		0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT	INT32_MAX
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero, in microseconds.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in the kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if it has not been set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..08f178a39d 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* [PATCH v14 0/3] power: introduce PM QoS interface
                     ` (2 preceding siblings ...)
  2024-10-25  9:18  4% ` [PATCH v13 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-10-29 13:28  4% ` Huisong Li
  2024-10-29 13:28  5%   ` [PATCH v14 1/3] power: introduce PM QoS API on CPU wide Huisong Li
  2024-11-04  9:13  0%   ` [PATCH v14 0/3] power: introduce PM QoS interface lihuisong (C)
  2024-11-11  2:25  4% ` [PATCH v15 " Huisong Li
  2024-11-11  9:14  4% ` [RESEND PATCH " Huisong Li
  5 siblings, 2 replies; 200+ results
From: Huisong Li @ 2024-10-29 13:28 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and require a low
resume time, such as the interrupt packet-receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used by userspace to set and get the resume latency limit on
cpuX. Please see the description in the kernel document[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection
and restrict it to entering only the shallowest idle state, lowering the delay
when waking up from an idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us

---
 v14:
  - use parse_uint to parse --cpu-resume-latency instead of adding a new
    parse_int()
 v13:
  - not allow negative value for --cpu-resume-latency.
  - restore to the original value as Konstantin suggested.
 v12:
  - add Acked-by Chengwen and Konstantin
  - fix overflow issue in l3fwd-power when parse command line
  - add a command parameter to set CPU resume latency
 v11:
  - operate the cpu id the lcore mapped by the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove a dead code rte_lcore_is_enabled in patch[2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments based on Stephen's review
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide


Huisong Li (3):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: fix data overflow when parse command line
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst           |  19 +++
 doc/guides/rel_notes/release_24_11.rst        |   5 +
 .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
 examples/l3fwd-power/main.c                   |  96 +++++++++++---
 lib/power/meson.build                         |   2 +
 lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
 lib/power/rte_power_qos.h                     |  73 +++++++++++
 lib/power/version.map                         |   4 +
 8 files changed, 306 insertions(+), 21 deletions(-)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* RE: release candidate 24.11-rc1
  2024-10-18 21:47  4% release candidate 24.11-rc1 Thomas Monjalon
@ 2024-10-29 10:19  0% ` Xu, HailinX
  2024-10-29 19:31  0% ` Thinh Tran
  1 sibling, 0 replies; 200+ results
From: Xu, HailinX @ 2024-10-29 10:19 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: Kovacevic, Marko, Mcnamara, John, Richardson, Bruce, Ferruh Yigit

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Saturday, October 19, 2024 5:47 AM
> To: announce@dpdk.org
> Subject: release candidate 24.11-rc1
> 
> A new DPDK release candidate is ready for testing:
> 	https://git.dpdk.org/dpdk/tag/?id=v24.11-rc1
> 
> There are 630 new patches in this snapshot, including many API/ABI
> compatibility breakages.
> This release won't be ABI-compatible with previous ones.
> 
> Release notes:
> 	https://doc.dpdk.org/guides/rel_notes/release_24_11.html
> 
> Highlights of 24.11-rc1:
> 	- bit set and atomic bit manipulation
> 	- IPv6 address API
> 	- Ethernet link lanes
> 	- flow table index action
> 	- Cisco enic VF
> 	- Marvell CN20K
> 	- symmetric crypto SM4
> 	- asymmetric crypto EdDSA
> 	- event device pre-scheduling
> 	- event device independent enqueue
> 
> Please test and report issues on bugs.dpdk.org.
> 
> Few more new APIs may be added in -rc2.
> DPDK 24.11-rc2 is expected in more than two weeks (early November).
> 
> Thank you everyone
> 
Update on the test status for the Intel part: all dpdk 24.11-rc1 testing is done. Four new issues were found.

New issues:
1. [dpdk-24.11]ice_fdir/mac_ipv6_udp: match to an irregular message    -> Intel dev is investigating
2. [DPDK-24.11.0-RC1] cryptodev_qat_asym_autotest is failing    -> Intel dev is investigating
3. cpfl_vf_representor_rte_flow/split_queue_mac_ipv4_udp_vf_to_io: rule create failed    -> Intel dev is investigating
4. [DPDK-24.11] E830 200G port can't up when starting testpmd    -> Intel dev is investigating

# Basic Intel(R) NIC testing
* Build or compile:  
 *Build: cover the build test combination with latest GCC/Clang version and the popular OS revision such as Ubuntu24.10, Ubuntu24.04.1, Fedora40, RHEL8.10 RHEL9.4, FreeBSD14.1, SUSE15.6, OpenAnolis8.9, AzureLinux 3.0 etc.
  - All test passed.
 *Compile: cover the CFLAGES(O0/O1/O2/O3) with popular OS such as Ubuntu24.04.1 and RHEL9.4.
  - All test passed with latest dpdk.
* PF/VF(i40e, ixgbe): test scenarios including PF/VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. 
	- All test case is done. No new issue is found.
* PF/VF(ice): test scenarios including Switch features/Package Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc.
	- Execution rate is done. Found issue 1 above.
* CPF/APF(MEV): test scenarios including APF-HOST,CPF-HOST,CPF-ACC,cpfl_rte_flow/MTU/Jumboframe/checksum offload, etc.
	- Execution rate is done. Found issue 3 above.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test, RFC2544 Zero packet loss performance test, etc.
	- Execution rate is done. No new issue is found.
* Power and IPsec: 
 * Power: test scenarios including bi-direction/Telemetry/Empty Poll Lib/Priority Base Frequency, etc. 
	- Execution rate is done. No new issue is found.
 * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - QAT&SW/FIB library, etc.
	- Execution rate is done. No new issue is found. 
# Basic cryptodev and virtio testing
* Virtio: both function and performance test are covered. Such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing/VMAWARE ESXI 8.0U1, etc.
	- Execution rate is done. No new issue is found.
* Cryptodev: 
 *Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
	- Execution rate is done. Found issue 2 above.
 *Performance test: test scenarios including Throughput Performance /Cryptodev Latency, etc.
	- Execution rate is done. No performance drop.

Regards,
Xu, Hailin

^ permalink raw reply	[relevance 0%]

* Re: [EXTERNAL] Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs
  @ 2024-10-28 11:15  3%         ` Dodji Seketeli
  0 siblings, 0 replies; 200+ results
From: Dodji Seketeli @ 2024-10-28 11:15 UTC (permalink / raw)
  To: Akhil Goyal
  Cc: Ferruh Yigit, David Marchand, dev, Dodji Seketeli, thomas,
	hemant.agrawal, Anoob Joseph, pablo.de.lara.guarch, fiona.trahe,
	declan.doherty, matan, g.singh, fanzhang.oss, jianjay.zhou,
	asomalap, ruifeng.wang, konstantin.v.ananyev, radu.nicolau,
	ajit.khaparde, Nagadheeraj Rottela, mdr

Akhil Goyal <gakhil@marvell.com> writes:

>> >>> Now added inline APIs for getting the list end which need to be updated
>> >>> for each new entry to the enum. This shall help in avoiding ABI break
>> >>> for adding new algo.
>> >>>
>> >>
>> >> Hi Akhil,
>> >>
>> >> *I think* this hides the problem instead of fixing it, and this may be
>> >> partially because of the tooling (libabigail) limitation.
>> >>
>> >> This patch prevents the libabigail warning, true, but it doesn't improve
>> >> anything from the application's perspective.
>> >> Before or after this patch, application knows a fixed value as END value.
>> >>
>> >> Not all changes in the END (MAX) enum values cause ABI break, but tool
>> >> warns on all, that is why I think this may be tooling limitation [1].
>> >> (Just to stress, I am NOT talking about regular enum values change, I am
>> >> talking about only END (MAX) value changes caused by appending new enum
>> >> items.)
>> >>
>> >> As far as I can see (please Dodji, David correct me if I am wrong) ABI
>> >> break only happens if application and library exchange enum values in
>> >> the API (directly or within a struct).
>> >
>> > - There is also the following issue:
>> > A library publicly exports an array sized against a END (MAX) enum in the API.
>> > https://developers.redhat.com/blog/2019/05/06/how-c-array-sizes-become-part-of-the-binary-interface-of-a-library
>> >
>> 
>> I see. And Dodji explained this requires source code to detect.
>> 
>> I don't remember seeing a public array whose size is defined by an enum,
>> are you aware of any instance of this usage?
>
> https://patches.dpdk.org/project/dpdk/patch/20241009071151.1106-1-gmuthukrishn@marvell.com/
> This was merged yesterday.

I guess the problematic piece of the code is this:

    diff --git a/lib/cryptodev/rte_cryptodev.h b/lib/cryptodev/rte_cryptodev.h
    index bec947f6d5..aa6ef3a94d 100644
    --- a/lib/cryptodev/rte_cryptodev.h
    +++ b/lib/cryptodev/rte_cryptodev.h
    @@ -185,6 +185,9 @@ struct rte_cryptodev_asymmetric_xform_capability {
              * Value 0 means unavailable, and application should pass the required
              * random value. Otherwise, PMD would internally compute the random number.
              */
    +
    +               uint32_t op_capa[RTE_CRYPTO_ASYM_OP_LIST_END];
    +                       /**< Operation specific capabilities. */
            };


Is it possible for the struct rte_cryptodev_asymmetric_xform_capability
to be made an opaque struct whose definition would be present only in a
.c file of the library?

Its data members would then be retrieved by getter functions that take a
pointer to that struct as a parameter.

That way, the uint32_t op_capa[RTE_CRYPTO_ASYM_OP_LIST_END] data member
would be "private" to the .c file and thus would not be part of the
ABI.  Any change to the RTE_CRYPTO_ASYM_OP enum would then become
harmless to that struct.

I hope this helps.

-- 
		Dodji


^ permalink raw reply	[relevance 3%]

* Re: [EXTERNAL] Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs
  @ 2024-10-28 10:55  4%       ` Dodji Seketeli
  0 siblings, 0 replies; 200+ results
From: Dodji Seketeli @ 2024-10-28 10:55 UTC (permalink / raw)
  To: Akhil Goyal
  Cc: Dodji Seketeli, Ferruh Yigit, dev, thomas, david.marchand,
	hemant.agrawal, Anoob Joseph, pablo.de.lara.guarch, fiona.trahe,
	declan.doherty, matan, g.singh, fanzhang.oss, jianjay.zhou,
	asomalap, ruifeng.wang, konstantin.v.ananyev, radu.nicolau,
	ajit.khaparde, Nagadheeraj Rottela, mdr

Hello,

Akhil Goyal <gakhil@marvell.com> writes:

[...]


>> I believe that if you want to know if an enumerator value is *USED* by a
>> type (which I believe is at the root of what you are alluding to), then
>> you would need a static analysis tool that works at the source level.
>> Or, you need a human review of the code once the binary analysis tool
>> told you that that value of the enumerator changed.
>> 
>> Why ? please let me give you an example:
>> 
>>     enum foo_enum
>>     {
>>      FOO_FIRST,
>>      FOO_SECOND,
>>      FOO_END
>>     };
>> 
>>     int array[FOO_END];
>> 
>> Once this is compiled into binary, what libabigail is going to see by
>> analyzing the binary is that 'array' is an array of 2 integers.  The
>> information about the size of the array being initially an enumerator
>> value is lost.  To detect that, you need source level analysis.
>> 
>> But then, by reviewing the code, this is a construct that you can spot
>> and allow or forbid, depending on your goals as a project.
>> 
> In the above example if in newer library a FOO_THIRD is added.
> FOO_END value will change and will cause ABI break for change in existing value.
> But if we replace it with an inline function to get the list end and use it in the array as below,
> then if FOO_THIRD is added, we will also update the foo_enum_list_end() function to return (FOO_THIRD+1):
>
>      enum foo_enum
>      {
>       FOO_FIRST,
>       FOO_SECOND,
>      };
>      static inline int foo_enum_list_end()
>      {
>             return FOO_SECOND + 1;
>      }
>      int array[foo_enum_list_end()];
>
> Will this cause an ABI break if we add this array in the application or in the library?

I think this (inline function) construct is essentially the same as just
adding a FOO_END enumerator after FOO_SECOND.  Using either
foo_enum_list_end() or FOO_END results in having the value '2' in the
application using the library to get FOO_END or foo_enum_list_end().

Newer versions of the library being linked to the application won't
change that value '2', regardless of the newer values of FOO_END or
foo_enum_list_end().

So, adding a FOO_THIRD right after FOO_END induces an ABI change.

This change being "breaking" (incompatible) or not, really depends on
what the application expects, I would say.  Sorry if that looks "vague",
but this whole concept is quite blurry.

For instance, if you add FOO_THIRD after FOO_SECOND in the newer version
of the library and the application still gets the value '2' rather than
getting the value '3', and that value is actually multiplied by "two
trillions" in the application to get the value of the dividend to be
paid to investors, then that's a very big error induced by that
change.  That might be considered by the application as a "breaking" ABI
change and you might get a call or two from the CEO of an S&P500 company
that uses the library.

Other applications might consider that "off-by-one" error not problematic
at all and thus might consider it not "breaking".

Note that REMOVING an enumerator however is always considered an
incompatible (breaking) ABI change.

Adding an enumerator however has this annoying "grey topic" (not black or
white) property that I am not sure how to address at this point.

Cheers,

-- 
		Dodji


^ permalink raw reply	[relevance 4%]

* Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs
  @ 2024-10-28 10:12  4%       ` Dodji Seketeli
  0 siblings, 0 replies; 200+ results
From: Dodji Seketeli @ 2024-10-28 10:12 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Akhil Goyal, dev, thomas, david.marchand, hemant.agrawal, anoobj,
	pablo.de.lara.guarch, fiona.trahe, declan.doherty, matan,
	g.singh, fanzhang.oss, jianjay.zhou, asomalap, ruifeng.wang,
	konstantin.v.ananyev, radu.nicolau, ajit.khaparde, rnagadheeraj,
	mdr

Hello,

Ferruh Yigit <ferruh.yigit@amd.com> writes:

[...]

>> This change cause the value of the the FOOD_END enumerator to increase.
>> And that increase might be problematic.  At the moment, for it being
>> problematic or not has to be the result of a careful review.
>> 
>
> As you said, a FOO_END value change can sometimes be problematic, but
> sometimes it is not.
> This is what I referred to as a limitation: the tool does not report only
> the problematic cases, but requires human review.

Oh, I see. Thank you for clarifying.

> (btw, this is a very useful tool, I don't want to sound like negative
> about it, only want to address this recurring subject in dpdk.)

No problem, I never assume you mean anything negative :-)

[...]


>> So, by default, abidiff will complain by saying that the value of
>> FOO_END was changed.
>> 
>> But you, as a community of practice, can decide that this kind of change
>> to the value of the last enumerator is not a problematic change, after
>> careful review of your code and practice.  You thus can specify that
>> the tool using a suppression specification which has the following
>> syntax:
>> 
>>     [suppress_type]
>>       type_kind = enum
>>       changed_enumerators = FOO_END, ANOTHER_ENUM_END, AND_ANOTHER_ENUM_END
>> 
>> or, alternatively, you can specify the enumerators you want to suppress
>> the changes for as a list of regular expressions:
>> 
>>     [suppress_type]
>>       type_kind = enum
>>       changed_enumerators_regexp = .*_END$, .*_LAST$, .*_LIST_END$
>> 
>> Wouldn't that be enough to address your use case here (honest question)?
>> 
>
> We are already using the suppression feature in dpdk.
>
> But the difficulty is deciding whether an END (MAX) enum value change
> warning is an actual ABI break or not.
>
> When the tool gives a warning, the tendency is to just make the warning
> go away, mostly by removing the END (MAX) enum without really analyzing
> whether it is a real ABI break.

I see.

[...]

>>> [1] It would be better if tool gives END (MAX) enum value warnings only
>>> if it is exchanged in an API, but not sure if this can be possible to
>>> detect.
>> 
>> I believe that if you want to know if an enumerator value is *USED* by a
>> type (which I believe is at the root of what you are alluding to), then
>> you would need a static analysis tool that works at the source level.
>> Or, you need a human review of the code once the binary analysis tool
>> told you that that value of the enumerator changed.
>> 
>> Why ? please let me give you an example:
>> 
>>     enum foo_enum
>>     {
>>      FOO_FIRST,
>>      FOO_SECOND,
>>      FOO_END
>>     };
>> 
>>     int array[FOO_END];
>> 
>> Once this is compiled into binary, what libabigail is going to see by
>> analyzing the binary is that 'array' is an array of 2 integers.  The
>> information about the size of the array being initially an enumerator
>> value is lost.  To detect that, you need source level analysis.
>> 
>
> I see the problem.
>
> Is this the main reason that changing the FOO_END value is reported as a warning?

Yes, it is because of this class of issues.

Actually if ANY enumerator value is changed, that is actually an ABI
change.  And that ABI change is either compatible or not.

> If we forbid this kind of usage of the FOO_END, can we ignore this
> warning safely?

I would think so.


But then, you'd have to also forbid the use of all enumerators,
basically.  I am not sure that would be practical.

Rather I would tend to lean toward reviewing the use of enumerators, on
a case by case basis, using tools like 'grep' and whatnot.

What I would advise to forbid is the use of complicated macros or
constructs that makes the review of the use of enumerators
non-practical.  You should be able to grep "FOO_END" and see where it's
used in the source code.  Reviewing that shouldn't take more than a few
minutes whenever a tool warns about the change of its value.

>
>> But then, by reviewing the code, this is a construct that you can spot
>> and allow or forbid, depending on your goals as a project.
>> 
>> [...]
>> 
>> Cheers,
>> 
>

-- 
		Dodji


^ permalink raw reply	[relevance 4%]

* [PATCH v30 13/13] doc: add release note about log library
  @ 2024-10-27 17:24  4%   ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-10-27 17:24 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Bruce Richardson, Chengwen Feng

Significant enough to warrant a release note.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/rel_notes/release_24_11.rst | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 53a5ffebe5..b96042ea14 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -353,6 +353,26 @@ API Changes
   and replaced it with a new shared devarg ``llq_policy`` that keeps the same logic.
 
 
+* **Logging library changes**
+
+  * The log is initialized earlier in startup so all messages go through the library.
+
+  * Added a new option to timestamp log messages, which is useful for
+    debugging delays in application and driver startup.
+
+  * If the application is a systemd service and the log output is being
+    sent to standard error, then DPDK will switch to the journal native
+    protocol. This allows more data, such as severity, to be sent.
+
+  * The syslog option has changed. By default, messages are no longer sent
+    to syslog unless the *--syslog* option is specified.
+    Syslog is also supported on FreeBSD (but not on Windows).
+
+  * Log messages can be timestamped with *--log-timestamp* option.
+
+  * Log messages can be colorized with the *--log-color* option.
+
+
 ABI Changes
 -----------
 
-- 
2.45.2


^ permalink raw reply	[relevance 4%]

* Re: [PATCH RESEND v7 0/5] app/testpmd: support multiple process attach and detach port
  @ 2024-10-26  4:11  0%         ` lihuisong (C)
  2024-10-29 22:12  0%         ` Ferruh Yigit
  1 sibling, 0 replies; 200+ results
From: lihuisong (C) @ 2024-10-26  4:11 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, fengchengwen, liuyonglong, thomas, andrew.rybchenko

Hi Ferruh,


On 2024/10/18 10:48, lihuisong (C) wrote:
> Hi Ferruh,
>
> Thanks for considering it again. Please see my reply inline.
>
> On 2024/10/18 9:04, Ferruh Yigit wrote:
>> On 10/8/2024 3:32 AM, lihuisong (C) wrote:
>>> Hi Thomas and Ferruh,
>>>
>>> We've discussed it on and off a few times, and we've reached some
>>> consensus.
>>> They've been going through more than 2 years😅
>>> Can you have a look at this series again?
>>> If we really don't need it, I will drop it from my upstreaming list.
>>>
>> Hi Huisong,
>>
>> I was not really convinced with the patch series, but did not want to
>> block it outright, sorry that this caused patch series stay around.
>>
>> As checked again, still feels like adding unnecessary complexity, and I
>> am for rejecting this series.
>>
>> Overall target is to be able to support hotplug with primary/secondary
>> process, and uses event handlers for this but this requires adding a new
>> ethdev state to be able iterate over devices etc...
>> Perhaps better way to support this without relying on event handlers.
> Ignoring the modification of testpmd is ok to me.
> But we need to restrict testpmd from supporting port attach and detach
> in the multiple process case.
> Otherwise, the issues this series solves will be encountered.
>
> BTW, I want to say that patch [2/5], which introduced
> RTE_ETH_DEV_ALLOCATED, should be considered again,
> because it addresses a real issue in the ethdev layer. It is also the
> result of what Thomas, you and I discussed before.
> Please look at this patch again.
Can you please take a look at my above reply?
>
> /Huisong
>>
>>
>>> /Huisong
>>>
>>>
>>> On 2024/9/29 13:52, Huisong Li wrote:
>>>> This patchset fix some bugs and support attaching and detaching port
>>>> in primary and secondary.
>>>>
>>>> ---
>>>>    -v7: fix conflicts
>>>>    -v6: adjust rte_eth_dev_is_used position based on alphabetical 
>>>> order
>>>>         in version.map
>>>>    -v5: move 'ALLOCATED' state to the back of 'REMOVED' to avoid abi
>>>> break.
>>>>    -v4: fix a misspelling.
>>>>    -v3:
>>>>      #1 merge patch 1/6 and patch 2/6 into patch 1/5, and add 
>>>> modification
>>>>         for other bus type.
>>>>      #2 add a RTE_ETH_DEV_ALLOCATED state in rte_eth_dev_state to 
>>>> resolve
>>>>         the probelm in patch 2/5.
>>>>    -v2: resend due to CI unexplained failure.
>>>>
>>>> Huisong Li (5):
>>>>     drivers/bus: restore driver assignment at front of probing
>>>>     ethdev: fix skip valid port in probing callback
>>>>     app/testpmd: check the validity of the port
>>>>     app/testpmd: add attach and detach port for multiple process
>>>>     app/testpmd: stop forwarding in new or destroy event
>>>>
>>>>    app/test-pmd/testpmd.c                   | 47 
>>>> +++++++++++++++---------
>>>>    app/test-pmd/testpmd.h                   |  1 -
>>>>    drivers/bus/auxiliary/auxiliary_common.c |  9 ++++-
>>>>    drivers/bus/dpaa/dpaa_bus.c              |  9 ++++-
>>>>    drivers/bus/fslmc/fslmc_bus.c            |  8 +++-
>>>>    drivers/bus/ifpga/ifpga_bus.c            | 12 ++++--
>>>>    drivers/bus/pci/pci_common.c             |  9 ++++-
>>>>    drivers/bus/vdev/vdev.c                  | 10 ++++-
>>>>    drivers/bus/vmbus/vmbus_common.c         |  9 ++++-
>>>>    drivers/net/bnxt/bnxt_ethdev.c           |  3 +-
>>>>    drivers/net/bonding/bonding_testpmd.c    |  1 -
>>>>    drivers/net/mlx5/mlx5.c                  |  2 +-
>>>>    lib/ethdev/ethdev_driver.c               | 13 +++++--
>>>>    lib/ethdev/ethdev_driver.h               | 12 ++++++
>>>>    lib/ethdev/ethdev_pci.h                  |  2 +-
>>>>    lib/ethdev/rte_class_eth.c               |  2 +-
>>>>    lib/ethdev/rte_ethdev.c                  |  4 +-
>>>>    lib/ethdev/rte_ethdev.h                  |  4 +-
>>>>    lib/ethdev/version.map                   |  1 +
>>>>    19 files changed, 114 insertions(+), 44 deletions(-)
>>>>
>> .

^ permalink raw reply	[relevance 0%]

* [PATCH v29 13/13] doc: add release note about log library
  @ 2024-10-25 21:45  4%   ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-10-25 21:45 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Bruce Richardson, Chengwen Feng

Significant enough to warrant a release note.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/rel_notes/release_24_11.rst | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..1d2e60231b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -349,6 +349,26 @@ API Changes
   and replaced it with a new shared devarg ``llq_policy`` that keeps the same logic.
 
 
+* **Logging library changes**
+
+  * The log is initialized earlier in startup so all messages go through the library.
+
+  * Added a new option to timestamp log messages, which is useful for
+    debugging delays in application and driver startup.
+
+  * If the application is a systemd service and the log output is being
+    sent to standard error, then DPDK will switch to the journal native protocol.
+    This allows more data, such as severity, to be sent.
+
+  * The syslog option has changed. By default, messages are no longer sent
+    to syslog unless the *--syslog* option is specified.
+    Syslog is also now supported on FreeBSD (but not on Windows).
+
+  * Log messages can be timestamped with the *--log-timestamp* option.
+
+  * Log messages can be colorized with the *--log-color* option.
+
+
 ABI Changes
 -----------
 
-- 
2.45.2


^ permalink raw reply	[relevance 4%]

* RE: [PATCH v13 1/3] power: introduce PM QoS API on CPU wide
  2024-10-25  9:18  5%   ` [PATCH v13 1/3] power: introduce PM QoS API on CPU wide Huisong Li
@ 2024-10-25 12:08  0%     ` Tummala, Sivaprasad
  0 siblings, 0 replies; 200+ results
From: Tummala, Sivaprasad @ 2024-10-25 12:08 UTC (permalink / raw)
  To: Huisong Li, dev
  Cc: mb, thomas, Yigit, Ferruh, anatoly.burakov, david.hunt, stephen,
	konstantin.ananyev, david.marchand, fengchengwen, liuyonglong

Hi Huisong,

LGTM! One comment: please update the doxygen documentation for the new APIs.

> -----Original Message-----
> From: Huisong Li <lihuisong@huawei.com>
> Sent: Friday, October 25, 2024 2:49 PM
> To: dev@dpdk.org
> Cc: mb@smartsharesystems.com; thomas@monjalon.net; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; anatoly.burakov@intel.com; david.hunt@intel.com;
> Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> stephen@networkplumber.org; konstantin.ananyev@huawei.com;
> david.marchand@redhat.com; fengchengwen@huawei.com;
> liuyonglong@huawei.com; lihuisong@huawei.com
> Subject: [PATCH v13 1/3] power: introduce PM QoS API on CPU wide
>
>
> The deeper the idle state, the lower the power consumption, but the longer the
> resume time. Some service are delay sensitive and very except the low resume
> time, like interrupt packet receiving mode.
>
> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> interface is used to set and get the resume latency limit on the cpuX for userspace.
> Each cpuidle governor in Linux select which idle state to enter based on this CPU
> resume latency in their idle task.
>
> The per-CPU PM QoS API can be used to control this CPU's idle state selection
> and limit just enter the shallowest idle state to low the delay when wake up from by
> setting strict resume latency (zero value).
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> Acked-by: Chengwen Feng <fengchengwen@huawei.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> ---
>  doc/guides/prog_guide/power_man.rst    |  19 ++++
>  doc/guides/rel_notes/release_24_11.rst |   5 +
>  lib/power/meson.build                  |   2 +
>  lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
>  lib/power/rte_power_qos.h              |  73 +++++++++++++++
>  lib/power/version.map                  |   4 +
>  6 files changed, 226 insertions(+)
>  create mode 100644 lib/power/rte_power_qos.c  create mode 100644
> lib/power/rte_power_qos.h
>
> diff --git a/doc/guides/prog_guide/power_man.rst
> b/doc/guides/prog_guide/power_man.rst
> index f6674efe2d..91358b04f3 100644
> --- a/doc/guides/prog_guide/power_man.rst
> +++ b/doc/guides/prog_guide/power_man.rst
> @@ -107,6 +107,25 @@ User Cases
>  The power management mechanism is used to save power when performing L3
> forwarding.
>
>
> +PM QoS
> +------
> +
> +The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> +interface is used to set and get the resume latency limit on the cpuX
> +for userspace. Each cpuidle governor in Linux select which idle state
> +to enter based on this CPU resume latency in their idle task.
> +
> +The deeper the idle state, the lower the power consumption, but the
> +longer the resume time. Some service are latency sensitive and very
> +except the low resume time, like interrupt packet receiving mode.
> +
> +Applications can set and get the CPU resume latency by the
> +``rte_power_qos_set_cpu_resume_latency()`` and
> +``rte_power_qos_get_cpu_resume_latency()``
> +respectively. Applications can set a strict resume latency (zero value)
> +by the ``rte_power_qos_set_cpu_resume_latency()`` to low the resume
> +latency and get better performance (instead, the power consumption of platform
> may increase).
> +
> +
>  Ethernet PMD Power Management API
>  ---------------------------------
>
> diff --git a/doc/guides/rel_notes/release_24_11.rst
> b/doc/guides/rel_notes/release_24_11.rst
> index fa4822d928..d9e268274b 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -237,6 +237,11 @@ New Features
>    This field is used to pass an extra configuration settings such as ability
>    to lookup IPv4 addresses in network byte order.
>
> +* **Introduce per-CPU PM QoS interface.**
> +
> +  * Add per-CPU PM QoS interface to low the resume latency when wake up from
> +    idle state.
> +
>  * **Added new API to register telemetry endpoint callbacks with private
> arguments.**
>
>    A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque
> value to diff --git a/lib/power/meson.build b/lib/power/meson.build index
> 2f0f3d26e9..9b5d3e8315 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -23,12 +23,14 @@ sources = files(
>          'rte_power.c',
>          'rte_power_uncore.c',
>          'rte_power_pmd_mgmt.c',
> +       'rte_power_qos.c',
>  )
>  headers = files(
>          'rte_power.h',
>          'rte_power_guest_channel.h',
>          'rte_power_pmd_mgmt.h',
>          'rte_power_uncore.h',
> +       'rte_power_qos.h',
>  )
>
>  deps += ['timer', 'ethdev']
> diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c new file mode
> 100644 index 0000000000..4dd0532b36
> --- /dev/null
> +++ b/lib/power/rte_power_qos.c
> @@ -0,0 +1,123 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#include <errno.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <rte_lcore.h>
> +#include <rte_log.h>
> +
> +#include "power_common.h"
> +#include "rte_power_qos.h"
> +
> +#define PM_QOS_SYSFILE_RESUME_LATENCY_US       \
> +       "/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
> +
> +#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN      32
> +
> +int
> +rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency) {
> +       char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +       uint32_t cpu_id;
> +       FILE *f;
> +       int ret;
> +
> +       if (!rte_lcore_is_enabled(lcore_id)) {
> +               POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +               return -EINVAL;
> +       }
> +       ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
> +       if (ret != 0)
> +               return ret;
> +
> +       if (latency < 0) {
> +               POWER_LOG(ERR, "latency should be greater than and equal to 0");
> +               return -EINVAL;
> +       }
> +
> +       ret = open_core_sysfs_file(&f, "w",
> PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
> +       if (ret != 0) {
> +               POWER_LOG(ERR, "Failed to open
> "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +                         cpu_id, strerror(errno));
> +               return ret;
> +       }
> +
> +       /*
> +        * Based on the sysfs interface pm_qos_resume_latency_us under
> +        * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their
> meaning
> +        * is as follows for different input string.
> +        * 1> the resume latency is 0 if the input is "n/a".
> +        * 2> the resume latency is no constraint if the input is "0".
> +        * 3> the resume latency is the actual value to be set.
> +        */
> +       if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
> +               snprintf(buf, sizeof(buf), "%s", "n/a");
> +       else if (latency ==
> RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
> +               snprintf(buf, sizeof(buf), "%u", 0);
> +       else
> +               snprintf(buf, sizeof(buf), "%u", latency);
> +
> +       ret = write_core_sysfs_s(f, buf);
> +       if (ret != 0)
> +               POWER_LOG(ERR, "Failed to write
> "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +                         cpu_id, strerror(errno));
> +
> +       fclose(f);
> +
> +       return ret;
> +}
> +
> +int
> +rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id) {
> +       char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +       int latency = -1;
> +       uint32_t cpu_id;
> +       FILE *f;
> +       int ret;
> +
> +       if (!rte_lcore_is_enabled(lcore_id)) {
> +               POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +               return -EINVAL;
> +       }
> +       ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
> +       if (ret != 0)
> +               return ret;
> +
> +       ret = open_core_sysfs_file(&f, "r",
> PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
> +       if (ret != 0) {
> +               POWER_LOG(ERR, "Failed to open
> "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +                         cpu_id, strerror(errno));
> +               return ret;
> +       }
> +
> +       ret = read_core_sysfs_s(f, buf, sizeof(buf));
> +       if (ret != 0) {
> +               POWER_LOG(ERR, "Failed to read
> "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +                         cpu_id, strerror(errno));
> +               goto out;
> +       }
> +
> +       /*
> +        * Based on the sysfs interface pm_qos_resume_latency_us under
> +        * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their
> meaning
> +        * is as follows for different output string.
> +        * 1> the resume latency is 0 if the output is "n/a".
> +        * 2> the resume latency is no constraint if the output is "0".
> +        * 3> the resume latency is the actual value in used for other string.
> +        */
> +       if (strcmp(buf, "n/a") == 0)
> +               latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
> +       else {
> +               latency = strtoul(buf, NULL, 10);
> +               latency = latency == 0 ?
> RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
> +       }
> +
> +out:
> +       fclose(f);
> +
> +       return latency != -1 ? latency : ret; }
> diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h new file mode
> 100644 index 0000000000..7a8dab9272
> --- /dev/null
> +++ b/lib/power/rte_power_qos.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#ifndef RTE_POWER_QOS_H
> +#define RTE_POWER_QOS_H
> +
> +#include <stdint.h>
> +
> +#include <rte_compat.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file rte_power_qos.h
> + *
> + * PM QoS API.
> + *
> + * The CPU-wide resume latency limit has a positive impact on this
> +CPU's idle
> + * state selection in each cpuidle governor.
> + * Please see the PM QoS on CPU wide in the following link:
> + *
> +https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?hig
> +hlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-lat
> +ency-us
> + *
> + * The deeper the idle state, the lower the power consumption, but the
> + * longer the resume time. Some service are delay sensitive and very
> +except the
> + * low resume time, like interrupt packet receiving mode.
> + *
> + * In these case, per-CPU PM QoS API can be used to control this CPU's
> +idle
> + * state selection and limit just enter the shallowest idle state to
> +low the
> + * delay after sleep by setting strict resume latency (zero value).
> + */
> +
> +#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
> +#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
> INT32_MAX
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * @param lcore_id
> + *   target logical core id
> + *
> + * @param latency
> + *   The latency should be greater than and equal to zero in microseconds unit.
> + *
> + * @return
> + *   0 on success. Otherwise negative value is returned.
> + */
> +__rte_experimental
> +int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int
> +latency);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the current resume latency of this logical core.
> + * The default value in kernel is @see
> +RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
> + * if don't set it.
> + *
> + * @return
> + *   Negative value on failure.
> + *   >= 0 means the actual resume latency limit on this core.
> + */
> +__rte_experimental
> +int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_POWER_QOS_H */
> diff --git a/lib/power/version.map b/lib/power/version.map index
> c9a226614e..08f178a39d 100644
> --- a/lib/power/version.map
> +++ b/lib/power/version.map
> @@ -51,4 +51,8 @@ EXPERIMENTAL {
>         rte_power_set_uncore_env;
>         rte_power_uncore_freqs;
>         rte_power_unset_uncore_env;
> +
> +       # added in 24.11
> +       rte_power_qos_get_cpu_resume_latency;
> +       rte_power_qos_set_cpu_resume_latency;
>  };
> --
> 2.22.0

Acked-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>

^ permalink raw reply	[relevance 0%]

* [PATCH v13 1/3] power: introduce PM QoS API on CPU wide
  2024-10-25  9:18  4% ` [PATCH v13 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-10-25  9:18  5%   ` Huisong Li
  2024-10-25 12:08  0%     ` Tummala, Sivaprasad
  0 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-10-25  9:18 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low
resume time, like interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used by userspace to set and get the resume latency limit on
cpuX. Each cpuidle governor in Linux selects which idle state to enter
based on this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and restrict it to the shallowest idle state, lowering the
wakeup delay, by setting a strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..91358b04f3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used by userspace to set and get the resume latency limit on
+cpuX. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency sensitive and expect a low
+resume time, like interrupt packet receiving mode.
+
+Applications can set and get the CPU resume latency with
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) with
+``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
+get better performance (in return, the power consumption of the platform may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..d9e268274b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -237,6 +237,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add a per-CPU PM QoS interface to lower the resume latency when waking
+    up from idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9b5d3e8315 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+	'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+	'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..4dd0532b36
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different input string.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
+	 * is as follows for different output string.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> the resume latency is the actual value in used for other string.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..7a8dab9272
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the description of per-CPU PM QoS at the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay sensitive and expect a
+ * low resume time, like interrupt packet receiving mode.
+ *
+ * In these cases, the per-CPU PM QoS API can be used to control this CPU's
+ * idle state selection and restrict it to the shallowest idle state to lower
+ * the delay after sleep, by setting a strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE		0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT	INT32_MAX
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero, in microseconds.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in the kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if it has not been set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..08f178a39d 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* [PATCH v13 0/3] power: introduce PM QoS interface
    2024-10-21 11:42  4% ` [PATCH v11 0/2] power: " Huisong Li
  2024-10-23  4:09  4% ` [PATCH v12 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-10-25  9:18  4% ` Huisong Li
  2024-10-25  9:18  5%   ` [PATCH v13 1/3] power: introduce PM QoS API on CPU wide Huisong Li
  2024-10-29 13:28  4% ` [PATCH v14 0/3] power: introduce PM QoS interface Huisong Li
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-10-25  9:18 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low
resume time, like interrupt packet receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used by userspace to set and get the resume latency limit on
cpuX. Please see the description in the kernel document[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control this CPU's idle state
selection and restrict it to the shallowest idle state, lowering the delay
when waking up from idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us

---
 v13:
 - do not allow a negative value for --cpu-resume-latency.
  - restore to the original value as Konstantin suggested.
 v12:
  - add Acked-by Chengwen and Konstantin
  - fix overflow issue in l3fwd-power when parse command line
  - add a command parameter to set CPU resume latency
 v11:
  - operate the cpu id the lcore mapped by the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove a dead code rte_lcore_is_enabled in patch[2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments basd on Stephen
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide

Huisong Li (3):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: fix data overflow when parse command line
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst           |  19 +++
 doc/guides/rel_notes/release_24_11.rst        |   5 +
 .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
 examples/l3fwd-power/main.c                   | 115 ++++++++++++++--
 lib/power/meson.build                         |   2 +
 lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
 lib/power/rte_power_qos.h                     |  73 +++++++++++
 lib/power/version.map                         |   4 +
 8 files changed, 331 insertions(+), 15 deletions(-)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* [PATCH v28 13/13] doc: add release note about log library
  @ 2024-10-24 19:02  4%   ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-10-24 19:02 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Bruce Richardson, Chengwen Feng

Significant enough to add some documentation.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/rel_notes/release_24_11.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..ec4b7ba2a4 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -349,6 +349,25 @@ API Changes
   and replaced it with a new shared devarg ``llq_policy`` that keeps the same logic.
 
 
+* **Logging library changes**
+
+  * The log is initialized earlier in startup so all messages go through the library.
+
+  * Added a new option to timestamp log messages, which is useful for
+    debugging delays in application and driver startup.
+
+  * Syslog option change. If *--syslog* is specified, then messages
+    will go to syslog; if not specified then messages will only be displayed
+    on stderr. This option is now supported on FreeBSD (but not on Windows).
+
+  * If the application is a systemd service and the log output is being
+    sent to standard error, then DPDK will switch to the journal native protocol.
+
+  * Log messages can be timestamped with *--log-timestamp* option.
+
+  * Log messages can be colorized with the *--log-color* option.
+
+
 ABI Changes
 -----------
 
-- 
2.45.2


^ permalink raw reply	[relevance 4%]

* [PATCH v27 14/14] doc: add release note about log library
  @ 2024-10-24  3:18  4%   ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-10-24  3:18 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Bruce Richardson, Chengwen Feng

Significant enough to add some documentation.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
---
 doc/guides/rel_notes/release_24_11.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..ec4b7ba2a4 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -349,6 +349,25 @@ API Changes
   and replaced it with a new shared devarg ``llq_policy`` that keeps the same logic.
 
 
+* **Logging library changes**
+
+  * The log is initialized earlier in startup so all messages go through the library.
+
+  * Added a new option to timestamp log messages, which is useful for
+    debugging delays in application and driver startup.
+
+  * Syslog option change. If *--syslog* is specified, then messages
+    will go to syslog; if not specified then messages will only be displayed
+    on stderr. This option is now supported on FreeBSD (but not on Windows).
+
+  * If the application is a systemd service and the log output is being
+    sent to standard error, then DPDK will switch to the journal native protocol.
+
+  * Log messages can be timestamped with *--log-timestamp* option.
+
+  * Log messages can be colorized with the *--log-color* option.
+
+
 ABI Changes
 -----------
 
-- 
2.45.2


^ permalink raw reply	[relevance 4%]

* [PATCH v7 1/3] cryptodev: add ec points to sm2 op
  2024-10-23  8:19  3% [PATCH v7 " Arkadiusz Kusztal
@ 2024-10-23  8:19  4% ` Arkadiusz Kusztal
  0 siblings, 0 replies; 200+ results
From: Arkadiusz Kusztal @ 2024-10-23  8:19 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when the PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.

Points C1 and kP were therefore added to the SM2 crypto operation struct.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
 doc/guides/rel_notes/release_24_11.rst |  3 ++
 lib/cryptodev/rte_crypto_asym.h        | 56 +++++++++++++++++++-------
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..0f91dae987 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -406,6 +406,9 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* cryptodev: The ``rte_crypto_sm2_op_param`` struct member to hold ciphertext
+  is changed to union data type. This change is to support partial SM2 calculation.
+
 
 Known Issues
 ------------
diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
index aeb46e688e..f095cebcd0 100644
--- a/lib/cryptodev/rte_crypto_asym.h
+++ b/lib/cryptodev/rte_crypto_asym.h
@@ -646,6 +646,8 @@ enum rte_crypto_sm2_op_capa {
 	/**< Random number generator supported in SM2 ops. */
 	RTE_CRYPTO_SM2_PH,
 	/**< Prehash message before crypto op. */
+	RTE_CRYPTO_SM2_PARTIAL,
+	/**< Calculate elliptic curve points only. */
 };
 
 /**
@@ -673,20 +675,46 @@ struct rte_crypto_sm2_op_param {
 	 * will be overwritten by the PMD with the decrypted length.
 	 */
 
-	rte_crypto_param cipher;
-	/**<
-	 * Pointer to input data
-	 * - to be decrypted for SM2 private decrypt.
-	 *
-	 * Pointer to output data
-	 * - for SM2 public encrypt.
-	 * In this case the underlying array should have been allocated
-	 * with enough memory to hold ciphertext output (at least X bytes
-	 * for prime field curve of N bytes and for message M bytes,
-	 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
-	 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
-	 * be overwritten by the PMD with the encrypted length.
-	 */
+	union {
+		rte_crypto_param cipher;
+		/**<
+		 * Pointer to input data
+		 * - to be decrypted for SM2 private decrypt.
+		 *
+		 * Pointer to output data
+		 * - for SM2 public encrypt.
+		 * In this case the underlying array should have been allocated
+		 * with enough memory to hold ciphertext output (at least X bytes
+		 * for prime field curve of N bytes and for message M bytes,
+		 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
+		 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
+		 * be overwritten by the PMD with the encrypted length.
+		 */
+		struct {
+			struct rte_crypto_ec_point c1;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * In the case of encryption, it is an output - point C1 = (x1,y1).
+			 * In the case of decryption, it is an input - point C1 = (x1,y1).
+			 *
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+			struct rte_crypto_ec_point kp;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * It is an output in the encryption case, it is a point
+			 * [k]P = (x2,y2).
+			 *
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+		};
+	};
 
 	rte_crypto_uint id;
 	/**< The SM2 id used by signer and verifier. */
-- 
2.17.1


^ permalink raw reply	[relevance 4%]

* [PATCH v7 0/3] add ec points to sm2 op
@ 2024-10-23  8:19  3% Arkadiusz Kusztal
  2024-10-23  8:19  4% ` [PATCH v7 1/3] cryptodev: " Arkadiusz Kusztal
  0 siblings, 1 reply; 200+ results
From: Arkadiusz Kusztal @ 2024-10-23  8:19 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when the PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.

v2:
- rebased against the 24.11 code
v3:
- added feature flag
- added QAT patches
- added test patches
v4:
- replaced feature flag with capability
- split API patches
v5:
- rebased
- clarified usage of the partial flag
v6:
- removed already applied patch 1
- added ABI release notes comment
- removed camel case
- added flag reference
v7:
- removed SM2 from auth features, in asym it was added in SM2 ECDSA patch

Arkadiusz Kusztal (3):
  cryptodev: add ec points to sm2 op
  crypto/qat: add sm2 encryption/decryption function
  app/test: add test sm2 C1/Kp test cases

 app/test/test_cryptodev_asym.c                | 138 ++++++++++++++++-
 app/test/test_cryptodev_sm2_test_vectors.h    | 112 +++++++++++++-
 doc/guides/rel_notes/release_24_11.rst        |   7 +
 .../common/qat/qat_adf/icp_qat_fw_mmp_ids.h   |   3 +
 drivers/common/qat/qat_adf/qat_pke.h          |  20 +++
 drivers/crypto/qat/qat_asym.c                 | 140 +++++++++++++++++-
 lib/cryptodev/rte_crypto_asym.h               |  56 +++++--
 7 files changed, 452 insertions(+), 24 deletions(-)

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* [PATCH v12 1/3] power: introduce PM QoS API on CPU wide
  2024-10-23  4:09  4% ` [PATCH v12 0/3] power: introduce PM QoS interface Huisong Li
@ 2024-10-23  4:09  5%   ` Huisong Li
  0 siblings, 0 replies; 200+ results
From: Huisong Li @ 2024-10-23  4:09 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay-sensitive and require a low
resume time, such as the interrupt packet-receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on cpuX for
userspace. Each cpuidle governor in Linux selects which idle state to enter
based on this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection,
limiting it to the shallowest idle state to lower the delay when waking up
from an idle state, by setting a strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..91358b04f3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used to set and get the resume latency limit on the cpuX for
+userspace. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency-sensitive and require a low
+resume time, such as the interrupt packet-receiving mode.
+
+Applications can set and get the CPU resume latency with
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) with
+``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
+get better performance (though platform power consumption may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..d9e268274b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -237,6 +237,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add per-CPU PM QoS interface to lower the resume latency when waking up
+    from idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9b5d3e8315 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+	'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+	'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..4dd0532b36
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, the meaning
+	 * of each input string is as follows.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == RTE_POWER_QOS_STRICT_LATENCY_VALUE)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us under
+	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, the meaning
+	 * of each output string is as follows.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> the resume latency is the actual value in use for any other string.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = RTE_POWER_QOS_STRICT_LATENCY_VALUE;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..7a8dab9272
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the per-CPU PM QoS description at the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay-sensitive and require a
+ * low resume time, such as the interrupt packet-receiving mode.
+ *
+ * In these cases, the per-CPU PM QoS API can be used to control this CPU's
+ * idle state selection and limit it to the shallowest idle state to lower
+ * the delay after sleep, by setting a strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE		0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT	INT32_MAX
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero, in microseconds.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in the kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if it has not been set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..08f178a39d 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
-- 
2.22.0


^ permalink raw reply	[relevance 5%]

* [PATCH v12 0/3] power: introduce PM QoS interface
    2024-10-21 11:42  4% ` [PATCH v11 0/2] power: " Huisong Li
@ 2024-10-23  4:09  4% ` Huisong Li
  2024-10-23  4:09  5%   ` [PATCH v12 1/3] power: introduce PM QoS API on CPU wide Huisong Li
  2024-10-25  9:18  4% ` [PATCH v13 0/3] power: introduce PM QoS interface Huisong Li
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-10-23  4:09 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay-sensitive and require a low
resume time, such as the interrupt packet-receiving mode.

The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on cpuX for
userspace. Please see the description in the kernel document[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection,
limiting it to the shallowest idle state to lower the delay when waking up
from an idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us

---
 v12:
  - add Acked-by Chengwen and Konstantin
  - fix overflow issue in l3fwd-power when parse command line
  - add a command parameter to set CPU resume latency
 v11:
  - operate on the cpu id that the lcore is mapped to, using the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove a dead code rte_lcore_is_enabled in patch[2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments based on Stephen's review
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide

Huisong Li (3):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: fix data overflow when parse command line
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst           |  19 +++
 doc/guides/rel_notes/release_24_11.rst        |   5 +
 .../sample_app_ug/l3_forward_power_man.rst    |   5 +-
 examples/l3fwd-power/main.c                   |  92 +++++++++++--
 lib/power/meson.build                         |   2 +
 lib/power/rte_power_qos.c                     | 123 ++++++++++++++++++
 lib/power/rte_power_qos.h                     |  73 +++++++++++
 lib/power/version.map                         |   4 +
 8 files changed, 308 insertions(+), 15 deletions(-)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* Re: [PATCH v6 0/3] add ec points to sm2 op
  2024-10-22 19:05  3% [PATCH v6 0/3] add ec points to sm2 op Arkadiusz Kusztal
  2024-10-22 19:05  4% ` [PATCH v6 1/3] cryptodev: " Arkadiusz Kusztal
@ 2024-10-23  1:19  0% ` Stephen Hemminger
  1 sibling, 0 replies; 200+ results
From: Stephen Hemminger @ 2024-10-23  1:19 UTC (permalink / raw)
  To: Arkadiusz Kusztal; +Cc: dev, gakhil, brian.dooley

On Tue, 22 Oct 2024 20:05:57 +0100
Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com> wrote:

> In the case when the PMD cannot support the full SM2 process,
> but only the elliptic curve computation, additional fields
> are needed to handle such a case.
> 
> v2:
> - rebased against the 24.11 code
> v3:
> - added feature flag
> - added QAT patches
> - added test patches
> v4:
> - replaced feature flag with capability
> - split API patches
> v5:
> - rebased
> - clarified usage of the partial flag
> v6:
> - removed already applied patch 1
> - added ABI release notes comment
> - removed camel case
> - added flag reference
> 
> Arkadiusz Kusztal (3):
>   cryptodev: add ec points to sm2 op
>   crypto/qat: add sm2 encryption/decryption function
>   app/test: add test sm2 C1/Kp test cases
> 
>  app/test/test_cryptodev_asym.c                | 138 ++++++++++++++++-
>  app/test/test_cryptodev_sm2_test_vectors.h    | 112 +++++++++++++-
>  doc/guides/cryptodevs/features/qat.ini        |   1 +
>  doc/guides/rel_notes/release_24_11.rst        |   7 +
>  .../common/qat/qat_adf/icp_qat_fw_mmp_ids.h   |   3 +
>  drivers/common/qat/qat_adf/qat_pke.h          |  20 +++
>  drivers/crypto/qat/qat_asym.c                 | 140 +++++++++++++++++-
>  lib/cryptodev/rte_crypto_asym.h               |  56 +++++--
>  8 files changed, 453 insertions(+), 24 deletions(-)

There is an issue: the new feature is missing from some of the doc templates.

$ ninja -C build doc
ninja: Entering directory `build'
[4/6] Generating doc/api/dts/dts_api_html with a custom command
Warning generate_overview_table(): Unknown feature 'SM2' in 'qat.ini'


^ permalink raw reply	[relevance 0%]

* [PATCH v6 1/3] cryptodev: add ec points to sm2 op
  2024-10-22 19:05  3% [PATCH v6 0/3] add ec points to sm2 op Arkadiusz Kusztal
@ 2024-10-22 19:05  4% ` Arkadiusz Kusztal
  2024-10-23  1:19  0% ` [PATCH v6 0/3] " Stephen Hemminger
  1 sibling, 0 replies; 200+ results
From: Arkadiusz Kusztal @ 2024-10-22 19:05 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when the PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.

Points C1 and kP were therefore added to the SM2 crypto operation struct.

Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
---
 doc/guides/rel_notes/release_24_11.rst |  3 ++
 lib/cryptodev/rte_crypto_asym.h        | 56 +++++++++++++++++++-------
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..0f91dae987 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -406,6 +406,9 @@ ABI Changes
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
   added ``xstat_off`` to ``rte_node``.
 
+* cryptodev: The ``rte_crypto_sm2_op_param`` struct member to hold ciphertext
+  is changed to union data type. This change is to support partial SM2 calculation.
+
 
 Known Issues
 ------------
diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
index aeb46e688e..f095cebcd0 100644
--- a/lib/cryptodev/rte_crypto_asym.h
+++ b/lib/cryptodev/rte_crypto_asym.h
@@ -646,6 +646,8 @@ enum rte_crypto_sm2_op_capa {
 	/**< Random number generator supported in SM2 ops. */
 	RTE_CRYPTO_SM2_PH,
 	/**< Prehash message before crypto op. */
+	RTE_CRYPTO_SM2_PARTIAL,
+	/**< Calculate elliptic curve points only. */
 };
 
 /**
@@ -673,20 +675,46 @@ struct rte_crypto_sm2_op_param {
 	 * will be overwritten by the PMD with the decrypted length.
 	 */
 
-	rte_crypto_param cipher;
-	/**<
-	 * Pointer to input data
-	 * - to be decrypted for SM2 private decrypt.
-	 *
-	 * Pointer to output data
-	 * - for SM2 public encrypt.
-	 * In this case the underlying array should have been allocated
-	 * with enough memory to hold ciphertext output (at least X bytes
-	 * for prime field curve of N bytes and for message M bytes,
-	 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
-	 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
-	 * be overwritten by the PMD with the encrypted length.
-	 */
+	union {
+		rte_crypto_param cipher;
+		/**<
+		 * Pointer to input data
+		 * - to be decrypted for SM2 private decrypt.
+		 *
+		 * Pointer to output data
+		 * - for SM2 public encrypt.
+		 * In this case the underlying array should have been allocated
+		 * with enough memory to hold ciphertext output (at least X bytes
+		 * for prime field curve of N bytes and for message M bytes,
+		 * where X = (C1 || C2 || C3) and computed based on SM2 RFC as
+		 * C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
+		 * be overwritten by the PMD with the encrypted length.
+		 */
+		struct {
+			struct rte_crypto_ec_point c1;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * In the case of encryption, it is an output - point C1 = (x1,y1).
+			 * In the case of decryption, if is an input - point C1 = (x1,y1).
+			 * In the case of decryption, it is an input - point C1 = (x1,y1).
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+			struct rte_crypto_ec_point kp;
+			/**<
+			 * This field is used only when PMD does not support the full
+			 * process of the SM2 encryption/decryption, but the elliptic
+			 * curve part only.
+			 *
+			 * It is an output in the encryption case, it is a point
+			 * [k]P = (x2,y2).
+			 *
+			 * Must be used along with the RTE_CRYPTO_SM2_PARTIAL flag.
+			 */
+		};
+	};
 
 	rte_crypto_uint id;
 	/**< The SM2 id used by signer and verifier. */
-- 
2.17.1


^ permalink raw reply	[relevance 4%]

* [PATCH v6 0/3] add ec points to sm2 op
@ 2024-10-22 19:05  3% Arkadiusz Kusztal
  2024-10-22 19:05  4% ` [PATCH v6 1/3] cryptodev: " Arkadiusz Kusztal
  2024-10-23  1:19  0% ` [PATCH v6 0/3] " Stephen Hemminger
  0 siblings, 2 replies; 200+ results
From: Arkadiusz Kusztal @ 2024-10-22 19:05 UTC (permalink / raw)
  To: dev; +Cc: gakhil, brian.dooley, Arkadiusz Kusztal

In the case when the PMD cannot support the full SM2 process,
but only the elliptic curve computation, additional fields
are needed to handle such a case.

v2:
- rebased against the 24.11 code
v3:
- added feature flag
- added QAT patches
- added test patches
v4:
- replaced feature flag with capability
- split API patches
v5:
- rebased
- clarified usage of the partial flag
v6:
- removed already applied patch 1
- added ABI release notes comment
- removed camel case
- added flag reference

Arkadiusz Kusztal (3):
  cryptodev: add ec points to sm2 op
  crypto/qat: add sm2 encryption/decryption function
  app/test: add test sm2 C1/Kp test cases

 app/test/test_cryptodev_asym.c                | 138 ++++++++++++++++-
 app/test/test_cryptodev_sm2_test_vectors.h    | 112 +++++++++++++-
 doc/guides/cryptodevs/features/qat.ini        |   1 +
 doc/guides/rel_notes/release_24_11.rst        |   7 +
 .../common/qat/qat_adf/icp_qat_fw_mmp_ids.h   |   3 +
 drivers/common/qat/qat_adf/qat_pke.h          |  20 +++
 drivers/crypto/qat/qat_asym.c                 | 140 +++++++++++++++++-
 lib/cryptodev/rte_crypto_asym.h               |  56 +++++--
 8 files changed, 453 insertions(+), 24 deletions(-)

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH v11 1/2] power: introduce PM QoS API on CPU wide
  2024-10-22  9:08  0%     ` Konstantin Ananyev
@ 2024-10-22  9:41  0%       ` lihuisong (C)
  0 siblings, 0 replies; 200+ results
From: lihuisong (C) @ 2024-10-22  9:41 UTC (permalink / raw)
  To: Konstantin Ananyev, dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, david.marchand, Fengchengwen,
	liuyonglong


On 2024/10/22 17:08, Konstantin Ananyev wrote:
>
>> The deeper the idle state, the lower the power consumption, but the longer
>> the resume time. Some services are delay sensitive and expect a low
>> resume time, like interrupt packet receiving mode.
>>
>> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
>> interface is used to set and get the resume latency limit on the cpuX for
>> userspace. Each cpuidle governor in Linux selects which idle state to enter
>> based on this CPU resume latency in its idle task.
>>
>> The per-CPU PM QoS API can be used to control this CPU's idle state
>> selection and limit it to the shallowest idle state, lowering the wakeup
>> delay, by setting a strict resume latency (zero value).
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> LGTM overall, few nits, see below.
> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>
>> ---
>>   doc/guides/prog_guide/power_man.rst    |  19 ++++
>>   doc/guides/rel_notes/release_24_11.rst |   5 +
>>   lib/power/meson.build                  |   2 +
>>   lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
>>   lib/power/rte_power_qos.h              |  73 +++++++++++++++
>>   lib/power/version.map                  |   4 +
>>   6 files changed, 226 insertions(+)
>>   create mode 100644 lib/power/rte_power_qos.c
>>   create mode 100644 lib/power/rte_power_qos.h
>>
>> diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
>> index f6674efe2d..91358b04f3 100644
>> --- a/doc/guides/prog_guide/power_man.rst
>> +++ b/doc/guides/prog_guide/power_man.rst
>> @@ -107,6 +107,25 @@ User Cases
>>   The power management mechanism is used to save power when performing L3 forwarding.
>>
>>
>> +PM QoS
>> +------
>> +
>> +The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
>> +interface is used to set and get the resume latency limit on the cpuX for
>> +userspace. Each cpuidle governor in Linux select which idle state to enter
>> +based on this CPU resume latency in their idle task.
>> +
>> +The deeper the idle state, the lower the power consumption, but the longer
>> +the resume time. Some services are latency sensitive and expect a low
>> +resume time, like interrupt packet receiving mode.
>> +
>> +Applications can set and get the CPU resume latency by the
>> +``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
>> +respectively. Applications can set a strict resume latency (zero value) by
>> +the ``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
>> +get better performance (though the power consumption of the platform may increase).
>> +
>> +
>>   Ethernet PMD Power Management API
>>   ---------------------------------
>>
>> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
>> index fa4822d928..d9e268274b 100644
>> --- a/doc/guides/rel_notes/release_24_11.rst
>> +++ b/doc/guides/rel_notes/release_24_11.rst
>> @@ -237,6 +237,11 @@ New Features
>>     This field is used to pass an extra configuration settings such as ability
>>     to lookup IPv4 addresses in network byte order.
>>
>> +* **Introduce per-CPU PM QoS interface.**
>> +
>> +  * Add per-CPU PM QoS interface to lower the resume latency when waking up
>> +    from idle state.
>> +
>>   * **Added new API to register telemetry endpoint callbacks with private arguments.**
>>
>>     A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
>> diff --git a/lib/power/meson.build b/lib/power/meson.build
>> index 2f0f3d26e9..9b5d3e8315 100644
>> --- a/lib/power/meson.build
>> +++ b/lib/power/meson.build
>> @@ -23,12 +23,14 @@ sources = files(
>>           'rte_power.c',
>>           'rte_power_uncore.c',
>>           'rte_power_pmd_mgmt.c',
>> +	'rte_power_qos.c',
>>   )
>>   headers = files(
>>           'rte_power.h',
>>           'rte_power_guest_channel.h',
>>           'rte_power_pmd_mgmt.h',
>>           'rte_power_uncore.h',
>> +	'rte_power_qos.h',
>>   )
>>
>>   deps += ['timer', 'ethdev']
>> diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
>> new file mode 100644
>> index 0000000000..09692b2161
>> --- /dev/null
>> +++ b/lib/power/rte_power_qos.c
>> @@ -0,0 +1,123 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 HiSilicon Limited
>> + */
>> +
>> +#include <errno.h>
>> +#include <stdlib.h>
>> +#include <string.h>
>> +
>> +#include <rte_lcore.h>
>> +#include <rte_log.h>
>> +
>> +#include "power_common.h"
>> +#include "rte_power_qos.h"
>> +
>> +#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
>> +	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
>> +
>> +#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
>> +
>> +int
>> +rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
>> +{
>> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
>> +	uint32_t cpu_id;
>> +	FILE *f;
>> +	int ret;
>> +
>> +	if (!rte_lcore_is_enabled(lcore_id)) {
>> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
>> +		return -EINVAL;
>> +	}
>> +	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
>> +	if (ret != 0)
>> +		return ret;
>> +
>> +	if (latency < 0) {
>> +		POWER_LOG(ERR, "latency should be greater than and equal to 0");
>> +		return -EINVAL;
>> +	}
>> +
>> +	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
>> +	if (ret != 0) {
>> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
>> +			  cpu_id, strerror(errno));
>> +		return ret;
>> +	}
>> +
>> +	/*
>> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
>> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
>> +	 * is as follows for different input string.
>> +	 * 1> the resume latency is 0 if the input is "n/a".
>> +	 * 2> the resume latency is no constraint if the input is "0".
>> +	 * 3> the resume latency is the actual value to be set.
>> +	 */
>> +	if (latency == 0)
>
> Why not use your own macro
> RTE_POWER_QOS_STRICT_LATENCY_VALUE
> instead of a hard-coded constant here?
you are right. will fix it in next version.
>
>> +		snprintf(buf, sizeof(buf), "%s", "n/a");
>> +	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
>> +		snprintf(buf, sizeof(buf), "%u", 0);
>> +	else
>> +		snprintf(buf, sizeof(buf), "%u", latency);
>> +
>> +	ret = write_core_sysfs_s(f, buf);
>> +	if (ret != 0)
>> +		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
>> +			  cpu_id, strerror(errno));
>> +
>> +	fclose(f);
>> +
>> +	return ret;
>> +}
>> +
>> +int
>> +rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
>> +{
>> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
>> +	int latency = -1;
>> +	uint32_t cpu_id;
>> +	FILE *f;
>> +	int ret;
>> +
>> +	if (!rte_lcore_is_enabled(lcore_id)) {
>> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
>> +		return -EINVAL;
>> +	}
>> +	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
>> +	if (ret != 0)
>> +		return ret;
>> +
>> +	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
>> +	if (ret != 0) {
>> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
>> +			  cpu_id, strerror(errno));
>> +		return ret;
>> +	}
>> +
>> +	ret = read_core_sysfs_s(f, buf, sizeof(buf));
>> +	if (ret != 0) {
>> +		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
>> +			  cpu_id, strerror(errno));
>> +		goto out;
>> +	}
>> +
>> +	/*
>> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
>> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
>> +	 * is as follows for different output string.
>> +	 * 1> the resume latency is 0 if the output is "n/a".
>> +	 * 2> the resume latency is no constraint if the output is "0".
>> +	 * 3> the resume latency is the actual value in used for other string.
>> +	 */
>> +	if (strcmp(buf, "n/a") == 0)
>> +		latency = 0;
>
> RTE_POWER_QOS_STRICT_LATENCY_VALUE
Ack
> ?
>
>> +	else {
>> +		latency = strtoul(buf, NULL, 10);
>> +		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
>> +	}
>> +
>> +out:
>> +	fclose(f);
>> +
>> +	return latency != -1 ? latency : ret;
>> +}
>> diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
>> new file mode 100644
>> index 0000000000..990c488373
>> --- /dev/null
>> +++ b/lib/power/rte_power_qos.h
>> @@ -0,0 +1,73 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2024 HiSilicon Limited
>> + */
>> +
>> +#ifndef RTE_POWER_QOS_H
>> +#define RTE_POWER_QOS_H
>> +
>> +#include <stdint.h>
>> +
>> +#include <rte_compat.h>
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +/**
>> + * @file rte_power_qos.h
>> + *
>> + * PM QoS API.
>> + *
>> + * The CPU-wide resume latency limit has a positive impact on this CPU's idle
>> + * state selection in each cpuidle governor.
>> + * Please see the PM QoS on CPU wide in the following link:
>> + * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-
>> power-pm-qos-resume-latency-us
>> + *
>> + * The deeper the idle state, the lower the power consumption, but the
>> + * longer the resume time. Some services are delay sensitive and expect a
>> + * low resume time, like interrupt packet receiving mode.
>> + *
>> + * In these cases, the per-CPU PM QoS API can be used to control this CPU's
>> + * idle state selection and limit it to the shallowest idle state, lowering
>> + * the delay after sleep, by setting a strict resume latency (zero value).
>> + */
>> +
>> +#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
>> +#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT    ((int)(UINT32_MAX >> 1))
> Isn't it just INT32_MAX?
will fix it.
>
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * @param lcore_id
>> + *   target logical core id
>> + *
>> + * @param latency
>> + *   The latency should be greater than or equal to zero, in microseconds.
>> + *
>> + * @return
>> + *   0 on success. Otherwise negative value is returned.
>> + */
>> +__rte_experimental
>> +int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Get the current resume latency of this logical core.
>> + * The default value in kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
>> + * if it is not set.
>> + *
>> + * @return
>> + *   Negative value on failure.
>> + *   >= 0 means the actual resume latency limit on this core.
>> + */
>> +__rte_experimental
>> +int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* RTE_POWER_QOS_H */
>> diff --git a/lib/power/version.map b/lib/power/version.map
>> index c9a226614e..08f178a39d 100644
>> --- a/lib/power/version.map
>> +++ b/lib/power/version.map
>> @@ -51,4 +51,8 @@ EXPERIMENTAL {
>>   	rte_power_set_uncore_env;
>>   	rte_power_uncore_freqs;
>>   	rte_power_unset_uncore_env;
>> +
>> +	# added in 24.11
>> +	rte_power_qos_get_cpu_resume_latency;
>> +	rte_power_qos_set_cpu_resume_latency;
>>   };
>> --
>> 2.22.0

^ permalink raw reply	[relevance 0%]

* RE: [PATCH v11 1/2] power: introduce PM QoS API on CPU wide
  2024-10-21 11:42  5%   ` [PATCH v11 1/2] power: introduce PM QoS API on CPU wide Huisong Li
@ 2024-10-22  9:08  0%     ` Konstantin Ananyev
  2024-10-22  9:41  0%       ` lihuisong (C)
  0 siblings, 1 reply; 200+ results
From: Konstantin Ananyev @ 2024-10-22  9:08 UTC (permalink / raw)
  To: lihuisong (C), dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, david.marchand, Fengchengwen,
	liuyonglong



> The deeper the idle state, the lower the power consumption, but the longer
> the resume time. Some services are delay sensitive and expect a low
> resume time, like interrupt packet receiving mode.
>
> And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> interface is used to set and get the resume latency limit on the cpuX for
> userspace. Each cpuidle governor in Linux selects which idle state to enter
> based on this CPU resume latency in its idle task.
>
> The per-CPU PM QoS API can be used to control this CPU's idle state
> selection and limit it to the shallowest idle state, lowering the wakeup
> delay, by setting a strict resume latency (zero value).
> 
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>

LGTM overall, few nits, see below.
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>

> ---
>  doc/guides/prog_guide/power_man.rst    |  19 ++++
>  doc/guides/rel_notes/release_24_11.rst |   5 +
>  lib/power/meson.build                  |   2 +
>  lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
>  lib/power/rte_power_qos.h              |  73 +++++++++++++++
>  lib/power/version.map                  |   4 +
>  6 files changed, 226 insertions(+)
>  create mode 100644 lib/power/rte_power_qos.c
>  create mode 100644 lib/power/rte_power_qos.h
> 
> diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
> index f6674efe2d..91358b04f3 100644
> --- a/doc/guides/prog_guide/power_man.rst
> +++ b/doc/guides/prog_guide/power_man.rst
> @@ -107,6 +107,25 @@ User Cases
>  The power management mechanism is used to save power when performing L3 forwarding.
> 
> 
> +PM QoS
> +------
> +
> +The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
> +interface is used to set and get the resume latency limit on the cpuX for
> +userspace. Each cpuidle governor in Linux select which idle state to enter
> +based on this CPU resume latency in their idle task.
> +
> +The deeper the idle state, the lower the power consumption, but the longer
> +the resume time. Some services are latency sensitive and expect a low
> +resume time, like interrupt packet receiving mode.
> +
> +Applications can set and get the CPU resume latency by the
> +``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
> +respectively. Applications can set a strict resume latency (zero value) by
> +the ``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
> +get better performance (though the power consumption of the platform may increase).
> +
> +
>  Ethernet PMD Power Management API
>  ---------------------------------
> 
> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
> index fa4822d928..d9e268274b 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -237,6 +237,11 @@ New Features
>    This field is used to pass an extra configuration settings such as ability
>    to lookup IPv4 addresses in network byte order.
> 
> +* **Introduce per-CPU PM QoS interface.**
> +
> +  * Add per-CPU PM QoS interface to lower the resume latency when waking up
> +    from idle state.
> +
>  * **Added new API to register telemetry endpoint callbacks with private arguments.**
> 
>    A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
> diff --git a/lib/power/meson.build b/lib/power/meson.build
> index 2f0f3d26e9..9b5d3e8315 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -23,12 +23,14 @@ sources = files(
>          'rte_power.c',
>          'rte_power_uncore.c',
>          'rte_power_pmd_mgmt.c',
> +	'rte_power_qos.c',
>  )
>  headers = files(
>          'rte_power.h',
>          'rte_power_guest_channel.h',
>          'rte_power_pmd_mgmt.h',
>          'rte_power_uncore.h',
> +	'rte_power_qos.h',
>  )
> 
>  deps += ['timer', 'ethdev']
> diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
> new file mode 100644
> index 0000000000..09692b2161
> --- /dev/null
> +++ b/lib/power/rte_power_qos.c
> @@ -0,0 +1,123 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#include <errno.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <rte_lcore.h>
> +#include <rte_log.h>
> +
> +#include "power_common.h"
> +#include "rte_power_qos.h"
> +
> +#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
> +	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
> +
> +#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
> +
> +int
> +rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
> +{
> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +	uint32_t cpu_id;
> +	FILE *f;
> +	int ret;
> +
> +	if (!rte_lcore_is_enabled(lcore_id)) {
> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +		return -EINVAL;
> +	}
> +	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
> +	if (ret != 0)
> +		return ret;
> +
> +	if (latency < 0) {
> +		POWER_LOG(ERR, "latency should be greater than and equal to 0");
> +		return -EINVAL;
> +	}
> +
> +	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +			  cpu_id, strerror(errno));
> +		return ret;
> +	}
> +
> +	/*
> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
> +	 * is as follows for different input string.
> +	 * 1> the resume latency is 0 if the input is "n/a".
> +	 * 2> the resume latency is no constraint if the input is "0".
> +	 * 3> the resume latency is the actual value to be set.
> +	 */
> +	if (latency == 0)


Why not use your own macro
RTE_POWER_QOS_STRICT_LATENCY_VALUE
instead of a hard-coded constant here?

> +		snprintf(buf, sizeof(buf), "%s", "n/a");
> +	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
> +		snprintf(buf, sizeof(buf), "%u", 0);
> +	else
> +		snprintf(buf, sizeof(buf), "%u", latency);
> +
> +	ret = write_core_sysfs_s(f, buf);
> +	if (ret != 0)
> +		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +			  cpu_id, strerror(errno));
> +
> +	fclose(f);
> +
> +	return ret;
> +}
> +
> +int
> +rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
> +{
> +	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
> +	int latency = -1;
> +	uint32_t cpu_id;
> +	FILE *f;
> +	int ret;
> +
> +	if (!rte_lcore_is_enabled(lcore_id)) {
> +		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
> +		return -EINVAL;
> +	}
> +	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
> +	if (ret != 0)
> +		return ret;
> +
> +	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +			  cpu_id, strerror(errno));
> +		return ret;
> +	}
> +
> +	ret = read_core_sysfs_s(f, buf, sizeof(buf));
> +	if (ret != 0) {
> +		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
> +			  cpu_id, strerror(errno));
> +		goto out;
> +	}
> +
> +	/*
> +	 * Based on the sysfs interface pm_qos_resume_latency_us under
> +	 * @PM_QOS_SYSFILE_RESUME_LATENCY_US directory in kernel, their meaning
> +	 * is as follows for different output string.
> +	 * 1> the resume latency is 0 if the output is "n/a".
> +	 * 2> the resume latency is no constraint if the output is "0".
> +	 * 3> the resume latency is the actual value in used for other string.
> +	 */
> +	if (strcmp(buf, "n/a") == 0)
> +		latency = 0;


RTE_POWER_QOS_STRICT_LATENCY_VALUE
?

> +	else {
> +		latency = strtoul(buf, NULL, 10);
> +		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
> +	}
> +
> +out:
> +	fclose(f);
> +
> +	return latency != -1 ? latency : ret;
> +}
> diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
> new file mode 100644
> index 0000000000..990c488373
> --- /dev/null
> +++ b/lib/power/rte_power_qos.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 HiSilicon Limited
> + */
> +
> +#ifndef RTE_POWER_QOS_H
> +#define RTE_POWER_QOS_H
> +
> +#include <stdint.h>
> +
> +#include <rte_compat.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file rte_power_qos.h
> + *
> + * PM QoS API.
> + *
> + * The CPU-wide resume latency limit has a positive impact on this CPU's idle
> + * state selection in each cpuidle governor.
> + * Please see the PM QoS on CPU wide in the following link:
> + * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-
> power-pm-qos-resume-latency-us
> + *
> + * The deeper the idle state, the lower the power consumption, but the
> + * longer the resume time. Some services are delay sensitive and expect a
> + * low resume time, like interrupt packet receiving mode.
> + *
> + * In these cases, the per-CPU PM QoS API can be used to control this CPU's
> + * idle state selection and limit it to the shallowest idle state, lowering
> + * the delay after sleep, by setting a strict resume latency (zero value).
> + */
> +
> +#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
> +#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT    ((int)(UINT32_MAX >> 1))

Isn't it just INT32_MAX?

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * @param lcore_id
> + *   target logical core id
> + *
> + * @param latency
> + *   The latency should be greater than or equal to zero, in microseconds.
> + *
> + * @return
> + *   0 on success. Otherwise negative value is returned.
> + */
> +__rte_experimental
> +int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the current resume latency of this logical core.
> + * The default value in kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
> + * if it is not set.
> + *
> + * @return
> + *   Negative value on failure.
> + *   >= 0 means the actual resume latency limit on this core.
> + */
> +__rte_experimental
> +int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_POWER_QOS_H */
> diff --git a/lib/power/version.map b/lib/power/version.map
> index c9a226614e..08f178a39d 100644
> --- a/lib/power/version.map
> +++ b/lib/power/version.map
> @@ -51,4 +51,8 @@ EXPERIMENTAL {
>  	rte_power_set_uncore_env;
>  	rte_power_uncore_freqs;
>  	rte_power_unset_uncore_env;
> +
> +	# added in 24.11
> +	rte_power_qos_get_cpu_resume_latency;
> +	rte_power_qos_set_cpu_resume_latency;
>  };
> --
> 2.22.0
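The "n/a" / "0" / plain-value encoding of `pm_qos_resume_latency_us` that the quoted comments describe can be sketched as a pair of pure functions (a standalone sketch, not DPDK code; `NO_CONSTRAINT` stands in for RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT and equals INT32_MAX, as suggested in the review):

```python
NO_CONSTRAINT = (1 << 31) - 1  # INT32_MAX, "no constraint" sentinel

def encode_latency(latency_us: int) -> str:
    """String written to pm_qos_resume_latency_us for a requested latency."""
    if latency_us == 0:
        return "n/a"              # strict latency -> shallowest idle state
    if latency_us == NO_CONSTRAINT:
        return "0"                # the kernel treats "0" as "no constraint"
    return str(latency_us)        # otherwise, the actual limit in us

def decode_latency(sysfs_str: str) -> int:
    """Latency reported back when reading the same sysfs file."""
    if sysfs_str == "n/a":
        return 0
    value = int(sysfs_str, 10)
    return NO_CONSTRAINT if value == 0 else value
```

Note that the encoding is deliberately asymmetric to the intuitive meaning: writing "n/a" requests the strictest (zero) latency, while writing "0" lifts the constraint entirely, which is exactly why the set/get implementations above special-case both strings.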


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v7 1/5] power: refactor core power management library
  2024-10-22  7:13  0%       ` Tummala, Sivaprasad
@ 2024-10-22  8:36  0%         ` lihuisong (C)
  0 siblings, 0 replies; 200+ results
From: lihuisong (C) @ 2024-10-22  8:36 UTC (permalink / raw)
  To: Tummala, Sivaprasad, david.hunt, konstantin.ananyev
  Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
	cristian.dumitrescu, Yigit, Ferruh


On 2024/10/22 15:13, Tummala, Sivaprasad wrote:
>
> Hi Huisong,
>
> Please find my comments inline.
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Tuesday, October 22, 2024 8:33 AM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
>> david.hunt@intel.com; konstantin.ananyev@huawei.com
>> Cc: dev@dpdk.org; anatoly.burakov@intel.com; jerinj@marvell.com;
>> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
>> Ferruh <Ferruh.Yigit@amd.com>
>> Subject: Re: [PATCH v7 1/5] power: refactor core power management library
>>
>>
>>
>> Hi Sivaprasad,
>>
>> Some comments inline.
>>
>> 在 2024/10/21 12:07, Sivaprasad Tummala 写道:
>>> This patch introduces a comprehensive refactor to the core power
>>> management library. The primary focus is on improving modularity and
>>> organization by relocating specific driver implementations from the
>>> 'lib/power' directory to dedicated directories within
>>> 'drivers/power/core/*'. The adjustment of meson.build files enables
>>> the selective activation of individual drivers.
>>>
>>> These changes contribute to a significant enhancement in code
>>> organization, providing a clearer structure for driver implementations.
>>> The refactor aims to improve overall code clarity and boost
>>> maintainability. Additionally, it establishes a foundation for future
>>> development, allowing for more focused work on individual drivers and
>>> seamless integration of forthcoming enhancements.
>>>
>>> v6:
>>>    - fixed compilation error with symbol export in API
>>>    - exported power_get_lcore_mapped_cpu_id as internal API to be
>>>      used in drivers/power/*
>>>
>>> v5:
>>>    - fixed code style warning
>>>
>>> v4:
>>>    - fixed build error with RTE_ASSERT
>>>
>>> v3:
>>>    - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
>>>    - re-worked on auto detection logic
>>>
>>> v2:
>>>    - added NULL check for global_core_ops in rte_power_get_core_ops
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>>>    drivers/meson.build                           |   1 +
>>>    .../power/acpi/acpi_cpufreq.c                 |  22 +-
>>>    .../power/acpi/acpi_cpufreq.h                 |   6 +-
>>>    drivers/power/acpi/meson.build                |  10 +
>>>    .../power/amd_pstate/amd_pstate_cpufreq.c     |  24 +-
>>>    .../power/amd_pstate/amd_pstate_cpufreq.h     |  10 +-
>>>    drivers/power/amd_pstate/meson.build          |  10 +
>>>    .../power/cppc/cppc_cpufreq.c                 |  22 +-
>>>    .../power/cppc/cppc_cpufreq.h                 |   8 +-
>>>    drivers/power/cppc/meson.build                |  10 +
>>>    .../power/kvm_vm}/guest_channel.c             |   0
>>>    .../power/kvm_vm}/guest_channel.h             |   0
>>>    .../power/kvm_vm/kvm_vm.c                     |  22 +-
>>>    .../power/kvm_vm/kvm_vm.h                     |   6 +-
>>>    drivers/power/kvm_vm/meson.build              |  14 +
>>>    drivers/power/meson.build                     |  12 +
>>>    drivers/power/pstate/meson.build              |  10 +
>>>    .../power/pstate/pstate_cpufreq.c             |  22 +-
>>>    .../power/pstate/pstate_cpufreq.h             |   6 +-
>>>    lib/power/meson.build                         |   7 +-
>>>    lib/power/power_common.c                      |   2 +-
>>>    lib/power/power_common.h                      |  18 +-
>>>    lib/power/rte_power.c                         | 355 ++++++++----------
>>>    lib/power/rte_power.h                         | 116 +++---
>>>    lib/power/rte_power_cpufreq_api.h             | 206 ++++++++++
>>>    lib/power/version.map                         |  15 +
>>>    26 files changed, 665 insertions(+), 269 deletions(-)
>>>    rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c
>> (95%)
>>>    rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h
>> (98%)
>>>    create mode 100644 drivers/power/acpi/meson.build
>>>    rename lib/power/power_amd_pstate_cpufreq.c =>
>> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>>>    rename lib/power/power_amd_pstate_cpufreq.h =>
>> drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
>>>    create mode 100644 drivers/power/amd_pstate/meson.build
>>>    rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c
>> (95%)
>>>    rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h
>> (97%)
>>>    create mode 100644 drivers/power/cppc/meson.build
>>>    rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>>>    rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>>>    rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
>>>    rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
>>>    create mode 100644 drivers/power/kvm_vm/meson.build
>>>    create mode 100644 drivers/power/meson.build
>>>    create mode 100644 drivers/power/pstate/meson.build
>>>    rename lib/power/power_pstate_cpufreq.c =>
>> drivers/power/pstate/pstate_cpufreq.c (96%)
>>>    rename lib/power/power_pstate_cpufreq.h =>
>> drivers/power/pstate/pstate_cpufreq.h (98%)
>>>    create mode 100644 lib/power/rte_power_cpufreq_api.h
>>>
>>> diff --git a/drivers/meson.build b/drivers/meson.build index
>>> 2733306698..7ef4f581a0 100644
>>> --- a/drivers/meson.build
>>> +++ b/drivers/meson.build
>>> @@ -29,6 +29,7 @@ subdirs = [
>>>            'event',          # depends on common, bus, mempool and net.
>>>            'baseband',       # depends on common and bus.
>>>            'gpu',            # depends on common and bus.
>>> +        'power',          # depends on common (in future).
>>>    ]
>>>
>>>    if meson.is_cross_build()
>>> diff --git a/lib/power/power_acpi_cpufreq.c
>>> b/drivers/power/acpi/acpi_cpufreq.c
>>> similarity index 95%
>>> rename from lib/power/power_acpi_cpufreq.c rename to
>>> drivers/power/acpi/acpi_cpufreq.c index ae809fbb60..974fbb7ba8 100644
>>> --- a/lib/power/power_acpi_cpufreq.c
>>> +++ b/drivers/power/acpi/acpi_cpufreq.c
>>> @@ -10,7 +10,7 @@
>>>    #include <rte_stdatomic.h>
>>>    #include <rte_string_fns.h>
>>>
>>> -#include "power_acpi_cpufreq.h"
>>> +#include "acpi_cpufreq.h"
>>>    #include "power_common.h"
>>>
>> <...>
>>> diff --git a/lib/power/power_common.c b/lib/power/power_common.c index
>>> b47c63a5f1..e482f71c64 100644
>>> --- a/lib/power/power_common.c
>>> +++ b/lib/power/power_common.c
>>> @@ -13,7 +13,7 @@
>>>
>>>    #include "power_common.h"
>>>
>>> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
>>> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>>>
>>>    #define POWER_SYSFILE_SCALING_DRIVER   \
>>>                "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
>>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h index
>>> 82fb94d0c0..c294f561bb 100644
>>> --- a/lib/power/power_common.h
>>> +++ b/lib/power/power_common.h
>>> @@ -6,12 +6,13 @@
>>>    #define _POWER_COMMON_H_
>>>
>>>    #include <rte_common.h>
>>> +#include <rte_compat.h>
>>>    #include <rte_log.h>
>>>
>>>    #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>>>
>>> -extern int power_logtype;
>>> -#define RTE_LOGTYPE_POWER power_logtype
>>> +extern int rte_power_logtype;
>>> +#define RTE_LOGTYPE_POWER rte_power_logtype
>>>    #define POWER_LOG(level, ...) \
>>>        RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>>>
>>> @@ -23,14 +24,27 @@ extern int power_logtype;
>>>    #endif
>>>
>>>    /* check if scaling driver matches one we want */
>>> +__rte_internal
>>>    int cpufreq_check_scaling_driver(const char *driver);
>>> +
>>> +__rte_internal
>>>    int power_set_governor(unsigned int lcore_id, const char *new_governor,
>>>                char *orig_governor, size_t orig_governor_len);
>> cpufreq_check_scaling_driver and power_set_governor are only used for cpufreq,
>> so they shouldn't be put in this common header file.
>> We came to an agreement on this in patch v2 1/4.
>> I guess you forgot it😁
>> I suggest moving these two APIs to rte_power_cpufreq_api.h.
> OK!
>>> +
>>> +__rte_internal
>>>    int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
>>>                __rte_format_printf(3, 4);
>>> +
>>> +__rte_internal
>>>    int read_core_sysfs_u32(FILE *f, uint32_t *val);
>>> +
>>> +__rte_internal
>>>    int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
>>> +
>>> +__rte_internal
>>>    int write_core_sysfs_s(FILE *f, const char *str);
>>> +
>>> +__rte_internal
>>>   int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
>>>
>>>    #endif /* _POWER_COMMON_H_ */
>>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>>> index 36c3f3da98..416f0148a3 100644
>>> --- a/lib/power/rte_power.c
>>> +++ b/lib/power/rte_power.c
>>> @@ -6,155 +6,88 @@
>>>
>>>    #include <rte_errno.h>
>>>    #include <rte_spinlock.h>
>>> +#include <rte_debug.h>
>>>
>>>    #include "rte_power.h"
>>> -#include "power_acpi_cpufreq.h"
>>> -#include "power_cppc_cpufreq.h"
>>>    #include "power_common.h"
>>> -#include "power_kvm_vm.h"
>>> -#include "power_pstate_cpufreq.h"
>>> -#include "power_amd_pstate_cpufreq.h"
>>>
>>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static struct rte_power_core_ops *global_power_core_ops;
>>>
>>>    static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
>>> -
>>> -/* function pointers */
>>> -rte_power_freqs_t rte_power_freqs  = NULL;
>>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>>> -rte_power_freq_change_t rte_power_turbo_status;
>>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>>> -
>>> -static void
>>> -reset_power_function_ptrs(void)
>>> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
>>> +                     TAILQ_HEAD_INITIALIZER(core_ops_list);
>>> +
>>> +const char *power_env_str[] = {
>>> +     "not set",
>>> +     "acpi",
>>> +     "kvm-vm",
>>> +     "pstate",
>>> +     "cppc",
>>> +     "amd-pstate"
>>> +};
>>> +
>> <...>
>>> +uint32_t
>>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
>>> +}
>>> +
>>> +uint32_t
>>> +rte_power_get_freq(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->get_freq(lcore_id);
>>> +}
>>> +
>>> +uint32_t
>>> +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->set_freq(lcore_id, index);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_up(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->freq_up(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_down(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->freq_down(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_max(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->freq_max(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_min(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->freq_min(lcore_id);
>>> +}
>>>
>>> +int
>>> +rte_power_turbo_status(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->turbo_status(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_enable_turbo(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->enable_turbo(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_disable_turbo(unsigned int lcore_id)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->disable_turbo(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_get_capabilities(unsigned int lcore_id,
>>> +		struct rte_power_core_capabilities *caps)
>>> +{
>>> +	RTE_ASSERT(global_power_core_ops != NULL);
>>> +	return global_power_core_ops->get_caps(lcore_id, caps);
>>>    }
>>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
>>> index 4fa4afe399..e9a72b92ad 100644
>>> --- a/lib/power/rte_power.h
>>> +++ b/lib/power/rte_power.h
>>> @@ -1,5 +1,6 @@
>>>    /* SPDX-License-Identifier: BSD-3-Clause
>>>     * Copyright(c) 2010-2014 Intel Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>>     */
>>>
>>>    #ifndef _RTE_POWER_H
>>> @@ -14,14 +15,21 @@
>>>    #include <rte_log.h>
>>>    #include <rte_power_guest_channel.h>
>>>
>>> +#include "rte_power_cpufreq_api.h"
>>   From their names, rte_power.c and rte_power.h are supposed to work for all
>> power libraries, as I also proposed in a previous version.
>> But rte_power.* currently only works for the cpufreq lib, even though we may
>> need to put all power components together later.
>> Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
>> how about directly renaming rte_power.c to rte_power_cpufreq_api.c and
>> rte_power.h to rte_power_cpufreq_api.h?
>> There will be ABI changes, but that is allowed in this 24.11 release. If we
>> plan to do it later, we'll have to wait another year.
> Yes, I split rte_power.h as part of the refactor to avoid exposing internal functions.
> Renaming rte_power.* to rte_power_cpufreq.* can be considered, but not merging it with rte_power_cpufreq_api.h.
What is your plan? I feel it is not very hard; it is just renaming the file.
>>> +
>>>    #ifdef __cplusplus
>>>    extern "C" {
>>>    #endif
>>>
>>>    /* Power Management Environment State */
>>> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
>>> -		PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
>>> -		PM_ENV_AMD_PSTATE_CPUFREQ};
>>> +enum power_management_env {
>>> +     PM_ENV_NOT_SET = 0,
>>> +     PM_ENV_ACPI_CPUFREQ,
>>> +     PM_ENV_KVM_VM,
>>> +     PM_ENV_PSTATE_CPUFREQ,
>>> +     PM_ENV_CPPC_CPUFREQ,
>>> +     PM_ENV_AMD_PSTATE_CPUFREQ
>>> +};
>>>
>> <...>
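The ops-table dispatch being reviewed in this thread can be sketched in plain C. This is a hypothetical, simplified model, not the patch's actual code: names such as `register_power_ops` and `set_power_env` are illustrative stand-ins for the DPDK registration/constructor machinery, and plain `assert()` stands in for `RTE_ASSERT`. The point it shows is the one discussed above: drivers register an ops struct on a tailq, and the public API forwards every call through one global ops pointer once an environment is selected.

```c
#include <assert.h>
#include <string.h>
#include <sys/queue.h>

/* One ops table per cpufreq backend; only one callback shown for brevity. */
struct power_core_ops {
	const char *name;                            /* e.g. "acpi", "pstate" */
	unsigned int (*get_freq)(unsigned int lcore_id);
	TAILQ_ENTRY(power_core_ops) next;
};

static TAILQ_HEAD(, power_core_ops) core_ops_list =
		TAILQ_HEAD_INITIALIZER(core_ops_list);
static struct power_core_ops *global_power_core_ops; /* NULL until env set */

/* Drivers call this at init time (DPDK would do it from a constructor). */
static void register_power_ops(struct power_core_ops *ops)
{
	TAILQ_INSERT_TAIL(&core_ops_list, ops, next);
}

/* Selecting an environment walks the list and latches the matching ops. */
static int set_power_env(const char *name)
{
	struct power_core_ops *ops;

	TAILQ_FOREACH(ops, &core_ops_list, next) {
		if (strcmp(ops->name, name) == 0) {
			global_power_core_ops = ops;
			return 0;
		}
	}
	return -1; /* no driver with that name registered */
}

/* Public wrapper: asserts an env was chosen, then forwards the call. */
static unsigned int power_get_freq(unsigned int lcore_id)
{
	assert(global_power_core_ops != NULL);
	return global_power_core_ops->get_freq(lcore_id);
}

/* A fake driver standing in for a real backend such as acpi cpufreq. */
static unsigned int fake_get_freq(unsigned int lcore_id)
{
	return 1000U + lcore_id; /* arbitrary per-lcore value for the demo */
}

static struct power_core_ops fake_ops = {
	.name = "fake",
	.get_freq = fake_get_freq,
};
```

This also illustrates why the wrappers assert rather than NULL-check: a call before `set_power_env()` succeeds is an API misuse, not a runtime condition.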

^ permalink raw reply	[relevance 0%]

* RE: [PATCH v7 1/5] power: refactor core power management library
  2024-10-22  3:03  3%     ` lihuisong (C)
@ 2024-10-22  7:13  0%       ` Tummala, Sivaprasad
  2024-10-22  8:36  0%         ` lihuisong (C)
  0 siblings, 1 reply; 200+ results
From: Tummala, Sivaprasad @ 2024-10-22  7:13 UTC (permalink / raw)
  To: lihuisong (C), david.hunt, konstantin.ananyev
  Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
	cristian.dumitrescu, Yigit, Ferruh

Hi Huisong,

Please find my comments inline.

> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Tuesday, October 22, 2024 8:33 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> david.hunt@intel.com; konstantin.ananyev@huawei.com
> Cc: dev@dpdk.org; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>
> Subject: Re: [PATCH v7 1/5] power: refactor core power management library
>
> Hi Sivaprasad,
>
> Some comments inline.
>
> On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> >
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
> >
> > v6:
> >   - fixed compilation error with symbol export in API
> >   - exported power_get_lcore_mapped_cpu_id as internal API to be
> >     used in drivers/power/*
> >
> > v5:
> >   - fixed code style warning
> >
> > v4:
> >   - fixed build error with RTE_ASSERT
> >
> > v3:
> >   - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
> >   - re-worked on auto detection logic
> >
> > v2:
> >   - added NULL check for global_core_ops in rte_power_get_core_ops
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> >   drivers/meson.build                           |   1 +
> >   .../power/acpi/acpi_cpufreq.c                 |  22 +-
> >   .../power/acpi/acpi_cpufreq.h                 |   6 +-
> >   drivers/power/acpi/meson.build                |  10 +
> >   .../power/amd_pstate/amd_pstate_cpufreq.c     |  24 +-
> >   .../power/amd_pstate/amd_pstate_cpufreq.h     |  10 +-
> >   drivers/power/amd_pstate/meson.build          |  10 +
> >   .../power/cppc/cppc_cpufreq.c                 |  22 +-
> >   .../power/cppc/cppc_cpufreq.h                 |   8 +-
> >   drivers/power/cppc/meson.build                |  10 +
> >   .../power/kvm_vm}/guest_channel.c             |   0
> >   .../power/kvm_vm}/guest_channel.h             |   0
> >   .../power/kvm_vm/kvm_vm.c                     |  22 +-
> >   .../power/kvm_vm/kvm_vm.h                     |   6 +-
> >   drivers/power/kvm_vm/meson.build              |  14 +
> >   drivers/power/meson.build                     |  12 +
> >   drivers/power/pstate/meson.build              |  10 +
> >   .../power/pstate/pstate_cpufreq.c             |  22 +-
> >   .../power/pstate/pstate_cpufreq.h             |   6 +-
> >   lib/power/meson.build                         |   7 +-
> >   lib/power/power_common.c                      |   2 +-
> >   lib/power/power_common.h                      |  18 +-
> >   lib/power/rte_power.c                         | 355 ++++++++----------
> >   lib/power/rte_power.h                         | 116 +++---
> >   lib/power/rte_power_cpufreq_api.h             | 206 ++++++++++
> >   lib/power/version.map                         |  15 +
> >   26 files changed, 665 insertions(+), 269 deletions(-)
> >   rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> >   rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> >   create mode 100644 drivers/power/acpi/meson.build
> >   rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> >   rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
> >   create mode 100644 drivers/power/amd_pstate/meson.build
> >   rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> >   rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> >   create mode 100644 drivers/power/cppc/meson.build
> >   rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> >   rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> >   rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> >   rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> >   create mode 100644 drivers/power/kvm_vm/meson.build
> >   create mode 100644 drivers/power/meson.build
> >   create mode 100644 drivers/power/pstate/meson.build
> >   rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> >   rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> >   create mode 100644 lib/power/rte_power_cpufreq_api.h
> >
> > diff --git a/drivers/meson.build b/drivers/meson.build
> > index 2733306698..7ef4f581a0 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -29,6 +29,7 @@ subdirs = [
> >           'event',          # depends on common, bus, mempool and net.
> >           'baseband',       # depends on common and bus.
> >           'gpu',            # depends on common and bus.
> > +        'power',          # depends on common (in future).
> >   ]
> >
> >   if meson.is_cross_build()
> > diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
> > similarity index 95%
> > rename from lib/power/power_acpi_cpufreq.c
> > rename to drivers/power/acpi/acpi_cpufreq.c
> > index ae809fbb60..974fbb7ba8 100644
> > --- a/lib/power/power_acpi_cpufreq.c
> > +++ b/drivers/power/acpi/acpi_cpufreq.c
> > @@ -10,7 +10,7 @@
> >   #include <rte_stdatomic.h>
> >   #include <rte_string_fns.h>
> >
> > -#include "power_acpi_cpufreq.h"
> > +#include "acpi_cpufreq.h"
> >   #include "power_common.h"
> >
> <...>
> > diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> > index b47c63a5f1..e482f71c64 100644
> > --- a/lib/power/power_common.c
> > +++ b/lib/power/power_common.c
> > @@ -13,7 +13,7 @@
> >
> >   #include "power_common.h"
> >
> > -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> > +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
> >
> >   #define POWER_SYSFILE_SCALING_DRIVER   \
> >               "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> > diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> > index 82fb94d0c0..c294f561bb 100644
> > --- a/lib/power/power_common.h
> > +++ b/lib/power/power_common.h
> > @@ -6,12 +6,13 @@
> >   #define _POWER_COMMON_H_
> >
> >   #include <rte_common.h>
> > +#include <rte_compat.h>
> >   #include <rte_log.h>
> >
> >   #define RTE_POWER_INVALID_FREQ_INDEX (~0)
> >
> > -extern int power_logtype;
> > -#define RTE_LOGTYPE_POWER power_logtype
> > +extern int rte_power_logtype;
> > +#define RTE_LOGTYPE_POWER rte_power_logtype
> >   #define POWER_LOG(level, ...) \
> >       RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
> >
> > @@ -23,14 +24,27 @@ extern int power_logtype;
> >   #endif
> >
> >   /* check if scaling driver matches one we want */
> > +__rte_internal
> >   int cpufreq_check_scaling_driver(const char *driver);
> > +
> > +__rte_internal
> >   int power_set_governor(unsigned int lcore_id, const char *new_governor,
> >               char *orig_governor, size_t orig_governor_len);
> cpufreq_check_scaling_driver and power_set_governor are only used for cpufreq,
> so they shouldn't be put in this common header file.
> We came to an agreement on this in patch v2 1/4.
> I guess you forgot it😁
> I suggest moving these two APIs to rte_power_cpufreq_api.h.
OK!
> > +
> > +__rte_internal
> >   int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> >               __rte_format_printf(3, 4);
> > +
> > +__rte_internal
> >   int read_core_sysfs_u32(FILE *f, uint32_t *val);
> > +
> > +__rte_internal
> >   int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> > +
> > +__rte_internal
> >   int write_core_sysfs_s(FILE *f, const char *str);
> > +
> > +__rte_internal
> >   int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
> >
> >   #endif /* _POWER_COMMON_H_ */
> > diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> > index 36c3f3da98..416f0148a3 100644
> > --- a/lib/power/rte_power.c
> > +++ b/lib/power/rte_power.c
> > @@ -6,155 +6,88 @@
> >
> >   #include <rte_errno.h>
> >   #include <rte_spinlock.h>
> > +#include <rte_debug.h>
> >
> >   #include "rte_power.h"
> > -#include "power_acpi_cpufreq.h"
> > -#include "power_cppc_cpufreq.h"
> >   #include "power_common.h"
> > -#include "power_kvm_vm.h"
> > -#include "power_pstate_cpufreq.h"
> > -#include "power_amd_pstate_cpufreq.h"
> >
> > -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static struct rte_power_core_ops *global_power_core_ops;
> >
> >   static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> > -
> > -/* function pointers */
> > -rte_power_freqs_t rte_power_freqs  = NULL;
> > -rte_power_get_freq_t rte_power_get_freq = NULL;
> > -rte_power_set_freq_t rte_power_set_freq = NULL;
> > -rte_power_freq_change_t rte_power_freq_up = NULL;
> > -rte_power_freq_change_t rte_power_freq_down = NULL;
> > -rte_power_freq_change_t rte_power_freq_max = NULL;
> > -rte_power_freq_change_t rte_power_freq_min = NULL;
> > -rte_power_freq_change_t rte_power_turbo_status;
> > -rte_power_freq_change_t rte_power_freq_enable_turbo;
> > -rte_power_freq_change_t rte_power_freq_disable_turbo;
> > -rte_power_get_capabilities_t rte_power_get_capabilities;
> > -
> > -static void
> > -reset_power_function_ptrs(void)
> > +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> > +                     TAILQ_HEAD_INITIALIZER(core_ops_list);
> > +
> > +const char *power_env_str[] = {
> > +     "not set",
> > +     "acpi",
> > +     "kvm-vm",
> > +     "pstate",
> > +     "cppc",
> > +     "amd-pstate"
> > +};
> > +
>
> <...>
> > +uint32_t
> > +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
> > +}
> > +
> > +uint32_t
> > +rte_power_get_freq(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->get_freq(lcore_id);
> > +}
> > +
> > +uint32_t
> > +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->set_freq(lcore_id, index);
> > +}
> > +
> > +int
> > +rte_power_freq_up(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->freq_up(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_down(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->freq_down(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_max(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->freq_max(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_min(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->freq_min(lcore_id);
> > +}
> >
> > +int
> > +rte_power_turbo_status(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->turbo_status(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_enable_turbo(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->enable_turbo(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_disable_turbo(unsigned int lcore_id)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->disable_turbo(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_get_capabilities(unsigned int lcore_id,
> > +		struct rte_power_core_capabilities *caps)
> > +{
> > +	RTE_ASSERT(global_power_core_ops != NULL);
> > +	return global_power_core_ops->get_caps(lcore_id, caps);
> >   }
> > diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> > index 4fa4afe399..e9a72b92ad 100644
> > --- a/lib/power/rte_power.h
> > +++ b/lib/power/rte_power.h
> > @@ -1,5 +1,6 @@
> >   /* SPDX-License-Identifier: BSD-3-Clause
> >    * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> >    */
> >
> >   #ifndef _RTE_POWER_H
> > @@ -14,14 +15,21 @@
> >   #include <rte_log.h>
> >   #include <rte_power_guest_channel.h>
> >
> > +#include "rte_power_cpufreq_api.h"
>  From their names, rte_power.c and rte_power.h are supposed to work for all
> power libraries, as I also proposed in a previous version.
> But rte_power.* currently only works for the cpufreq lib, even though we may
> need to put all power components together later.
> Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
> how about directly renaming rte_power.c to rte_power_cpufreq_api.c and
> rte_power.h to rte_power_cpufreq_api.h?
> There will be ABI changes, but that is allowed in this 24.11 release. If we
> plan to do it later, we'll have to wait another year.
Yes, I split rte_power.h as part of the refactor to avoid exposing internal functions.
Renaming rte_power.* to rte_power_cpufreq.* can be considered, but not merging it with rte_power_cpufreq_api.h.
> > +
> >   #ifdef __cplusplus
> >   extern "C" {
> >   #endif
> >
> >   /* Power Management Environment State */
> > -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> > -		PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> > -		PM_ENV_AMD_PSTATE_CPUFREQ};
> > +enum power_management_env {
> > +     PM_ENV_NOT_SET = 0,
> > +     PM_ENV_ACPI_CPUFREQ,
> > +     PM_ENV_KVM_VM,
> > +     PM_ENV_PSTATE_CPUFREQ,
> > +     PM_ENV_CPPC_CPUFREQ,
> > +     PM_ENV_AMD_PSTATE_CPUFREQ
> > +};
> >
> <...>
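The commit message above says the meson.build adjustments enable "selective activation of individual drivers". A minimal sketch of what a driver-class drivers/power/meson.build could look like follows; the layout mirrors the convention used by other DPDK driver classes, but the exact contents here are assumed for illustration rather than taken from the patch:

```meson
# Hypothetical sketch: list each cpufreq backend as a separately
# buildable driver, so meson can enable or disable each one.
drivers = [
        'acpi',
        'amd_pstate',
        'cppc',
        'kvm_vm',
        'pstate',
]
std_deps = ['power']    # every power driver links against lib/power
```

With this layout, a backend can be excluded from a build (for example via meson's driver enable/disable options) without touching lib/power itself, which is the maintainability gain the commit message describes.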

^ permalink raw reply	[relevance 0%]

* Re: [PATCH v7 1/5] power: refactor core power management library
  @ 2024-10-22  3:03  3%     ` lihuisong (C)
  2024-10-22  7:13  0%       ` Tummala, Sivaprasad
  0 siblings, 1 reply; 200+ results
From: lihuisong (C) @ 2024-10-22  3:03 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt, konstantin.ananyev
  Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
	cristian.dumitrescu, ferruh.yigit

Hi Sivaprasad,

Some comments inline.

On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> v6:
>   - fixed compilation error with symbol export in API
>   - exported power_get_lcore_mapped_cpu_id as internal API to be
>     used in drivers/power/*
>
> v5:
>   - fixed code style warning
>
> v4:
>   - fixed build error with RTE_ASSERT
>
> v3:
>   - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
>   - re-worked on auto detection logic
>
> v2:
>   - added NULL check for global_core_ops in rte_power_get_core_ops
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
>   drivers/meson.build                           |   1 +
>   .../power/acpi/acpi_cpufreq.c                 |  22 +-
>   .../power/acpi/acpi_cpufreq.h                 |   6 +-
>   drivers/power/acpi/meson.build                |  10 +
>   .../power/amd_pstate/amd_pstate_cpufreq.c     |  24 +-
>   .../power/amd_pstate/amd_pstate_cpufreq.h     |  10 +-
>   drivers/power/amd_pstate/meson.build          |  10 +
>   .../power/cppc/cppc_cpufreq.c                 |  22 +-
>   .../power/cppc/cppc_cpufreq.h                 |   8 +-
>   drivers/power/cppc/meson.build                |  10 +
>   .../power/kvm_vm}/guest_channel.c             |   0
>   .../power/kvm_vm}/guest_channel.h             |   0
>   .../power/kvm_vm/kvm_vm.c                     |  22 +-
>   .../power/kvm_vm/kvm_vm.h                     |   6 +-
>   drivers/power/kvm_vm/meson.build              |  14 +
>   drivers/power/meson.build                     |  12 +
>   drivers/power/pstate/meson.build              |  10 +
>   .../power/pstate/pstate_cpufreq.c             |  22 +-
>   .../power/pstate/pstate_cpufreq.h             |   6 +-
>   lib/power/meson.build                         |   7 +-
>   lib/power/power_common.c                      |   2 +-
>   lib/power/power_common.h                      |  18 +-
>   lib/power/rte_power.c                         | 355 ++++++++----------
>   lib/power/rte_power.h                         | 116 +++---
>   lib/power/rte_power_cpufreq_api.h             | 206 ++++++++++
>   lib/power/version.map                         |  15 +
>   26 files changed, 665 insertions(+), 269 deletions(-)
>   rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
>   rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
>   create mode 100644 drivers/power/acpi/meson.build
>   rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>   rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
>   create mode 100644 drivers/power/amd_pstate/meson.build
>   rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
>   rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
>   create mode 100644 drivers/power/cppc/meson.build
>   rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>   rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>   rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
>   rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
>   create mode 100644 drivers/power/kvm_vm/meson.build
>   create mode 100644 drivers/power/meson.build
>   create mode 100644 drivers/power/pstate/meson.build
>   rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
>   rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
>   create mode 100644 lib/power/rte_power_cpufreq_api.h
>
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 2733306698..7ef4f581a0 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -29,6 +29,7 @@ subdirs = [
>           'event',          # depends on common, bus, mempool and net.
>           'baseband',       # depends on common and bus.
>           'gpu',            # depends on common and bus.
> +        'power',          # depends on common (in future).
>   ]
>   
>   if meson.is_cross_build()
> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
> similarity index 95%
> rename from lib/power/power_acpi_cpufreq.c
> rename to drivers/power/acpi/acpi_cpufreq.c
> index ae809fbb60..974fbb7ba8 100644
> --- a/lib/power/power_acpi_cpufreq.c
> +++ b/drivers/power/acpi/acpi_cpufreq.c
> @@ -10,7 +10,7 @@
>   #include <rte_stdatomic.h>
>   #include <rte_string_fns.h>
>   
> -#include "power_acpi_cpufreq.h"
> +#include "acpi_cpufreq.h"
>   #include "power_common.h"
>   
<...>
> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> index b47c63a5f1..e482f71c64 100644
> --- a/lib/power/power_common.c
> +++ b/lib/power/power_common.c
> @@ -13,7 +13,7 @@
>   
>   #include "power_common.h"
>   
> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>   
>   #define POWER_SYSFILE_SCALING_DRIVER   \
>   		"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> index 82fb94d0c0..c294f561bb 100644
> --- a/lib/power/power_common.h
> +++ b/lib/power/power_common.h
> @@ -6,12 +6,13 @@
>   #define _POWER_COMMON_H_
>   
>   #include <rte_common.h>
> +#include <rte_compat.h>
>   #include <rte_log.h>
>   
>   #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>   
> -extern int power_logtype;
> -#define RTE_LOGTYPE_POWER power_logtype
> +extern int rte_power_logtype;
> +#define RTE_LOGTYPE_POWER rte_power_logtype
>   #define POWER_LOG(level, ...) \
>   	RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>   
> @@ -23,14 +24,27 @@ extern int power_logtype;
>   #endif
>   
>   /* check if scaling driver matches one we want */
> +__rte_internal
>   int cpufreq_check_scaling_driver(const char *driver);
> +
> +__rte_internal
>   int power_set_governor(unsigned int lcore_id, const char *new_governor,
>   		char *orig_governor, size_t orig_governor_len);
cpufreq_check_scaling_driver and power_set_governor are only used for
cpufreq, so they shouldn't be put in this common header file.
We came to an agreement on this in patch v2 1/4.
I guess you forgot it😁
I suggest moving these two APIs to rte_power_cpufreq_api.h.
> +
> +__rte_internal
>   int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
>   		__rte_format_printf(3, 4);
> +
> +__rte_internal
>   int read_core_sysfs_u32(FILE *f, uint32_t *val);
> +
> +__rte_internal
>   int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> +
> +__rte_internal
>   int write_core_sysfs_s(FILE *f, const char *str);
> +
> +__rte_internal
>   int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
>   
>   #endif /* _POWER_COMMON_H_ */
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> index 36c3f3da98..416f0148a3 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -6,155 +6,88 @@
>   
>   #include <rte_errno.h>
>   #include <rte_spinlock.h>
> +#include <rte_debug.h>
>   
>   #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
>   #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>   
> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static struct rte_power_core_ops *global_power_core_ops;
>   
>   static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> -
> -/* function pointers */
> -rte_power_freqs_t rte_power_freqs  = NULL;
> -rte_power_get_freq_t rte_power_get_freq = NULL;
> -rte_power_set_freq_t rte_power_set_freq = NULL;
> -rte_power_freq_change_t rte_power_freq_up = NULL;
> -rte_power_freq_change_t rte_power_freq_down = NULL;
> -rte_power_freq_change_t rte_power_freq_max = NULL;
> -rte_power_freq_change_t rte_power_freq_min = NULL;
> -rte_power_freq_change_t rte_power_turbo_status;
> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> -rte_power_get_capabilities_t rte_power_get_capabilities;
> -
> -static void
> -reset_power_function_ptrs(void)
> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> +			TAILQ_HEAD_INITIALIZER(core_ops_list);
> +
> +const char *power_env_str[] = {
> +	"not set",
> +	"acpi",
> +	"kvm-vm",
> +	"pstate",
> +	"cppc",
> +	"amd-pstate"
> +};
> +

<...>
> +uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
> +}
> +
> +uint32_t
> +rte_power_get_freq(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->get_freq(lcore_id);
> +}
> +
> +uint32_t
> +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->set_freq(lcore_id, index);
> +}
> +
> +int
> +rte_power_freq_up(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->freq_up(lcore_id);
> +}
> +
> +int
> +rte_power_freq_down(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->freq_down(lcore_id);
> +}
> +
> +int
> +rte_power_freq_max(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->freq_max(lcore_id);
> +}
> +
> +int
> +rte_power_freq_min(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->freq_min(lcore_id);
> +}
>   
> +int
> +rte_power_turbo_status(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->turbo_status(lcore_id);
> +}
> +
> +int
> +rte_power_freq_enable_turbo(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->enable_turbo(lcore_id);
> +}
> +
> +int
> +rte_power_freq_disable_turbo(unsigned int lcore_id)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->disable_turbo(lcore_id);
> +}
> +
> +int
> +rte_power_get_capabilities(unsigned int lcore_id,
> +		struct rte_power_core_capabilities *caps)
> +{
> +	RTE_ASSERT(global_power_core_ops != NULL);
> +	return global_power_core_ops->get_caps(lcore_id, caps);
>   }
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> index 4fa4afe399..e9a72b92ad 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power.h
> @@ -1,5 +1,6 @@
>   /* SPDX-License-Identifier: BSD-3-Clause
>    * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>    */
>   
>   #ifndef _RTE_POWER_H
> @@ -14,14 +15,21 @@
>   #include <rte_log.h>
>   #include <rte_power_guest_channel.h>
>   
> +#include "rte_power_cpufreq_api.h"
From their names, rte_power.c and rte_power.h are supposed to work 
for all the power libraries I also proposed in a previous version,
but rte_power.* currently only covers the cpufreq library.
Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
how about directly renaming rte_power.c to rte_power_cpufreq_api.c and 
rte_power.h to rte_power_cpufreq_api.h?
There would be ABI changes, but that is allowed in this 24.11 release. If 
we plan to do it later, we'll have to wait another year.
> +
>   #ifdef __cplusplus
>   extern "C" {
>   #endif
>   
>   /* Power Management Environment State */
> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> -		PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> -		PM_ENV_AMD_PSTATE_CPUFREQ};
> +enum power_management_env {
> +	PM_ENV_NOT_SET = 0,
> +	PM_ENV_ACPI_CPUFREQ,
> +	PM_ENV_KVM_VM,
> +	PM_ENV_PSTATE_CPUFREQ,
> +	PM_ENV_CPPC_CPUFREQ,
> +	PM_ENV_AMD_PSTATE_CPUFREQ
> +};
>   
<...>

^ permalink raw reply	[relevance 3%]

* Re: [PATCH v2 2/4] power: refactor uncore power management library
  @ 2024-10-22  2:05  0%         ` lihuisong (C)
  0 siblings, 0 replies; 200+ results
From: lihuisong (C) @ 2024-10-22  2:05 UTC (permalink / raw)
  To: Tummala, Sivaprasad
  Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
	cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil

Hi Sivaprasad,

I have an inline question, please take a look.

在 2024/10/8 14:19, Tummala, Sivaprasad 写道:
>
> Hi Lihuisong,
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Tuesday, August 27, 2024 6:33 PM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>> radu.nicolau@intel.com; jerinj@marvell.com; cristian.dumitrescu@intel.com;
>> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
>> gakhil@marvell.com
>> Subject: Re: [PATCH v2 2/4] power: refactor uncore power management library
>>
>>
>>
>> Hi Sivaprasad,
>>
>> Suggest splitting this patch into two patches to make it easier to review:
>> patch-1: abstract a file for uncore dvfs core level, namely, the
>> rte_power_uncore_ops.c you did.
>> patch-2: move and rename, lib/power/power_intel_uncore.c =>
>> drivers/power/intel_uncore/intel_uncore.c
>>
>> Patch [1/4] is also too big and not easy to review.
>>
>> In addition, I have some questions and am not sure if we can adjust the uncore init
>> process.
>>
>> /Huisong
>>
>>
>> 在 2024/8/26 21:06, Sivaprasad Tummala 写道:
>>> This patch refactors the power management library, addressing uncore
>>> power management. The primary changes involve the creation of
>>> dedicated directories for each driver within 'drivers/power/uncore/*'.
>>> The adjustment of meson.build files enables the selective activation
>>> of individual drivers.
>>>
>>> This refactor significantly improves code organization, enhances
>>> clarity and boosts maintainability. It lays the foundation for more
>>> focused development on individual drivers and facilitates seamless
>>> integration of future enhancements, particularly the AMD uncore driver.
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>>>    .../power/intel_uncore/intel_uncore.c         |  18 +-
>>>    .../power/intel_uncore/intel_uncore.h         |   8 +-
>>>    drivers/power/intel_uncore/meson.build        |   6 +
>>>    drivers/power/meson.build                     |   3 +-
>>>    lib/power/meson.build                         |   2 +-
>>>    lib/power/rte_power_uncore.c                  | 205 ++++++---------
>>>    lib/power/rte_power_uncore.h                  |  87 ++++---
>>>    lib/power/rte_power_uncore_ops.h              | 239 ++++++++++++++++++
>>>    lib/power/version.map                         |   1 +
>>>    9 files changed, 405 insertions(+), 164 deletions(-)
>>>    rename lib/power/power_intel_uncore.c =>
>> drivers/power/intel_uncore/intel_uncore.c (95%)
>>>    rename lib/power/power_intel_uncore.h =>
>> drivers/power/intel_uncore/intel_uncore.h (97%)
>>>    create mode 100644 drivers/power/intel_uncore/meson.build
>>>    create mode 100644 lib/power/rte_power_uncore_ops.h
>>>
>>> diff --git a/lib/power/power_intel_uncore.c
>>> b/drivers/power/intel_uncore/intel_uncore.c
>>> similarity index 95%
>>> rename from lib/power/power_intel_uncore.c rename to
>>> drivers/power/intel_uncore/intel_uncore.c
>>> index 4eb9c5900a..804ad5d755 100644
>>> --- a/lib/power/power_intel_uncore.c
>>> +++ b/drivers/power/intel_uncore/intel_uncore.c
>>> @@ -8,7 +8,7 @@
>>>
>>>    #include <rte_memcpy.h>
>>>
>>> -#include "power_intel_uncore.h"
>>> +#include "intel_uncore.h"
>>>    #include "power_common.h"
>>>
>>>    #define MAX_NUMA_DIE 8
>>> @@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
>>>
>>>        return count;
>>>    }
>> <...>
>>> -#endif /* POWER_INTEL_UNCORE_H */
>>> +#endif /* INTEL_UNCORE_H */
>>> diff --git a/drivers/power/intel_uncore/meson.build
>>> b/drivers/power/intel_uncore/meson.build
>>> new file mode 100644
>>> index 0000000000..876df8ad14
>>> --- /dev/null
>>> +++ b/drivers/power/intel_uncore/meson.build
>>> @@ -0,0 +1,6 @@
>>> +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2017 Intel
>>> +Corporation # Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> +
>>> +sources = files('intel_uncore.c')
>>> +deps += ['power']
>>> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
>>> index 8c7215c639..c83047af94 100644
>>> --- a/drivers/power/meson.build
>>> +++ b/drivers/power/meson.build
>>> @@ -6,7 +6,8 @@ drivers = [
>>>            'amd_pstate',
>>>            'cppc',
>>>            'kvm_vm',
>>> -        'pstate'
>>> +        'pstate',
>>> +        'intel_uncore'
>> The cppc, amd_pstate and so on belong to cpufreq scope.
>> And intel_uncore belongs to uncore dvfs scope.
>> They are not at the same level, so I propose that we create one directory
>> named something like cpufreq or core.
>> This 'intel_uncore' name doesn't seem appropriate. What do you think of the
>> following directory structure:
>> drivers/power/uncore/intel_uncore.c
>> drivers/power/uncore/amd_uncore.c (according to the patch[4/4]).
> At present, Meson does not support detecting an additional level of subdirectories within drivers/*.
> All the drivers maintain a consistent subdirectory structure.
>>>    ]
>>>    std_deps = ['power']
>>> diff --git a/lib/power/meson.build b/lib/power/meson.build index
>>> f3e3451cdc..9b13d98810 100644
>>> --- a/lib/power/meson.build
>>> +++ b/lib/power/meson.build
>>> @@ -13,7 +13,6 @@ if not is_linux
>>>    endif
>>>    sources = files(
>>>            'power_common.c',
>>> -        'power_intel_uncore.c',
>>>            'rte_power.c',
>>>            'rte_power_uncore.c',
>>>            'rte_power_pmd_mgmt.c',
>>> @@ -24,6 +23,7 @@ headers = files(
>>>            'rte_power_guest_channel.h',
>>>            'rte_power_pmd_mgmt.h',
>>>            'rte_power_uncore.h',
>>> +        'rte_power_uncore_ops.h',
>>>    )
>>>    if cc.has_argument('-Wno-cast-qual')
>>>        cflags += '-Wno-cast-qual'
>>> diff --git a/lib/power/rte_power_uncore.c
>>> b/lib/power/rte_power_uncore.c index 48c75a5da0..9f8771224f 100644
>>> --- a/lib/power/rte_power_uncore.c
>>> +++ b/lib/power/rte_power_uncore.c
>>> @@ -1,6 +1,7 @@
>>>    /* SPDX-License-Identifier: BSD-3-Clause
>>>     * Copyright(c) 2010-2014 Intel Corporation
>>>     * Copyright(c) 2023 AMD Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>>     */
>>>
>>>    #include <errno.h>
>>> @@ -12,98 +13,50 @@
>>>    #include "rte_power_uncore.h"
>>>    #include "power_intel_uncore.h"
>>>
>>> -enum rte_uncore_power_mgmt_env default_uncore_env =
>>> RTE_UNCORE_PM_ENV_NOT_SET;
>>> +static enum rte_uncore_power_mgmt_env global_uncore_env =
>>> +RTE_UNCORE_PM_ENV_NOT_SET; static struct rte_power_uncore_ops
>>> +*global_uncore_ops;
>>>
>>>    static rte_spinlock_t global_env_cfg_lock =
>>> RTE_SPINLOCK_INITIALIZER;
>>> +static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
>>> +                     TAILQ_HEAD_INITIALIZER(uncore_ops_list);
>>>
>>> -static uint32_t
>>> -power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
>>> -            unsigned int die __rte_unused)
>>> -{
>>> -     return 0;
>>> -}
>>> -
>>> -static int
>>> -power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
>>> -            unsigned int die __rte_unused, uint32_t index __rte_unused)
>>> -{
>>> -     return 0;
>>> -}
>>> +const char *uncore_env_str[] = {
>>> +     "not set",
>>> +     "auto-detect",
>>> +     "intel-uncore",
>>> +     "amd-hsmp"
>>> +};
>> Why expose the "auto-detect" mode to the user?
>> Why not set this automatically at framework initialization?
>> After all, the uncore driver is fixed for one platform.
> The auto-detection feature has been implemented to enable seamless migration across platforms
> without requiring any changes to the application.
>>> -static int
>>> -power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
>>> -            unsigned int die __rte_unused)
>>> -{
>>> -     return 0;
>>> -}
>>> -
>> <...>
>>> -static int
>>> -power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
>>> -            unsigned int die __rte_unused)
>>> +/* register the ops struct in rte_power_uncore_ops, return 0 on
>>> +success. */ int rte_power_register_uncore_ops(struct
>>> +rte_power_uncore_ops *driver_ops)
>>>    {
>>> -     return 0;
>>> -}
>>> +     if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
>>> +             !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
>>> +             !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
>>> +             !driver_ops->set_freq || !driver_ops->freq_max ||
>>> +             !driver_ops->freq_min) {
>>> +             POWER_LOG(ERR, "Missing callbacks while registering power ops");
>>> +             return -1;
>>> +     }
>>> +     if (driver_ops->cb)
>>> +             driver_ops->cb();
>>>
>>> -static unsigned int
>>> -power_dummy_uncore_get_num_pkgs(void)
>>> -{
>>> -     return 0;
>>> -}
>>> +     TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
>>>
>>> -static unsigned int
>>> -power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused) -{
>>>        return 0;
>>>    }
>>> -
>>> -/* function pointers */
>>> -rte_power_get_uncore_freq_t rte_power_get_uncore_freq =
>>> power_get_dummy_uncore_freq; -rte_power_set_uncore_freq_t
>>> rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
>>> -rte_power_uncore_freq_change_t rte_power_uncore_freq_max =
>>> power_dummy_uncore_freq_max; -rte_power_uncore_freq_change_t
>>> rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
>>> -rte_power_uncore_freqs_t rte_power_uncore_freqs =
>>> power_dummy_uncore_freqs; -rte_power_uncore_get_num_freqs_t
>>> rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
>>> -rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs =
>>> power_dummy_uncore_get_num_pkgs; -rte_power_uncore_get_num_dies_t
>>> rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
>>> -
>>> -static void
>>> -reset_power_uncore_function_ptrs(void)
>>> -{
>>> -     rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
>>> -     rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
>>> -     rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
>>> -     rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
>>> -     rte_power_uncore_freqs  = power_dummy_uncore_freqs;
>>> -     rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
>>> -     rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
>>> -     rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
>>> -}
>>> -
>>>    int
>>>    rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
>>>    {
>>> -     int ret;
>>> +     int ret = -1;
>>> +     struct rte_power_uncore_ops *ops;
>>>
>>>        rte_spinlock_lock(&global_env_cfg_lock);
>>>
>>> -     if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
>>> +     if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
>>>                POWER_LOG(ERR, "Uncore Power Management Env already set.");
>>> -             rte_spinlock_unlock(&global_env_cfg_lock);
>>> -             return -1;
>>> +             goto out;
>>>        }
>>>
>> <...>
>>> +     if (env <= RTE_DIM(uncore_env_str)) {
>>> +             RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
>>> +                     if (strncmp(ops->name, uncore_env_str[env],
>>> +                             RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
>>> +                             global_uncore_env = env;
>>> +                             global_uncore_ops = ops;
>>> +                             ret = 0;
>>> +                             goto out;
>>> +                     }
>>> +             POWER_LOG(ERR, "Power Management (%s) not supported",
>>> +                             uncore_env_str[env]);
>>> +     } else
>>> +             POWER_LOG(ERR, "Invalid Power Management Environment");
>>>
>>> -     default_uncore_env = env;
>>>    out:
>>>        rte_spinlock_unlock(&global_env_cfg_lock);
>>>        return ret;
>>> @@ -139,15 +89,22 @@ void
>>>    rte_power_unset_uncore_env(void)
>>>    {
>>>        rte_spinlock_lock(&global_env_cfg_lock);
>>> -     default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>> -     reset_power_uncore_function_ptrs();
>>> +     global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>>        rte_spinlock_unlock(&global_env_cfg_lock);
>>>    }
>>>
>> How about abstract an ABI interface to intialize or set the uncore driver on platform
>> by automatical.
>>
>> And later do power_intel_uncore_init_on_die() for each die on different package.
>>>    enum rte_uncore_power_mgmt_env
>>>    rte_power_get_uncore_env(void)
>>>    {
>>> -     return default_uncore_env;
>>> +     return global_uncore_env;
>>> +}
>>> +
>>> +struct rte_power_uncore_ops *
>>> +rte_power_get_uncore_ops(void)
>>> +{
>>> +     RTE_ASSERT(global_uncore_ops != NULL);
>>> +
>>> +     return global_uncore_ops;
>>>    }
>>>
>>>    int
>>> @@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned
>>> int die)
>> This pkg means the socket id on the platform, right?
>> If so, I am not sure that the
>> uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE] used in uncore lib is
>> universal for all uncore drivers.
>> For example, an uncore driver may only support uncore DVFS at the socket level.
>> What should we do for this? We may need to think twice.
> Yes, pkg represents a socket id. In platforms with a single uncore controller per socket,
> the die ID should be set to '0' for the corresponding socket ID (pkg).
> .
So just use die ID 0 on each socket ID (namely, uncore_info[0][0], 
uncore_info[1][0]) to initialize the uncore power info on sockets, right?
From the implementation in l3fwd-power, it sets all die IDs on all sockets.
For a platform with a single uncore controller per socket, its 
uncore driver in DPDK has to ignore all die IDs except die-0 on each 
socket, right?
>>>    {
>>>        int ret = -1;
>>>
>> <...>
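The registration scheme discussed above — drivers register an ops struct and the framework rejects any with missing mandatory callbacks, as rte_power_register_uncore_ops() does — can be sketched minimally. A singly linked list stands in for the TAILQ used in DPDK, and all names are illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative ops struct with a small subset of mandatory callbacks. */
struct uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
	int (*freq_max)(unsigned int pkg, unsigned int die);
	struct uncore_ops *next;
};

static struct uncore_ops *ops_list;

/* Reject registration when a mandatory callback is missing; otherwise
 * link the driver into the list of available backends. */
int register_uncore_ops(struct uncore_ops *ops)
{
	if (ops->init == NULL || ops->freq_max == NULL)
		return -1;
	ops->next = ops_list;
	ops_list = ops;
	return 0;
}

/* Dummy callbacks for demonstration. */
static int demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return 0;
}

static int demo_freq_max(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return 0;
}

struct uncore_ops demo_driver = { "demo", demo_init, demo_freq_max, NULL };
struct uncore_ops bad_driver  = { "bad",  NULL,      demo_freq_max, NULL };
```

Selecting an environment then amounts to walking the list and matching on the driver name, as rte_power_set_uncore_env() does in the patch.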

^ permalink raw reply	[relevance 0%]

* [PATCH v11 0/2] power: introduce PM QoS interface
  @ 2024-10-21 11:42  4% ` Huisong Li
  2024-10-21 11:42  5%   ` [PATCH v11 1/2] power: introduce PM QoS API on CPU wide Huisong Li
  2024-10-23  4:09  4% ` [PATCH v12 0/3] power: introduce PM QoS interface Huisong Li
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-10-21 11:42 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low
resume time, such as the interrupt packet receiving mode.

And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Please see the description in kernel document[1].
Each cpuidle governor in Linux selects which idle state to enter based on
this CPU resume latency in its idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection
and limit it to entering only the shallowest idle state, lowering the delay
when waking up from an idle state, by setting a strict resume latency (zero value).

[1] https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
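The write-side value mapping this series relies on (per the kernel ABI document linked above: "n/a" means strict/zero latency, "0" means no constraint, any other number is the limit in microseconds) can be sketched as a small helper. The macro here is a stand-in redefined for illustration, mirroring the RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT value the series adds:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT. */
#define NO_CONSTRAINT ((int)(UINT32_MAX >> 1))

/* Encode a requested resume latency into the string written to
 * pm_qos_resume_latency_us, following the kernel ABI semantics. */
static void encode_latency(int latency, char *buf, size_t len)
{
	if (latency == 0)
		snprintf(buf, len, "n/a");         /* strict: shallowest state only */
	else if (latency == NO_CONSTRAINT)
		snprintf(buf, len, "0");           /* kernel default: no constraint */
	else
		snprintf(buf, len, "%d", latency); /* limit in microseconds */
}
```

This is the same three-way mapping performed in rte_power_qos_set_cpu_resume_latency() before writing the sysfs file.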

---
 v11:
  - operate the cpu id the lcore mapped by the new function
    power_get_lcore_mapped_cpu_id().
 v10:
  - replace LINE_MAX with a custom macro and fix two typos.
 v9:
  - move new feature description from release_24_07.rst to release_24_11.rst.
 v8:
  - update the latest code to resolve CI warning
 v7:
  - remove dead code (rte_lcore_is_enabled) in patch [2/2]
 v6:
  - update release_24_07.rst based on dpdk repo to resolve CI warning.
 v5:
  - use LINE_MAX to replace BUFSIZ, and use snprintf to replace sprintf.
 v4:
  - fix some comments basd on Stephen
  - add stdint.h include
  - add Acked-by Morten Brørup <mb@smartsharesystems.com>
 v3:
  - add RTE_POWER_xxx prefix for some macro in header
  - add the check for lcore_id with rte_lcore_is_enabled
 v2:
  - use PM QoS on CPU wide to replace the one on system wide

Huisong Li (2):
  power: introduce PM QoS API on CPU wide
  examples/l3fwd-power: add PM QoS configuration

 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 examples/l3fwd-power/main.c            |  24 +++++
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 7 files changed, 250 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

-- 
2.22.0


^ permalink raw reply	[relevance 4%]

* [PATCH v11 1/2] power: introduce PM QoS API on CPU wide
  2024-10-21 11:42  4% ` [PATCH v11 0/2] power: " Huisong Li
@ 2024-10-21 11:42  5%   ` Huisong Li
  2024-10-22  9:08  0%     ` Konstantin Ananyev
  0 siblings, 1 reply; 200+ results
From: Huisong Li @ 2024-10-21 11:42 UTC (permalink / raw)
  To: dev
  Cc: mb, thomas, ferruh.yigit, anatoly.burakov, david.hunt,
	sivaprasad.tummala, stephen, konstantin.ananyev, david.marchand,
	fengchengwen, liuyonglong, lihuisong

The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a low
resume time, such as the interrupt packet receiving mode.

And the "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on the cpuX for
userspace. Each cpuidle governor in Linux select which idle state to enter
based on this CPU resume latency in their idle task.

The per-CPU PM QoS API can be used to control a CPU's idle state selection
and limit it to entering only the shallowest idle state, lowering the delay
when waking up, by setting a strict resume latency (zero value).

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 doc/guides/prog_guide/power_man.rst    |  19 ++++
 doc/guides/rel_notes/release_24_11.rst |   5 +
 lib/power/meson.build                  |   2 +
 lib/power/rte_power_qos.c              | 123 +++++++++++++++++++++++++
 lib/power/rte_power_qos.h              |  73 +++++++++++++++
 lib/power/version.map                  |   4 +
 6 files changed, 226 insertions(+)
 create mode 100644 lib/power/rte_power_qos.c
 create mode 100644 lib/power/rte_power_qos.h

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..91358b04f3 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -107,6 +107,25 @@ User Cases
 The power management mechanism is used to save power when performing L3 forwarding.
 
 
+PM QoS
+------
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used to set and get the resume latency limit on the cpuX for
+userspace. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are latency sensitive and expect a low
+resume time, such as the interrupt packet receiving mode.
+
+Applications can set and get the CPU resume latency by the
+``rte_power_qos_set_cpu_resume_latency()`` and ``rte_power_qos_get_cpu_resume_latency()``
+respectively. Applications can set a strict resume latency (zero value) by
+the ``rte_power_qos_set_cpu_resume_latency()`` to lower the resume latency and
+get better performance (though the platform power consumption may increase).
+
+
 Ethernet PMD Power Management API
 ---------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..d9e268274b 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -237,6 +237,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Introduce per-CPU PM QoS interface.**
+
+  * Add per-CPU PM QoS interface to lower the resume latency when waking
+    up from idle state.
+
 * **Added new API to register telemetry endpoint callbacks with private arguments.**
 
   A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9b5d3e8315 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
         'rte_power.c',
         'rte_power_uncore.c',
         'rte_power_pmd_mgmt.c',
+	'rte_power_qos.c',
 )
 headers = files(
         'rte_power.h',
         'rte_power_guest_channel.h',
         'rte_power_pmd_mgmt.h',
         'rte_power_uncore.h',
+	'rte_power_qos.h',
 )
 
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..09692b2161
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US	\
+	"/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN	32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	if (latency < 0) {
+		POWER_LOG(ERR, "latency should be greater than or equal to 0");
+		return -EINVAL;
+	}
+
+	ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us
+	 * (@PM_QOS_SYSFILE_RESUME_LATENCY_US) in the kernel, the meanings of
+	 * the different input strings are as follows.
+	 * 1> the resume latency is 0 if the input is "n/a".
+	 * 2> the resume latency is no constraint if the input is "0".
+	 * 3> the resume latency is the actual value to be set.
+	 */
+	if (latency == 0)
+		snprintf(buf, sizeof(buf), "%s", "n/a");
+	else if (latency == RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+		snprintf(buf, sizeof(buf), "%u", 0);
+	else
+		snprintf(buf, sizeof(buf), "%u", latency);
+
+	ret = write_core_sysfs_s(f, buf);
+	if (ret != 0)
+		POWER_LOG(ERR, "Failed to write "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+
+	fclose(f);
+
+	return ret;
+}
+
+int
+rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id)
+{
+	char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+	int latency = -1;
+	uint32_t cpu_id;
+	FILE *f;
+	int ret;
+
+	if (!rte_lcore_is_enabled(lcore_id)) {
+		POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+		return -EINVAL;
+	}
+	ret = power_get_lcore_mapped_cpu_id(lcore_id, &cpu_id);
+	if (ret != 0)
+		return ret;
+
+	ret = open_core_sysfs_file(&f, "r", PM_QOS_SYSFILE_RESUME_LATENCY_US, cpu_id);
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		return ret;
+	}
+
+	ret = read_core_sysfs_s(f, buf, sizeof(buf));
+	if (ret != 0) {
+		POWER_LOG(ERR, "Failed to read "PM_QOS_SYSFILE_RESUME_LATENCY_US" : %s",
+			  cpu_id, strerror(errno));
+		goto out;
+	}
+
+	/*
+	 * Based on the sysfs interface pm_qos_resume_latency_us
+	 * (@PM_QOS_SYSFILE_RESUME_LATENCY_US) in the kernel, the meanings of
+	 * the different output strings are as follows.
+	 * 1> the resume latency is 0 if the output is "n/a".
+	 * 2> the resume latency is no constraint if the output is "0".
+	 * 3> the resume latency is the actual value in use for any other string.
+	 */
+	if (strcmp(buf, "n/a") == 0)
+		latency = 0;
+	else {
+		latency = strtoul(buf, NULL, 10);
+		latency = latency == 0 ? RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT : latency;
+	}
+
+out:
+	fclose(f);
+
+	return latency != -1 ? latency : ret;
+}
diff --git a/lib/power/rte_power_qos.h b/lib/power/rte_power_qos.h
new file mode 100644
index 0000000000..990c488373
--- /dev/null
+++ b/lib/power/rte_power_qos.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#ifndef RTE_POWER_QOS_H
+#define RTE_POWER_QOS_H
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file rte_power_qos.h
+ *
+ * PM QoS API.
+ *
+ * The CPU-wide resume latency limit has a positive impact on this CPU's idle
+ * state selection in each cpuidle governor.
+ * Please see the PM QoS on CPU wide in the following link:
+ * https://www.kernel.org/doc/html/latest/admin-guide/abi-testing.html?highlight=pm_qos_resume_latency_us#abi-sys-devices-power-pm-qos-resume-latency-us
+ *
+ * The deeper the idle state, the lower the power consumption, but the
+ * longer the resume time. Some services are delay sensitive and expect a
+ * low resume time, such as the interrupt packet receiving mode.
+ *
+ * In these cases, the per-CPU PM QoS API can be used to control this CPU's
+ * idle state selection and limit it to entering only the shallowest idle
+ * state, lowering the delay after sleep, by setting a strict resume latency (zero value).
+ */
+
+#define RTE_POWER_QOS_STRICT_LATENCY_VALUE             0
+#define RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT    ((int)(UINT32_MAX >> 1))
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * @param lcore_id
+ *   target logical core id
+ *
+ * @param latency
+ *   The latency should be greater than or equal to zero, in microseconds.
+ *
+ * @return
+ *   0 on success. Otherwise negative value is returned.
+ */
+__rte_experimental
+int rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the current resume latency of this logical core.
+ * The default value in the kernel is @see RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT
+ * if it is not set.
+ *
+ * @return
+ *   Negative value on failure.
+ *   >= 0 means the actual resume latency limit on this core.
+ */
+__rte_experimental
+int rte_power_qos_get_cpu_resume_latency(uint16_t lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_QOS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..08f178a39d 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,8 @@ EXPERIMENTAL {
 	rte_power_set_uncore_env;
 	rte_power_uncore_freqs;
 	rte_power_unset_uncore_env;
+
+	# added in 24.11
+	rte_power_qos_get_cpu_resume_latency;
+	rte_power_qos_set_cpu_resume_latency;
 };
-- 
2.22.0
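The read-side decoding done in rte_power_qos_get_cpu_resume_latency() above ("n/a" decodes to a strict zero latency, "0" to the no-constraint sentinel, anything else to the value in microseconds) can be exercised standalone. The macro is redefined here for illustration, mirroring RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for RTE_POWER_QOS_RESUME_LATENCY_NO_CONSTRAINT. */
#define NO_CONSTRAINT ((int)(UINT32_MAX >> 1))

/* Decode the string read from pm_qos_resume_latency_us into a latency
 * value, following the same three-way mapping as the patch. */
static int decode_latency(const char *buf)
{
	if (strcmp(buf, "n/a") == 0)
		return 0;                  /* strict: zero resume latency */
	unsigned long v = strtoul(buf, NULL, 10);
	return v == 0 ? NO_CONSTRAINT : (int)v;
}
```

Note that encoding and decoding are inverses of each other, so a value written through the set API reads back unchanged through the get API.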


^ permalink raw reply	[relevance 5%]

* [PATCH 00/10] eventdev: remove single-event enqueue and dequeue
    2024-10-21  7:25  0%   ` Jerin Jacob
  2024-10-21  8:51  3%   ` [PATCH " Mattias Rönnblom
@ 2024-10-21  9:06  3%   ` Mattias Rönnblom
  2024-10-21  9:06 11%     ` [PATCH 10/10] eventdev: remove single event " Mattias Rönnblom
  2 siblings, 1 reply; 200+ results
From: Mattias Rönnblom @ 2024-10-21  9:06 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Mattias Rönnblom, David Marchand, Stephen Hemminger,
	Anoob Joseph, Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren,
	Mattias Rönnblom

Remove the single-event enqueue and dequeue functions from the
eventdev "ops" struct, to reduce complexity, leaving performance
unaffected.

This ABI change has been announced as a DPDK deprecation notice,
originally scheduled for DPDK 23.11.

Mattias Rönnblom (9):
  event/dsw: remove single event enqueue and dequeue
  event/dlb2: remove single event enqueue and dequeue
  event/octeontx: remove single event enqueue and dequeue
  event/sw: remove single event enqueue and dequeue
  event/dpaa: remove single event enqueue and dequeue
  event/dpaa2: remove single event enqueue and dequeue
  event/opdl: remove single event enqueue and dequeue
  event/skeleton: remove single event enqueue and dequeue
  eventdev: remove single event enqueue and dequeue

Pavan Nikhilesh (1):
  event/cnxk: remove single event enqueue and dequeue

 doc/guides/rel_notes/deprecation.rst       |  6 +-
 doc/guides/rel_notes/release_24_11.rst     |  3 +
 drivers/event/cnxk/cn10k_eventdev.c        | 74 ++--------------------
 drivers/event/cnxk/cn10k_worker.c          | 49 +++++++-------
 drivers/event/cnxk/cn10k_worker.h          |  1 -
 drivers/event/cnxk/cn9k_eventdev.c         | 73 +--------------------
 drivers/event/cnxk/cn9k_worker.c           | 26 +++-----
 drivers/event/cnxk/cn9k_worker.h           |  3 -
 drivers/event/dlb2/dlb2.c                  | 40 +-----------
 drivers/event/dpaa/dpaa_eventdev.c         | 27 +-------
 drivers/event/dpaa2/dpaa2_eventdev.c       | 15 -----
 drivers/event/dsw/dsw_evdev.c              |  2 -
 drivers/event/dsw/dsw_evdev.h              |  2 -
 drivers/event/dsw/dsw_event.c              | 12 ----
 drivers/event/octeontx/ssovf_evdev.h       |  1 -
 drivers/event/octeontx/ssovf_worker.c      | 40 ++----------
 drivers/event/opdl/opdl_evdev.c            |  2 -
 drivers/event/skeleton/skeleton_eventdev.c | 29 ---------
 drivers/event/sw/sw_evdev.c                |  2 -
 drivers/event/sw/sw_evdev.h                |  2 -
 drivers/event/sw/sw_evdev_worker.c         | 12 ----
 lib/eventdev/eventdev_pmd.h                |  4 --
 lib/eventdev/eventdev_private.c            | 22 -------
 lib/eventdev/rte_eventdev.h                | 21 ++----
 lib/eventdev/rte_eventdev_core.h           | 11 ----
 25 files changed, 52 insertions(+), 427 deletions(-)

-- 
2.43.0


^ permalink raw reply	[relevance 3%]

* Re: [PATCH 00/10] eventdev: remove single-event enqueue and dequeue
  2024-10-21  8:51  3%   ` [PATCH " Mattias Rönnblom
  2024-10-21  8:51 11%     ` [PATCH 10/10] eventdev: remove single event " Mattias Rönnblom
@ 2024-10-21  9:21  0%     ` Mattias Rönnblom
  1 sibling, 0 replies; 200+ results
From: Mattias Rönnblom @ 2024-10-21  9:21 UTC (permalink / raw)
  To: Mattias Rönnblom, Jerin Jacob
  Cc: dev, David Marchand, Stephen Hemminger, Anoob Joseph,
	Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren

On 2024-10-21 10:51, Mattias Rönnblom wrote:
> Remove the single-event enqueue and dequeue functions from the
> eventdev "ops" struct, to reduce complexity, leaving performance
> unaffected.
> 
> This ABI change has been announced as a DPDK deprecation notice,
> originally scheduled for DPDK 23.11.
> 

The outgoing SMTP server I'm required to use seems to throw away random 
messages for the moment.

I tried to repost the same patchset, but in that case it threw away the 
cover letter.

Jerin, maybe you can puzzle something together.

> Mattias Rönnblom (9):
>    event/dsw: remove single event enqueue and dequeue
>    event/dlb2: remove single event enqueue and dequeue
>    event/octeontx: remove single event enqueue and dequeue
>    event/sw: remove single event enqueue and dequeue
>    event/dpaa: remove single event enqueue and dequeue
>    event/dpaa2: remove single event enqueue and dequeue
>    event/opdl: remove single event enqueue and dequeue
>    event/skeleton: remove single event enqueue and dequeue
>    eventdev: remove single event enqueue and dequeue
> 
> Pavan Nikhilesh (1):
>    event/cnxk: remove single event enqueue and dequeue
> 
>   doc/guides/rel_notes/deprecation.rst       |  6 +-
>   doc/guides/rel_notes/release_24_11.rst     |  3 +
>   drivers/event/cnxk/cn10k_eventdev.c        | 74 ++--------------------
>   drivers/event/cnxk/cn10k_worker.c          | 49 +++++++-------
>   drivers/event/cnxk/cn10k_worker.h          |  1 -
>   drivers/event/cnxk/cn9k_eventdev.c         | 73 +--------------------
>   drivers/event/cnxk/cn9k_worker.c           | 26 +++-----
>   drivers/event/cnxk/cn9k_worker.h           |  3 -
>   drivers/event/dlb2/dlb2.c                  | 40 +-----------
>   drivers/event/dpaa/dpaa_eventdev.c         | 27 +-------
>   drivers/event/dpaa2/dpaa2_eventdev.c       | 15 -----
>   drivers/event/dsw/dsw_evdev.c              |  2 -
>   drivers/event/dsw/dsw_evdev.h              |  2 -
>   drivers/event/dsw/dsw_event.c              | 12 ----
>   drivers/event/octeontx/ssovf_evdev.h       |  1 -
>   drivers/event/octeontx/ssovf_worker.c      | 40 ++----------
>   drivers/event/opdl/opdl_evdev.c            |  2 -
>   drivers/event/skeleton/skeleton_eventdev.c | 29 ---------
>   drivers/event/sw/sw_evdev.c                |  2 -
>   drivers/event/sw/sw_evdev.h                |  2 -
>   drivers/event/sw/sw_evdev_worker.c         | 12 ----
>   lib/eventdev/eventdev_pmd.h                |  4 --
>   lib/eventdev/eventdev_private.c            | 22 -------
>   lib/eventdev/rte_eventdev.h                | 21 ++----
>   lib/eventdev/rte_eventdev_core.h           | 11 ----
>   25 files changed, 52 insertions(+), 427 deletions(-)
> 


^ permalink raw reply	[relevance 0%]

* [PATCH 10/10] eventdev: remove single event enqueue and dequeue
  2024-10-21  9:06  3%   ` Mattias Rönnblom
@ 2024-10-21  9:06 11%     ` Mattias Rönnblom
  0 siblings, 0 replies; 200+ results
From: Mattias Rönnblom @ 2024-10-21  9:06 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Mattias Rönnblom, David Marchand, Stephen Hemminger,
	Anoob Joseph, Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren,
	Mattias Rönnblom

Remove the single event enqueue and dequeue, since they did not
provide any noticeable performance benefits.

This is a change of the ABI, previously announced as a deprecation
notice. These functions were not directly invoked by the application,
so the API remains unaffected.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>

--

RFC v3:
 * Update release notes. (Jerin Jacob)
 * Remove single-event enqueue and dequeue function typedefs.
   (Pavan Nikhilesh)
---
 doc/guides/rel_notes/deprecation.rst   |  6 +-----
 doc/guides/rel_notes/release_24_11.rst |  3 +++
 lib/eventdev/eventdev_pmd.h            |  4 ----
 lib/eventdev/eventdev_private.c        | 22 ----------------------
 lib/eventdev/rte_eventdev.h            | 21 ++++-----------------
 lib/eventdev/rte_eventdev_core.h       | 11 -----------
 6 files changed, 8 insertions(+), 59 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 17b7332007..a90b54fc77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -131,11 +131,7 @@ Deprecation Notices
 
 * eventdev: The single-event (non-burst) enqueue and dequeue operations,
   used by static inline burst enqueue and dequeue functions in ``rte_eventdev.h``,
-  will be removed in DPDK 23.11.
-  This simplification includes changing the layout and potentially also
-  the size of the public ``rte_event_fp_ops`` struct, breaking the ABI.
-  Since these functions are not called directly by the application,
-  the API remains unaffected.
+  are removed in DPDK 24.11.
 
 * pipeline: The pipeline library legacy API (functions rte_pipeline_*)
   will be deprecated and subsequently removed in DPDK 24.11 release.
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..5461798970 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -401,6 +401,9 @@ ABI Changes
 
 * eventdev: Added ``preschedule_type`` field to ``rte_event_dev_config`` structure.
 
+* eventdev: The PMD single-event enqueue and dequeue function pointers are removed
+  from ``rte_event_fp_fps``.
+
 * graph: To accommodate node specific xstats counters, added ``xstat_cntrs``,
   ``xstat_desc`` and ``xstat_count`` to ``rte_graph_cluster_node_stats``,
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index af855e3467..36148f8d86 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -158,16 +158,12 @@ struct __rte_cache_aligned rte_eventdev {
 	uint8_t attached : 1;
 	/**< Flag indicating the device is attached */
 
-	event_enqueue_t enqueue;
-	/**< Pointer to PMD enqueue function. */
 	event_enqueue_burst_t enqueue_burst;
 	/**< Pointer to PMD enqueue burst function. */
 	event_enqueue_burst_t enqueue_new_burst;
 	/**< Pointer to PMD enqueue burst function(op new variant) */
 	event_enqueue_burst_t enqueue_forward_burst;
 	/**< Pointer to PMD enqueue burst function(op forward variant) */
-	event_dequeue_t dequeue;
-	/**< Pointer to PMD dequeue function. */
 	event_dequeue_burst_t dequeue_burst;
 	/**< Pointer to PMD dequeue burst function. */
 	event_maintain_t maintain;
diff --git a/lib/eventdev/eventdev_private.c b/lib/eventdev/eventdev_private.c
index b628f4a69e..6df129fc2d 100644
--- a/lib/eventdev/eventdev_private.c
+++ b/lib/eventdev/eventdev_private.c
@@ -5,15 +5,6 @@
 #include "eventdev_pmd.h"
 #include "rte_eventdev.h"
 
-static uint16_t
-dummy_event_enqueue(__rte_unused void *port,
-		    __rte_unused const struct rte_event *ev)
-{
-	RTE_EDEV_LOG_ERR(
-		"event enqueue requested for unconfigured event device");
-	return 0;
-}
-
 static uint16_t
 dummy_event_enqueue_burst(__rte_unused void *port,
 			  __rte_unused const struct rte_event ev[],
@@ -24,15 +15,6 @@ dummy_event_enqueue_burst(__rte_unused void *port,
 	return 0;
 }
 
-static uint16_t
-dummy_event_dequeue(__rte_unused void *port, __rte_unused struct rte_event *ev,
-		    __rte_unused uint64_t timeout_ticks)
-{
-	RTE_EDEV_LOG_ERR(
-		"event dequeue requested for unconfigured event device");
-	return 0;
-}
-
 static uint16_t
 dummy_event_dequeue_burst(__rte_unused void *port,
 			  __rte_unused struct rte_event ev[],
@@ -129,11 +111,9 @@ event_dev_fp_ops_reset(struct rte_event_fp_ops *fp_op)
 {
 	static void *dummy_data[RTE_MAX_QUEUES_PER_PORT];
 	static const struct rte_event_fp_ops dummy = {
-		.enqueue = dummy_event_enqueue,
 		.enqueue_burst = dummy_event_enqueue_burst,
 		.enqueue_new_burst = dummy_event_enqueue_burst,
 		.enqueue_forward_burst = dummy_event_enqueue_burst,
-		.dequeue = dummy_event_dequeue,
 		.dequeue_burst = dummy_event_dequeue_burst,
 		.maintain = dummy_event_maintain,
 		.txa_enqueue = dummy_event_tx_adapter_enqueue,
@@ -153,11 +133,9 @@ void
 event_dev_fp_ops_set(struct rte_event_fp_ops *fp_op,
 		     const struct rte_eventdev *dev)
 {
-	fp_op->enqueue = dev->enqueue;
 	fp_op->enqueue_burst = dev->enqueue_burst;
 	fp_op->enqueue_new_burst = dev->enqueue_new_burst;
 	fp_op->enqueue_forward_burst = dev->enqueue_forward_burst;
-	fp_op->dequeue = dev->dequeue;
 	fp_op->dequeue_burst = dev->dequeue_burst;
 	fp_op->maintain = dev->maintain;
 	fp_op->txa_enqueue = dev->txa_enqueue;
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index b5c3c16dd0..fabd1490db 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -2596,14 +2596,8 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
 	}
 #endif
 	rte_eventdev_trace_enq_burst(dev_id, port_id, ev, nb_events, (void *)fn);
-	/*
-	 * Allow zero cost non burst mode routine invocation if application
-	 * requests nb_events as const one
-	 */
-	if (nb_events == 1)
-		return (fp_ops->enqueue)(port, ev);
-	else
-		return fn(port, ev, nb_events);
+
+	return fn(port, ev, nb_events);
 }
 
 /**
@@ -2852,15 +2846,8 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
 	}
 #endif
 	rte_eventdev_trace_deq_burst(dev_id, port_id, ev, nb_events);
-	/*
-	 * Allow zero cost non burst mode routine invocation if application
-	 * requests nb_events as const one
-	 */
-	if (nb_events == 1)
-		return (fp_ops->dequeue)(port, ev, timeout_ticks);
-	else
-		return (fp_ops->dequeue_burst)(port, ev, nb_events,
-					       timeout_ticks);
+
+	return (fp_ops->dequeue_burst)(port, ev, nb_events, timeout_ticks);
 }
 
 #define RTE_EVENT_DEV_MAINT_OP_FLUSH          (1 << 0)
diff --git a/lib/eventdev/rte_eventdev_core.h b/lib/eventdev/rte_eventdev_core.h
index 2706d5e6c8..1818483044 100644
--- a/lib/eventdev/rte_eventdev_core.h
+++ b/lib/eventdev/rte_eventdev_core.h
@@ -12,18 +12,11 @@
 extern "C" {
 #endif
 
-typedef uint16_t (*event_enqueue_t)(void *port, const struct rte_event *ev);
-/**< @internal Enqueue event on port of a device */
-
 typedef uint16_t (*event_enqueue_burst_t)(void *port,
 					  const struct rte_event ev[],
 					  uint16_t nb_events);
 /**< @internal Enqueue burst of events on port of a device */
 
-typedef uint16_t (*event_dequeue_t)(void *port, struct rte_event *ev,
-				    uint64_t timeout_ticks);
-/**< @internal Dequeue event from port of a device */
-
 typedef uint16_t (*event_dequeue_burst_t)(void *port, struct rte_event ev[],
 					  uint16_t nb_events,
 					  uint64_t timeout_ticks);
@@ -60,16 +53,12 @@ typedef void (*event_preschedule_t)(void *port,
 struct __rte_cache_aligned rte_event_fp_ops {
 	void **data;
 	/**< points to array of internal port data pointers */
-	event_enqueue_t enqueue;
-	/**< PMD enqueue function. */
 	event_enqueue_burst_t enqueue_burst;
 	/**< PMD enqueue burst function. */
 	event_enqueue_burst_t enqueue_new_burst;
 	/**< PMD enqueue burst new function. */
 	event_enqueue_burst_t enqueue_forward_burst;
 	/**< PMD enqueue burst fwd function. */
-	event_dequeue_t dequeue;
-	/**< PMD dequeue function. */
 	event_dequeue_burst_t dequeue_burst;
 	/**< PMD dequeue burst function. */
 	event_maintain_t maintain;
-- 
2.43.0


^ permalink raw reply	[relevance 11%]

* [PATCH 00/10] eventdev: remove single-event enqueue and dequeue
    2024-10-21  7:25  0%   ` Jerin Jacob
@ 2024-10-21  8:51  3%   ` Mattias Rönnblom
  2024-10-21  8:51 11%     ` [PATCH 10/10] eventdev: remove single event " Mattias Rönnblom
  2024-10-21  9:21  0%     ` [PATCH 00/10] eventdev: remove single-event " Mattias Rönnblom
  2024-10-21  9:06  3%   ` Mattias Rönnblom
  2 siblings, 2 replies; 200+ results
From: Mattias Rönnblom @ 2024-10-21  8:51 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Mattias Rönnblom, David Marchand, Stephen Hemminger,
	Anoob Joseph, Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren,
	Mattias Rönnblom

Remove the single-event enqueue and dequeue functions from the
eventdev "ops" struct, to reduce complexity, leaving performance
unaffected.

This ABI change has been announced as a DPDK deprecation notice,
originally scheduled for DPDK 23.11.

Mattias Rönnblom (9):
  event/dsw: remove single event enqueue and dequeue
  event/dlb2: remove single event enqueue and dequeue
  event/octeontx: remove single event enqueue and dequeue
  event/sw: remove single event enqueue and dequeue
  event/dpaa: remove single event enqueue and dequeue
  event/dpaa2: remove single event enqueue and dequeue
  event/opdl: remove single event enqueue and dequeue
  event/skeleton: remove single event enqueue and dequeue
  eventdev: remove single event enqueue and dequeue

Pavan Nikhilesh (1):
  event/cnxk: remove single event enqueue and dequeue

 doc/guides/rel_notes/deprecation.rst       |  6 +-
 doc/guides/rel_notes/release_24_11.rst     |  3 +
 drivers/event/cnxk/cn10k_eventdev.c        | 74 ++--------------------
 drivers/event/cnxk/cn10k_worker.c          | 49 +++++++-------
 drivers/event/cnxk/cn10k_worker.h          |  1 -
 drivers/event/cnxk/cn9k_eventdev.c         | 73 +--------------------
 drivers/event/cnxk/cn9k_worker.c           | 26 +++-----
 drivers/event/cnxk/cn9k_worker.h           |  3 -
 drivers/event/dlb2/dlb2.c                  | 40 +-----------
 drivers/event/dpaa/dpaa_eventdev.c         | 27 +-------
 drivers/event/dpaa2/dpaa2_eventdev.c       | 15 -----
 drivers/event/dsw/dsw_evdev.c              |  2 -
 drivers/event/dsw/dsw_evdev.h              |  2 -
 drivers/event/dsw/dsw_event.c              | 12 ----
 drivers/event/octeontx/ssovf_evdev.h       |  1 -
 drivers/event/octeontx/ssovf_worker.c      | 40 ++----------
 drivers/event/opdl/opdl_evdev.c            |  2 -
 drivers/event/skeleton/skeleton_eventdev.c | 29 ---------
 drivers/event/sw/sw_evdev.c                |  2 -
 drivers/event/sw/sw_evdev.h                |  2 -
 drivers/event/sw/sw_evdev_worker.c         | 12 ----
 lib/eventdev/eventdev_pmd.h                |  4 --
 lib/eventdev/eventdev_private.c            | 22 -------
 lib/eventdev/rte_eventdev.h                | 21 ++----
 lib/eventdev/rte_eventdev_core.h           | 11 ----
 25 files changed, 52 insertions(+), 427 deletions(-)

-- 
2.43.0


^ permalink raw reply	[relevance 3%]

* [PATCH 10/10] eventdev: remove single event enqueue and dequeue
  2024-10-21  8:51  3%   ` [PATCH " Mattias Rönnblom
@ 2024-10-21  8:51 11%     ` Mattias Rönnblom
  2024-10-21  9:21  0%     ` [PATCH 00/10] eventdev: remove single-event " Mattias Rönnblom
  1 sibling, 0 replies; 200+ results
From: Mattias Rönnblom @ 2024-10-21  8:51 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Mattias Rönnblom, David Marchand, Stephen Hemminger,
	Anoob Joseph, Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren,
	Mattias Rönnblom

Remove the single event enqueue and dequeue, since they did not
provide any noticeable performance benefits.

This is a change of the ABI, previously announced as a deprecation
notice. These functions were not directly invoked by the application,
so the API remains unaffected.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>

--

RFC v3:
 * Update release notes. (Jerin Jacob)
 * Remove single-event enqueue and dequeue function typedefs.
   (Pavan Nikhilesh)
---
 doc/guides/rel_notes/deprecation.rst   |  6 +-----
 doc/guides/rel_notes/release_24_11.rst |  3 +++
 lib/eventdev/eventdev_pmd.h            |  4 ----
 lib/eventdev/eventdev_private.c        | 22 ----------------------
 lib/eventdev/rte_eventdev.h            | 21 ++++-----------------
 lib/eventdev/rte_eventdev_core.h       | 11 -----------
 6 files changed, 8 insertions(+), 59 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 17b7332007..a90b54fc77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -131,11 +131,7 @@ Deprecation Notices
 
 * eventdev: The single-event (non-burst) enqueue and dequeue operations,
   used by static inline burst enqueue and dequeue functions in ``rte_eventdev.h``,
-  will be removed in DPDK 23.11.
-  This simplification includes changing the layout and potentially also
-  the size of the public ``rte_event_fp_ops`` struct, breaking the ABI.
-  Since these functions are not called directly by the application,
-  the API remains unaffected.
+  are removed in DPDK 24.11.
 
 * pipeline: The pipeline library legacy API (functions rte_pipeline_*)
   will be deprecated and subsequently removed in DPDK 24.11 release.
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..5461798970 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -401,6 +401,9 @@ ABI Changes
 
 * eventdev: Added ``preschedule_type`` field to ``rte_event_dev_config`` structure.
 
+* eventdev: The PMD single-event enqueue and dequeue function pointers are removed
+  from ``rte_event_fp_ops``.
+
 * graph: To accommodate node specific xstats counters, added ``xstat_cntrs``,
   ``xstat_desc`` and ``xstat_count`` to ``rte_graph_cluster_node_stats``,
   added new structure ``rte_node_xstats`` to ``rte_node_register`` and
diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index af855e3467..36148f8d86 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -158,16 +158,12 @@ struct __rte_cache_aligned rte_eventdev {
 	uint8_t attached : 1;
 	/**< Flag indicating the device is attached */
 
-	event_enqueue_t enqueue;
-	/**< Pointer to PMD enqueue function. */
 	event_enqueue_burst_t enqueue_burst;
 	/**< Pointer to PMD enqueue burst function. */
 	event_enqueue_burst_t enqueue_new_burst;
 	/**< Pointer to PMD enqueue burst function(op new variant) */
 	event_enqueue_burst_t enqueue_forward_burst;
 	/**< Pointer to PMD enqueue burst function(op forward variant) */
-	event_dequeue_t dequeue;
-	/**< Pointer to PMD dequeue function. */
 	event_dequeue_burst_t dequeue_burst;
 	/**< Pointer to PMD dequeue burst function. */
 	event_maintain_t maintain;
diff --git a/lib/eventdev/eventdev_private.c b/lib/eventdev/eventdev_private.c
index b628f4a69e..6df129fc2d 100644
--- a/lib/eventdev/eventdev_private.c
+++ b/lib/eventdev/eventdev_private.c
@@ -5,15 +5,6 @@
 #include "eventdev_pmd.h"
 #include "rte_eventdev.h"
 
-static uint16_t
-dummy_event_enqueue(__rte_unused void *port,
-		    __rte_unused const struct rte_event *ev)
-{
-	RTE_EDEV_LOG_ERR(
-		"event enqueue requested for unconfigured event device");
-	return 0;
-}
-
 static uint16_t
 dummy_event_enqueue_burst(__rte_unused void *port,
 			  __rte_unused const struct rte_event ev[],
@@ -24,15 +15,6 @@ dummy_event_enqueue_burst(__rte_unused void *port,
 	return 0;
 }
 
-static uint16_t
-dummy_event_dequeue(__rte_unused void *port, __rte_unused struct rte_event *ev,
-		    __rte_unused uint64_t timeout_ticks)
-{
-	RTE_EDEV_LOG_ERR(
-		"event dequeue requested for unconfigured event device");
-	return 0;
-}
-
 static uint16_t
 dummy_event_dequeue_burst(__rte_unused void *port,
 			  __rte_unused struct rte_event ev[],
@@ -129,11 +111,9 @@ event_dev_fp_ops_reset(struct rte_event_fp_ops *fp_op)
 {
 	static void *dummy_data[RTE_MAX_QUEUES_PER_PORT];
 	static const struct rte_event_fp_ops dummy = {
-		.enqueue = dummy_event_enqueue,
 		.enqueue_burst = dummy_event_enqueue_burst,
 		.enqueue_new_burst = dummy_event_enqueue_burst,
 		.enqueue_forward_burst = dummy_event_enqueue_burst,
-		.dequeue = dummy_event_dequeue,
 		.dequeue_burst = dummy_event_dequeue_burst,
 		.maintain = dummy_event_maintain,
 		.txa_enqueue = dummy_event_tx_adapter_enqueue,
@@ -153,11 +133,9 @@ void
 event_dev_fp_ops_set(struct rte_event_fp_ops *fp_op,
 		     const struct rte_eventdev *dev)
 {
-	fp_op->enqueue = dev->enqueue;
 	fp_op->enqueue_burst = dev->enqueue_burst;
 	fp_op->enqueue_new_burst = dev->enqueue_new_burst;
 	fp_op->enqueue_forward_burst = dev->enqueue_forward_burst;
-	fp_op->dequeue = dev->dequeue;
 	fp_op->dequeue_burst = dev->dequeue_burst;
 	fp_op->maintain = dev->maintain;
 	fp_op->txa_enqueue = dev->txa_enqueue;
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index b5c3c16dd0..fabd1490db 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -2596,14 +2596,8 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
 	}
 #endif
 	rte_eventdev_trace_enq_burst(dev_id, port_id, ev, nb_events, (void *)fn);
-	/*
-	 * Allow zero cost non burst mode routine invocation if application
-	 * requests nb_events as const one
-	 */
-	if (nb_events == 1)
-		return (fp_ops->enqueue)(port, ev);
-	else
-		return fn(port, ev, nb_events);
+
+	return fn(port, ev, nb_events);
 }
 
 /**
@@ -2852,15 +2846,8 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
 	}
 #endif
 	rte_eventdev_trace_deq_burst(dev_id, port_id, ev, nb_events);
-	/*
-	 * Allow zero cost non burst mode routine invocation if application
-	 * requests nb_events as const one
-	 */
-	if (nb_events == 1)
-		return (fp_ops->dequeue)(port, ev, timeout_ticks);
-	else
-		return (fp_ops->dequeue_burst)(port, ev, nb_events,
-					       timeout_ticks);
+
+	return (fp_ops->dequeue_burst)(port, ev, nb_events, timeout_ticks);
 }
 
 #define RTE_EVENT_DEV_MAINT_OP_FLUSH          (1 << 0)
diff --git a/lib/eventdev/rte_eventdev_core.h b/lib/eventdev/rte_eventdev_core.h
index 2706d5e6c8..1818483044 100644
--- a/lib/eventdev/rte_eventdev_core.h
+++ b/lib/eventdev/rte_eventdev_core.h
@@ -12,18 +12,11 @@
 extern "C" {
 #endif
 
-typedef uint16_t (*event_enqueue_t)(void *port, const struct rte_event *ev);
-/**< @internal Enqueue event on port of a device */
-
 typedef uint16_t (*event_enqueue_burst_t)(void *port,
 					  const struct rte_event ev[],
 					  uint16_t nb_events);
 /**< @internal Enqueue burst of events on port of a device */
 
-typedef uint16_t (*event_dequeue_t)(void *port, struct rte_event *ev,
-				    uint64_t timeout_ticks);
-/**< @internal Dequeue event from port of a device */
-
 typedef uint16_t (*event_dequeue_burst_t)(void *port, struct rte_event ev[],
 					  uint16_t nb_events,
 					  uint64_t timeout_ticks);
@@ -60,16 +53,12 @@ typedef void (*event_preschedule_t)(void *port,
 struct __rte_cache_aligned rte_event_fp_ops {
 	void **data;
 	/**< points to array of internal port data pointers */
-	event_enqueue_t enqueue;
-	/**< PMD enqueue function. */
 	event_enqueue_burst_t enqueue_burst;
 	/**< PMD enqueue burst function. */
 	event_enqueue_burst_t enqueue_new_burst;
 	/**< PMD enqueue burst new function. */
 	event_enqueue_burst_t enqueue_forward_burst;
 	/**< PMD enqueue burst fwd function. */
-	event_dequeue_t dequeue;
-	/**< PMD dequeue function. */
 	event_dequeue_burst_t dequeue_burst;
 	/**< PMD dequeue burst function. */
 	event_maintain_t maintain;
-- 
2.43.0


^ permalink raw reply	[relevance 11%]

* Re: [RFC v3 00/10] eventdev: remove single-event enqueue and dequeue
  2024-10-21  7:25  0%   ` Jerin Jacob
@ 2024-10-21  8:38  0%     ` Mattias Rönnblom
  0 siblings, 0 replies; 200+ results
From: Mattias Rönnblom @ 2024-10-21  8:38 UTC (permalink / raw)
  To: Jerin Jacob, Mattias Rönnblom
  Cc: Jerin Jacob, dev, David Marchand, Stephen Hemminger,
	Anoob Joseph, Hemant Agrawal, Sachin Saxena, Abdullah Sevincer,
	Pavan Nikhilesh, Shijith Thotton, Harry van Haaren

On 2024-10-21 09:25, Jerin Jacob wrote:
> On Fri, Oct 18, 2024 at 1:14 AM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>>
>> Remove the single-event enqueue and dequeue functions from the
>> eventdev "ops" struct, to reduce complexity, leaving performance
>> unaffected.
>>
>> This ABI change has been announced as a DPDK deprecation notice,
>> originally scheduled for DPDK 23.11.
>>
>> Mattias Rönnblom (9):
> 
> Changes look good. Please send the NON RFC version of the series ASAP.
> I will merge it for rc2 (rc1 is created now)
> 

Without any more changes? OK.

>>    event/dsw: remove single event enqueue and dequeue
>>    event/dlb2: remove single event enqueue and dequeue
>>    event/octeontx: remove single event enqueue and dequeue
>>    event/sw: remove single event enqueue and dequeue
>>    event/dpaa: remove single event enqueue and dequeue
>>    event/dpaa2: remove single event enqueue and dequeue
>>    event/opdl: remove single event enqueue and dequeue
>>    event/skeleton: remove single event enqueue and dequeue
>>    eventdev: remove single event enqueue and dequeue
>>
>> Pavan Nikhilesh (1):
>>    event/cnxk: remove single event enqueue and dequeue
>>   drivers/event/sw/sw_evdev_worker.c         | 12 ----
>>   lib/eventdev/eventdev_pmd.h                |  4 --
>>   lib/eventdev/eventdev_private.c            | 22 -------
>>   lib/eventdev/rte_eventdev.h                | 21 ++----
>>   lib/eventdev/rte_eventdev_core.h           | 11 ----
>>   25 files changed, 52 insertions(+), 427 deletions(-)
>>
>> --
>> 2.43.0
>>


^ permalink raw reply	[relevance 0%]

* Re: [RFC v3 00/10] eventdev: remove single-event enqueue and dequeue
  @ 2024-10-21  7:25  0%   ` Jerin Jacob
  2024-10-21  8:38  0%     ` Mattias Rönnblom
  2024-10-21  8:51  3%   ` [PATCH " Mattias Rönnblom
  2024-10-21  9:06  3%   ` Mattias Rönnblom
  2 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2024-10-21  7:25 UTC (permalink / raw)
  To: Mattias Rönnblom
  Cc: Jerin Jacob, dev, Mattias Rönnblom, David Marchand,
	Stephen Hemminger, Anoob Joseph, Hemant Agrawal, Sachin Saxena,
	Abdullah Sevincer, Pavan Nikhilesh, Shijith Thotton,
	Harry van Haaren

On Fri, Oct 18, 2024 at 1:14 AM Mattias Rönnblom
<mattias.ronnblom@ericsson.com> wrote:
>
> Remove the single-event enqueue and dequeue functions from the
> eventdev "ops" struct, to reduce complexity, leaving performance
> unaffected.
>
> This ABI change has been announced as a DPDK deprecation notice,
> originally scheduled for DPDK 23.11.
>
> Mattias Rönnblom (9):

Changes look good. Please send the NON RFC version of the series ASAP.
I will merge it for rc2 (rc1 is created now)

>   event/dsw: remove single event enqueue and dequeue
>   event/dlb2: remove single event enqueue and dequeue
>   event/octeontx: remove single event enqueue and dequeue
>   event/sw: remove single event enqueue and dequeue
>   event/dpaa: remove single event enqueue and dequeue
>   event/dpaa2: remove single event enqueue and dequeue
>   event/opdl: remove single event enqueue and dequeue
>   event/skeleton: remove single event enqueue and dequeue
>   eventdev: remove single event enqueue and dequeue
>
> Pavan Nikhilesh (1):
>   event/cnxk: remove single event enqueue and dequeue
>  drivers/event/sw/sw_evdev_worker.c         | 12 ----
>  lib/eventdev/eventdev_pmd.h                |  4 --
>  lib/eventdev/eventdev_private.c            | 22 -------
>  lib/eventdev/rte_eventdev.h                | 21 ++----
>  lib/eventdev/rte_eventdev_core.h           | 11 ----
>  25 files changed, 52 insertions(+), 427 deletions(-)
>
> --
> 2.43.0
>

^ permalink raw reply	[relevance 0%]

* RE: DPDK - PCIe Steering Tags Meeting on 10/23/24
  @ 2024-10-21  2:05  0% ` Wathsala Wathawana Vithanage
  0 siblings, 0 replies; 200+ results
From: Wathsala Wathawana Vithanage @ 2024-10-21  2:05 UTC (permalink / raw)
  To: dev, Nathan Southern, thomas, Honnappa Nagarahalli; +Cc: nd, nd

Here is the updated RFC https://inbox.dpdk.org/dev/20241021015246.304431-1-wathsala.vithanage@arm.com/#t

Thanks

--wathsala

>
> Subject: DPDK - PCIe Steering Tags Meeting on 10/23/24
> 
> Hi all,
> 
> This is an invitation to discuss adding PCIe steering tags support to DPDK.
> We have had brief conversations over the idea at the DPDK summit.
> Steering tags allow stashing of descriptors and packet data closer to the
> CPUs, possibly allowing for lower latency and higher throughput.
> This feature requires contributions from CPU vendors and NIC vendors.
> The goal of the meeting is to present the next version of the API and seek
> support for implementation from other participants in the community.
> 
> I will be sending out the RFC some time this week, so there will be a plenty of
> time before the meeting to go over it.
> 
> Agenda:
> - Brief introduction to the feature
> - Introduce the APIs from RFC v2 (this will be submitted to the community
> before the call)
> - Dependencies on kernel support - API for reading steering tags
> - Addressing ABI in advance as patches will not be ready by 24.11
> 
> Please join the call if you are interested in the topic.
> LXF meeting registration link: https://zoom-
> lfx.platform.linuxfoundation.org/meeting/94917063595?password=77f3662
> 5-ad41-4b9c-b067-d33e68c3a29e&invite=true
> 
> Thanks.
> 
> --wathsala


^ permalink raw reply	[relevance 0%]

* release candidate 24.11-rc1
@ 2024-10-18 21:47  4% Thomas Monjalon
  2024-10-29 10:19  0% ` Xu, HailinX
  2024-10-29 19:31  0% ` Thinh Tran
  0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2024-10-18 21:47 UTC (permalink / raw)
  To: announce

A new DPDK release candidate is ready for testing:
	https://git.dpdk.org/dpdk/tag/?id=v24.11-rc1

There are 630 new patches in this snapshot,
including many API/ABI compatibility breakages.
This release won't be ABI-compatible with previous ones.

Release notes:
	https://doc.dpdk.org/guides/rel_notes/release_24_11.html

Highlights of 24.11-rc1:
	- bit set and atomic bit manipulation
	- IPv6 address API
	- Ethernet link lanes
	- flow table index action
	- Cisco enic VF
	- Marvell CN20K
	- symmetric crypto SM4
	- asymmetric crypto EdDSA
	- event device pre-scheduling
	- event device independent enqueue

Please test and report issues on bugs.dpdk.org.

A few more new APIs may be added in -rc2.
DPDK 24.11-rc2 is expected in more than two weeks (early November).

Thank you everyone



^ permalink raw reply	[relevance 4%]

Results 201-400 of ~18000
-- links below jump to the message on this page --
2020-08-14 17:34     [dpdk-dev] [PATCH] eal: add option to put timestamp on console output Stephen Hemminger
2024-10-24  3:18     ` [PATCH v27 00/14] Log subsystem changes Stephen Hemminger
2024-10-24  3:18  4%   ` [PATCH v27 14/14] doc: add release note about log library Stephen Hemminger
2024-10-24 19:02     ` [PATCH v28 00/13] Logging subsystem improvements Stephen Hemminger
2024-10-24 19:02  4%   ` [PATCH v28 13/13] doc: add release note about log library Stephen Hemminger
2024-10-25 21:45     ` [PATCH v29 00/13] Logging subsystem enhancements Stephen Hemminger
2024-10-25 21:45  4%   ` [PATCH v29 13/13] doc: add release note about log library Stephen Hemminger
2024-10-27 17:24     ` [PATCH v30 00/13] Log library enhancements Stephen Hemminger
2024-10-27 17:24  4%   ` [PATCH v30 13/13] doc: add release note about log library Stephen Hemminger
2024-03-20 10:55     [PATCH 0/2] introduce PM QoS interface Huisong Li
2024-10-21 11:42  4% ` [PATCH v11 0/2] power: " Huisong Li
2024-10-21 11:42  5%   ` [PATCH v11 1/2] power: introduce PM QoS API on CPU wide Huisong Li
2024-10-22  9:08  0%     ` Konstantin Ananyev
2024-10-22  9:41  0%       ` lihuisong (C)
2024-10-23  4:09  4% ` [PATCH v12 0/3] power: introduce PM QoS interface Huisong Li
2024-10-23  4:09  5%   ` [PATCH v12 1/3] power: introduce PM QoS API on CPU wide Huisong Li
2024-10-25  9:18  4% ` [PATCH v13 0/3] power: introduce PM QoS interface Huisong Li
2024-10-25  9:18  5%   ` [PATCH v13 1/3] power: introduce PM QoS API on CPU wide Huisong Li
2024-10-25 12:08  0%     ` Tummala, Sivaprasad
2024-10-29 13:28  4% ` [PATCH v14 0/3] power: introduce PM QoS interface Huisong Li
2024-10-29 13:28  5%   ` [PATCH v14 1/3] power: introduce PM QoS API on CPU wide Huisong Li
2024-11-04  9:13  0%   ` [PATCH v14 0/3] power: introduce PM QoS interface lihuisong (C)
2024-11-11  2:25  4% ` [PATCH v15 " Huisong Li
2024-11-11  2:25  5%   ` [PATCH v15 1/3] power: introduce PM QoS API on CPU wide Huisong Li
2024-11-11 10:29  0%   ` [PATCH v15 0/3] power: introduce PM QoS interface Thomas Monjalon
2024-11-11  9:14  4% ` [RESEND PATCH " Huisong Li
2024-11-11  9:14  5%   ` [RESEND PATCH v15 1/3] power: introduce PM QoS API on CPU wide Huisong Li
2024-03-20 21:05     [PATCH 00/15] fix packing of structs when building with MSVC Tyler Retzlaff
2024-12-31 18:37     ` [PATCH v8 00/29] " Andre Muezerie
2024-12-31 18:38       ` [PATCH v8 27/29] lib/net: replace packed attributes Andre Muezerie
2025-01-08 12:01  3%     ` David Marchand
2025-01-09  2:49  0%       ` Andre Muezerie
2025-01-09  2:45     ` [PATCH v10 00/30] fix packing of structs when building with MSVC Andre Muezerie
2025-01-09  2:46  1%   ` [PATCH v10 27/30] lib/net: replace packed attributes Andre Muezerie
2025-01-10 22:16     ` [PATCH v11 00/30] fix packing of structs when building with MSVC Andre Muezerie
2025-01-10 22:16  1%   ` [PATCH v11 27/30] net: replace packed attributes Andre Muezerie
2024-04-17 23:41     [PATCH 00/16] remove use of VLAs for Windows built code Tyler Retzlaff
2025-02-06  1:33     ` [PATCH v22 00/27] remove use of VLAs for Windows Andre Muezerie
2025-02-06 20:44  4%   ` David Marchand
2025-02-07 14:23  3%     ` Konstantin Ananyev
2025-02-18 14:22  0%       ` David Marchand
2025-02-19 14:28  0%         ` Konstantin Ananyev
2024-07-15 22:11     [RFC v2] ethdev: an API for cache stashing hints Wathsala Vithanage
2024-10-21  1:52     ` [RFC v3 0/2] An API for Stashing Packets into CPU caches Wathsala Vithanage
2024-10-21  1:52       ` [RFC v3 2/2] ethdev: introduce the cache stashing hints API Wathsala Vithanage
2024-12-03 21:13  3%     ` Stephen Hemminger
2024-12-05 15:40  3%       ` David Marchand
2024-12-05 21:00  0%         ` Stephen Hemminger
2024-07-20 16:50     [PATCH v1 0/4] power: refactor power management library Sivaprasad Tummala
2024-08-26 13:06     ` [PATCH v2 " Sivaprasad Tummala
2024-08-26 13:06       ` [PATCH v2 2/4] power: refactor uncore " Sivaprasad Tummala
2024-08-27 13:02         ` lihuisong (C)
2024-10-08  6:19           ` Tummala, Sivaprasad
2024-10-22  2:05  0%         ` lihuisong (C)
2024-09-05 10:14     [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs Akhil Goyal
2024-10-04  3:54     ` Ferruh Yigit
2024-10-04  7:04       ` David Marchand
2024-10-10  0:49         ` Ferruh Yigit
2024-10-10  6:18           ` [EXTERNAL] " Akhil Goyal
2024-10-28 11:15  3%         ` Dodji Seketeli
2024-10-04  9:38       ` Dodji Seketeli
2024-10-04 17:45         ` [EXTERNAL] " Akhil Goyal
2024-10-28 10:55  4%       ` Dodji Seketeli
2024-10-10  0:35         ` Ferruh Yigit
2024-10-28 10:12  4%       ` Dodji Seketeli
2024-09-20 16:32     [RFC PATCH] mempool: obey configured cache size Morten Brørup
2025-02-21 15:13  4% ` [RFC PATCH v18] mempool: fix mempool " Morten Brørup
2025-02-21 19:05  3% ` [RFC PATCH v19] " Morten Brørup
2025-02-21 20:27  3% ` [RFC PATCH v20] " Morten Brørup
2024-10-01 18:11     [PATCH v2 0/3] net: add thread-safe crc api Arkadiusz Kusztal
2024-10-01 18:11     ` [PATCH v2 1/3] " Arkadiusz Kusztal
2024-10-08  3:42       ` Ferruh Yigit
2024-10-08 20:51         ` Kusztal, ArkadiuszX
2024-10-09  1:03           ` Ferruh Yigit
2025-02-06 20:54  0%         ` Kusztal, ArkadiuszX
2024-12-02 22:36  3%   ` Stephen Hemminger
2025-02-06 20:43  0%     ` Kusztal, ArkadiuszX
2025-02-06 20:38  4%   ` [PATCH v3] " Arkadiusz Kusztal
2025-02-07  6:37  4%     ` [PATCH v4] " Arkadiusz Kusztal
2025-02-07 18:24  4%       ` [PATCH v5] " Arkadiusz Kusztal
2025-02-10 21:27  4%         ` [PATCH v6] " Arkadiusz Kusztal
2025-02-11  6:23  0%           ` Stephen Hemminger
2025-02-11  8:35  0%             ` Kusztal, ArkadiuszX
2025-02-11  9:02  4%           ` [PATCH v7] " Arkadiusz Kusztal
2024-10-11  1:38     [PATCH] doc: correct definition of Stats per queue feature Stephen Hemminger
2024-10-11 19:25     ` Ferruh Yigit
2024-11-26 23:39  0%   ` Thomas Monjalon
2024-10-11  9:49     [PATCH v14 0/4] add support for self monitoring Tomasz Duszynski
2024-10-25  8:54     ` [PATCH v15 " Tomasz Duszynski
2024-10-25  8:54       ` [PATCH v15 4/4] eal: add PMU support to tracing library Tomasz Duszynski
2024-11-12 23:09  3%     ` Stephen Hemminger
2024-11-15 10:24  0%       ` [EXTERNAL] " Tomasz Duszynski
2024-11-18  7:37       ` [PATCH v16 0/4] add support for self monitoring Tomasz Duszynski
2024-11-18  7:37         ` [PATCH v16 1/4] lib: add generic support for reading PMU events Tomasz Duszynski
2024-12-06 18:15  3%       ` Konstantin Ananyev
2025-01-07  7:45  0%         ` Tomasz Duszynski
2024-10-15 18:25     [RFC v2 01/10] event/dsw: remove single event enqueue and dequeue Mattias Rönnblom
2024-10-17  6:38     ` [RFC v3 00/10] eventdev: remove single-event " Mattias Rönnblom
2024-10-21  7:25  0%   ` Jerin Jacob
2024-10-21  8:38  0%     ` Mattias Rönnblom
2024-10-21  8:51  3%   ` [PATCH " Mattias Rönnblom
2024-10-21  8:51 11%     ` [PATCH 10/10] eventdev: remove single event " Mattias Rönnblom
2024-10-21  9:21  0%     ` [PATCH 00/10] eventdev: remove single-event " Mattias Rönnblom
2024-10-21  9:06  3%   ` Mattias Rönnblom
2024-10-21  9:06 11%     ` [PATCH 10/10] eventdev: remove single event " Mattias Rönnblom
2024-10-17 19:56     DPDK - PCIe Steering Tags Meeting on 10/23/24 Wathsala Wathawana Vithanage
2024-10-21  2:05  0% ` Wathsala Wathawana Vithanage
2024-10-17 22:58     [PATCH 0/2] gpudev: annotate memory allocation Stephen Hemminger
2024-11-09  0:22  3% ` Stephen Hemminger
2024-10-18 21:47  4% release candidate 24.11-rc1 Thomas Monjalon
2024-10-29 10:19  0% ` Xu, HailinX
2024-10-29 19:31  0% ` Thinh Tran
2024-10-20  9:22     [PATCH v6 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-21  4:07     ` [PATCH v7 " Sivaprasad Tummala
2024-10-21  4:07       ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
2024-10-22  3:03  3%     ` lihuisong (C)
2024-10-22  7:13  0%       ` Tummala, Sivaprasad
2024-10-22  8:36  0%         ` lihuisong (C)
2024-10-22 19:05  3% [PATCH v6 0/3] add ec points to sm2 op Arkadiusz Kusztal
2024-10-22 19:05  4% ` [PATCH v6 1/3] cryptodev: " Arkadiusz Kusztal
2024-10-23  1:19  0% ` [PATCH v6 0/3] " Stephen Hemminger
2024-10-23  8:19  3% [PATCH v7 " Arkadiusz Kusztal
2024-10-23  8:19  4% ` [PATCH v7 1/3] cryptodev: " Arkadiusz Kusztal
2024-10-28  9:18     [PATCH V2 7/7] mlx5: add backward compatibility for RDMA monitor Minggang Li(Gavin)
2024-10-29 13:42     ` [PATCH V3 0/7] port probe time optimization Minggang Li(Gavin)
2024-10-29 13:42       ` [PATCH V3 7/7] mlx5: add backward compatibility for RDMA monitor Minggang Li(Gavin)
2024-10-29 16:26  3%     ` Stephen Hemminger
2024-10-30  8:25  0%       ` Minggang(Gavin) Li
2024-10-30  8:19     [PATCH 0/3] NFP PMD enhancement Chaoyong He
2024-10-30  8:19  6% ` [PATCH 3/3] net/nfp: add support for port identify Chaoyong He
2024-10-30  8:27     ` [PATCH 0/4] NFP PMD enhancement Chaoyong He
2024-10-30  8:27  6%   ` [PATCH 4/4] net/nfp: add support for port identify Chaoyong He
2024-11-01  2:57       ` [PATCH v3 0/4] NFP PMD enhancement Chaoyong He
2024-11-01  2:57  6%     ` [PATCH v3 4/4] net/nfp: add support for port identify Chaoyong He
2024-11-04  1:34         ` [PATCH v4 0/4] NFP PMD enhancement Chaoyong He
2024-11-04  1:34  6%       ` [PATCH v4 4/4] net/nfp: add LED support Chaoyong He
2024-11-04  9:36  3% [PATCH v8 0/3] add ec points to sm2 op Arkadiusz Kusztal
2024-11-04  9:36  4% ` [PATCH v8 1/3] cryptodev: " Arkadiusz Kusztal
2024-11-06 10:08  0%   ` [EXTERNAL] " Akhil Goyal
2024-11-06 15:17  0%     ` Kusztal, ArkadiuszX
2024-11-07  8:04     [PATCH] graph: optimize graph search when scheduling nodes Huichao cai
2024-11-07  9:37  3% ` [EXTERNAL] " Jerin Jacob
2024-11-08  1:39  4%   ` Huichao Cai
2024-11-08 12:22  3%     ` Jerin Jacob
2024-11-08 13:38  0%       ` David Marchand
2024-11-11  5:38  0%         ` Jerin Jacob
2024-11-12  8:51  0%           ` David Marchand
2024-11-12  9:35  3%             ` Jerin Jacob
2024-11-12 12:57  0%               ` Huichao Cai
2024-11-11  4:03     ` [PATCH v2] graph: mcore: optimize graph search Huichao Cai
2024-11-11  5:46  3%   ` [EXTERNAL] " Jerin Jacob
2024-11-13  7:35  5%   ` [PATCH v3 1/2] " Huichao Cai
2024-11-13  7:35  5%     ` [PATCH v3 2/2] graph: add alignment to the member of rte_node Huichao Cai
2024-11-14  7:14  0%       ` [EXTERNAL] " Jerin Jacob
2024-11-14  8:45  5%     ` [PATCH v4 1/2] graph: mcore: optimize graph search Huichao Cai
2024-11-14  8:45  5%       ` [PATCH v4 2/2] graph: add alignment to the member of rte_node Huichao Cai
2024-11-14 10:05  0%         ` [EXTERNAL] " Jerin Jacob
2024-11-15  1:55  5%         ` [PATCH v5 1/1] graph: improve node layout Huichao Cai
2024-11-15 14:23  0%           ` Thomas Monjalon
2024-11-15 15:57  0%             ` [EXTERNAL] " Jerin Jacob
2024-12-13  2:21 10%       ` [PATCH v5] graph: mcore: optimize graph search Huichao Cai
2024-12-13 14:36  3%         ` David Marchand
2024-12-16  1:43 11%         ` [PATCH v6] " Huichao Cai
2024-12-16 14:49  4%           ` David Marchand
2024-12-17  9:04  0%             ` David Marchand
2025-01-20 14:36  4%           ` Huichao Cai
2025-02-06  2:53 11%           ` [PATCH v7 1/1] " Huichao Cai
2025-02-06 20:10  0%             ` Patrick Robb
2025-02-07  1:39 11%             ` [PATCH v8] " Huichao Cai
2025-02-22  6:59  0%               ` [EXTERNAL] " Kiran Kumar Kokkilagadda
2024-11-08 18:17     [PATCH] config: limit lcore variable maximum size to 4k David Marchand
2024-11-08 18:35     ` Morten Brørup
2024-11-08 19:53       ` Morten Brørup
2024-11-08 22:13         ` Thomas Monjalon
2024-11-08 22:49  3%       ` Morten Brørup
2024-11-11 12:52  5% [PATCH] power: fix a typo in the PM QoS guide Huisong Li
2024-11-12  8:35  5% ` [PATCH v2] " Huisong Li
2024-11-13  0:59  5% ` [PATCH v3] " Huisong Li
2024-11-12  9:31     rte_fib network order bug Robin Jarry
2024-11-13 10:42     ` Medvedkin, Vladimir
2024-11-13 13:27       ` Robin Jarry
2024-11-13 19:39         ` Medvedkin, Vladimir
2024-11-14  7:43           ` Morten Brørup
2024-11-14 10:18             ` Robin Jarry
2024-11-14 14:35               ` Morten Brørup
2024-11-15 13:01                 ` Robin Jarry
2024-11-15 13:52  3%               ` Morten Brørup
2024-11-15 14:28  3%                 ` Robin Jarry
2024-11-15 16:20  0%                   ` Stephen Hemminger
2024-11-17 15:04  3%                     ` Vladimir Medvedkin
2024-11-14  1:10     [PATCH 0/3] introduce rte_memset_sensative Stephen Hemminger
2025-02-11 17:35     ` [PATCH v5 00/11] memset security fixes Stephen Hemminger
2025-02-11 17:35       ` [PATCH v5 02/11] eal: add new secure free function Stephen Hemminger
2025-02-12  2:01         ` fengchengwen
2025-02-12  6:46  3%       ` Stephen Hemminger
2024-11-20 22:24  3% Tech Board Meeting Minutes - 2024-Nov-13 Honnappa Nagarahalli
2024-11-22 12:53     [RFC PATCH 00/21] Reduce code duplication across Intel NIC drivers Bruce Richardson
2025-01-24 16:28     ` [PATCH v6 00/25] " Bruce Richardson
2025-01-24 16:28  1%   ` [PATCH v6 01/25] net: move intel drivers to intel subdirectory Bruce Richardson
2024-11-26 13:14  3% [PATCH v1 0/4] Adjust wording for NUMA vs. socket ID in DPDK Anatoly Burakov
2024-11-27 10:03     rte_event_eth_tx_adapter_enqueue() short enqueue Mattias Rönnblom
2024-11-27 10:38     ` Bruce Richardson
2024-11-27 10:53       ` Mattias Rönnblom
2024-11-27 11:07         ` Bruce Richardson
2024-12-19 15:59           ` Morten Brørup
2024-12-19 17:12  3%         ` Bruce Richardson
2024-11-28 17:07  4% [PATCH v1] doc: update release notes for 24.11 John McNamara
2024-11-30 23:50  4% DPDK 24.11 released Thomas Monjalon
2024-12-03  7:54 11% [PATCH] version: 25.03-rc0 David Marchand
2024-12-04 10:06  3% ` Thomas Monjalon
2024-12-04 12:05  3%   ` David Marchand
2024-12-05 17:57     [PATCH 0/3] Defer lcore variables allocation David Marchand
2024-12-06 11:01     ` Mattias Rönnblom
2024-12-09 11:03       ` David Marchand
2024-12-09 15:39         ` Mattias Rönnblom
2024-12-09 17:40  3%       ` David Marchand
2024-12-10  9:41  0%         ` Mattias Rönnblom
2024-12-16  4:14  4% DTS WG Meeting Minutes - December 5, 2024 Patrick Robb
2024-12-16  4:18  3% Community CI Meeting Minutes - December 12, 2024 Patrick Robb
2024-12-19  7:34     [RFC PATCH] eventdev: adapter API to configure multiple Rx queues Shijith Thotton
2024-12-22 16:28     ` Naga Harish K, S V
2025-01-02  9:40       ` Shijith Thotton
2025-01-13 12:06         ` Shijith Thotton
2025-01-15 16:52           ` Naga Harish K, S V
2025-01-16  6:27             ` Shijith Thotton
2025-01-20  9:52               ` Naga Harish K, S V
2025-01-20 18:23                 ` Shijith Thotton
2025-01-22  5:17                   ` Naga Harish K, S V
2025-01-22 13:42                     ` Shijith Thotton
2025-01-24  3:52  3%                   ` Naga Harish K, S V
2025-01-24 10:00  3%                     ` Shijith Thotton
2025-01-29  5:04  0%                       ` Naga Harish K, S V
2025-01-29  7:43  4%                         ` Jerin Jacob
2025-01-30 15:30  0%                           ` Naga Harish K, S V
2025-01-30 16:48  0%                             ` Jerin Jacob
2025-02-03  4:37  0%                               ` Naga Harish K, S V
2025-02-04  7:15  0%                                 ` Jerin Jacob
2024-12-24  7:36     [v1 00/16] crypto/virtio: vDPA and asymmetric support Gowrishankar Muthukrishnan
2024-12-24  7:37  1% ` [v1 15/16] crypto/virtio: add vhost backend to virtio_user Gowrishankar Muthukrishnan
2025-01-05  9:57  5% [PATCH] ring: add the second version of the RTS interface Huichao Cai
2025-01-05 15:13  5% ` [PATCH v2] " Huichao Cai
2025-01-08  1:41  3%   ` Huichao Cai
2025-01-14 15:04  0%     ` Thomas Monjalon
2025-01-05 15:09  5% Huichao Cai
2025-01-06 16:45     [PATCH 0/2] compile ipsec on Windows Andre Muezerie
2025-01-06 16:45     ` [PATCH 1/2] lib/ipsec: " Andre Muezerie
2025-01-09 15:31  3%   ` Konstantin Ananyev
2025-01-07 18:44     [v2 0/4] crypto/virtio: add vDPA backend support Gowrishankar Muthukrishnan
2025-01-07 18:44  1% ` [v2 3/4] crypto/virtio: add vhost backend to virtio_user Gowrishankar Muthukrishnan
2025-02-06 13:14  0%   ` Maxime Coquelin
2025-01-13  2:55     [PATCH v1 0/2] ethdev: fix skip valid port in probing callback Huisong Li
2025-01-13  2:55  2% ` [PATCH v1 2/2] " Huisong Li
     [not found]     <20220825024425.10534-1-lihuisong@huawei.com>
2024-09-29  5:52     ` [PATCH RESEND v7 0/5] app/testpmd: support multiple process attach and detach port Huisong Li
2024-09-29  5:52       ` [PATCH RESEND v7 2/5] ethdev: fix skip valid port in probing callback Huisong Li
2024-12-10  1:50  0%     ` lihuisong (C)
2025-01-10  3:21  0%       ` lihuisong (C)
2025-01-10 17:54  3%         ` Stephen Hemminger
2025-01-13  2:32  0%           ` lihuisong (C)
2024-10-08  2:32       ` [PATCH RESEND v7 0/5] app/testpmd: support multiple process attach and detach port lihuisong (C)
2024-10-18  1:04         ` Ferruh Yigit
2024-10-18  2:48           ` lihuisong (C)
2024-10-26  4:11  0%         ` lihuisong (C)
2024-10-29 22:12  0%         ` Ferruh Yigit
2024-10-30  4:06  0%           ` lihuisong (C)
2025-01-20  6:42  2% ` [PATCH v8] app/testpmd: add attach and detach port for multiple process Huisong Li
2025-01-24 16:14     [PATCH 1/2] trace: support expression for blob length David Marchand
2025-03-04 16:06     ` [PATCH v4 0/5] Trace point framework enhancement for dmadev David Marchand
2025-03-04 16:06  4%   ` [PATCH v4 2/5] dmadev: avoid copies in tracepoints David Marchand
2025-01-28 16:36     [PATCH 0/4] remove common iavf and idpf drivers Bruce Richardson
2025-01-30 12:48     ` [PATCH v2 " Bruce Richardson
2025-01-30 12:48  2%   ` [PATCH v2 1/4] drivers: merge common and net " Bruce Richardson
2025-01-30 15:12     ` [PATCH v3 0/4] remove common iavf and " Bruce Richardson
2025-01-30 15:12  2%   ` [PATCH v3 1/4] drivers: merge common and net " Bruce Richardson
2025-02-03  8:36  2%     ` Shetty, Praveen
2025-02-05 11:55     ` [PATCH v4 0/4] remove common iavf and " Bruce Richardson
2025-02-05 11:55  2%   ` [PATCH v4 1/4] drivers: merge common and net " Bruce Richardson
2025-02-10 16:44     ` [PATCH v5 0/4] remove common iavf and " Bruce Richardson
2025-02-10 16:44  3%   ` [PATCH v5 1/4] drivers: merge common and net " Bruce Richardson
2025-02-10 16:44  2%   ` [PATCH v5 3/4] drivers: move iavf common folder to iavf net Bruce Richardson
2025-02-11 14:12  0%     ` Stokes, Ian
2025-01-31 12:58     [PATCH v1 00/42] Merge Intel IGC and E1000 drivers, and update E1000 base code Anatoly Burakov
2025-02-03  8:18  3% ` David Marchand
2025-02-04 15:35  0%   ` Burakov, Anatoly
2025-02-05 10:05  0%     ` David Marchand
2025-02-07 12:44     ` [PATCH v3 00/36] " Anatoly Burakov
2025-02-07 12:44  1%   ` [PATCH v3 02/36] net/igc: merge with net/e1000 Anatoly Burakov
2025-02-18 11:58     [PATCH] ethdev: fix get_reg_info Thierry Herbelot
2025-02-19 18:45     ` Stephen Hemminger
2025-03-07  9:33  3%   ` fengchengwen
2025-02-21 12:47     [PATCH] sched: fix wrr parameter data type Megha Ajmera
2025-02-21 19:14  3% ` Stephen Hemminger
2025-02-21 17:41     [v3 0/6] crypto/virtio: enhancements for RSA and vDPA Gowrishankar Muthukrishnan
2025-02-21 17:41  1% ` [v3 4/6] crypto/virtio: add vDPA backend Gowrishankar Muthukrishnan
2025-02-22  9:16     [v4 0/6] crypto/virtio: enhancements for RSA and vDPA Gowrishankar Muthukrishnan
2025-02-22  9:16  1% ` [v4 4/6] crypto/virtio: add vDPA backend Gowrishankar Muthukrishnan
2025-02-26 18:58     [v5 0/6] crypto/virtio: enhancements for RSA and vDPA Gowrishankar Muthukrishnan
2025-02-26 18:58  1% ` [v5 4/6] crypto/virtio: add vDPA backend Gowrishankar Muthukrishnan
2025-03-05  6:16     [v6 0/6] crypto/virtio: enhancements for RSA and vDPA Gowrishankar Muthukrishnan
2025-03-05  6:16  1% ` [v6 4/6] crypto/virtio: add vDPA backend Gowrishankar Muthukrishnan
2025-03-05 21:23  6% [RFC] eal: add new function versioning macros David Marchand
2025-03-06 12:50  6% ` [RFC v2 1/2] " David Marchand
2025-03-06 12:50  7%   ` [RFC v2 2/2] build: generate symbol maps David Marchand
2025-03-11  9:55  3% ` [RFC v3 0/8] Symbol versioning and export rework David Marchand
2025-03-11  9:56 13%   ` [RFC v3 3/8] eal: rework function versioning macros David Marchand
2025-03-13 16:53  0%     ` Bruce Richardson
2025-03-13 17:09  0%       ` David Marchand
2025-03-11  9:56 18%   ` [RFC v3 5/8] build: generate symbol maps David Marchand
2025-03-13 17:26  0%     ` Bruce Richardson
2025-03-14 15:38  0%       ` David Marchand
2025-03-14 15:27  0%     ` Andre Muezerie
2025-03-14 15:51  4%       ` David Marchand
2025-03-11  9:56 16%   ` [RFC v3 7/8] build: use dynamically generated version maps David Marchand
2025-03-11 10:18  3%   ` [RFC v3 0/8] Symbol versioning and export rework Morten Brørup
2025-03-11 13:43  0%     ` David Marchand
2025-03-17 15:42  3% ` [RFC v4 " David Marchand
2025-03-17 15:42 16%   ` [RFC v4 3/8] eal: rework function versioning macros David Marchand
2025-03-17 15:43 18%   ` [RFC v4 5/8] build: generate symbol maps David Marchand
2025-03-10 21:42     [patch v2 0/6] Support VMBUS channels without monitoring enabled longli
2025-03-10 23:20     ` Stephen Hemminger
2025-03-12  0:33  4%   ` [EXTERNAL] " Long Li
2025-03-12 15:36  0%     ` Stephen Hemminger
2025-03-13 22:54  4% Community CI Meeting Minutes - January 9, 2025 Patrick Robb
2025-03-14 12:57  1% [PATCH] raw/cnxk_gpio: switch to character based GPIO interface Tomasz Duszynski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).