* [RFC PATCH 0/2] power: refactor power management library
@ 2024-02-20 15:33 Sivaprasad Tummala
2024-02-20 15:33 ` Sivaprasad Tummala
` (3 more replies)
0 siblings, 4 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-02-20 15:33 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the creation
of dedicated directories for each driver within 'drivers/power/core/*' and
'drivers/power/uncore/*'.
This refactor is intended to improve code organization and
maintainability. Giving each driver its own directory allows development
to focus on individual drivers and eases the integration of future
enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Please note that this RFC is at an early stage and is intended primarily
to solicit feedback and comments. It has not yet been build-tested or
functionally tested.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
power: refactor core power management library
power: refactor uncore power management library
drivers/meson.build | 1 +
drivers/power/core/acpi/meson.build | 8 +
.../power/core/acpi}/power_acpi_cpufreq.c | 19 ++
.../power/core/acpi}/power_acpi_cpufreq.h | 0
drivers/power/core/amd-pstate/meson.build | 8 +
.../amd-pstate}/power_amd_pstate_cpufreq.c | 19 ++
.../amd-pstate}/power_amd_pstate_cpufreq.h | 0
drivers/power/core/cppc/meson.build | 8 +
.../power/core/cppc}/power_cppc_cpufreq.c | 19 ++
.../power/core/cppc}/power_cppc_cpufreq.h | 0
.../power/core/kvm-vm}/guest_channel.c | 0
.../power/core/kvm-vm}/guest_channel.h | 0
drivers/power/core/kvm-vm/meson.build | 20 ++
.../power/core/kvm-vm}/power_kvm_vm.c | 19 ++
.../power/core/kvm-vm}/power_kvm_vm.h | 0
drivers/power/core/meson.build | 12 +
drivers/power/core/pstate/meson.build | 8 +
.../power/core/pstate}/power_pstate_cpufreq.c | 19 ++
.../power/core/pstate}/power_pstate_cpufreq.h | 0
drivers/power/meson.build | 9 +
drivers/power/uncore/intel/meson.build | 9 +
.../power/uncore/intel}/power_intel_uncore.c | 15 +
.../power/uncore/intel}/power_intel_uncore.h | 0
drivers/power/uncore/meson.build | 8 +
lib/power/meson.build | 7 -
lib/power/power_common.h | 11 +
lib/power/rte_power.c | 305 ++++++++----------
lib/power/rte_power.h | 207 ++++++++++--
lib/power/rte_power_uncore.c | 163 ++++------
lib/power/rte_power_uncore.h | 150 ++++++++-
lib/power/version.map | 13 +
31 files changed, 742 insertions(+), 315 deletions(-)
create mode 100644 drivers/power/core/acpi/meson.build
rename {lib/power => drivers/power/core/acpi}/power_acpi_cpufreq.c (95%)
rename {lib/power => drivers/power/core/acpi}/power_acpi_cpufreq.h (100%)
create mode 100644 drivers/power/core/amd-pstate/meson.build
rename {lib/power => drivers/power/core/amd-pstate}/power_amd_pstate_cpufreq.c (95%)
rename {lib/power => drivers/power/core/amd-pstate}/power_amd_pstate_cpufreq.h (100%)
create mode 100644 drivers/power/core/cppc/meson.build
rename {lib/power => drivers/power/core/cppc}/power_cppc_cpufreq.c (96%)
rename {lib/power => drivers/power/core/cppc}/power_cppc_cpufreq.h (100%)
rename {lib/power => drivers/power/core/kvm-vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/core/kvm-vm}/guest_channel.h (100%)
create mode 100644 drivers/power/core/kvm-vm/meson.build
rename {lib/power => drivers/power/core/kvm-vm}/power_kvm_vm.c (83%)
rename {lib/power => drivers/power/core/kvm-vm}/power_kvm_vm.h (100%)
create mode 100644 drivers/power/core/meson.build
create mode 100644 drivers/power/core/pstate/meson.build
rename {lib/power => drivers/power/core/pstate}/power_pstate_cpufreq.c (96%)
rename {lib/power => drivers/power/core/pstate}/power_pstate_cpufreq.h (100%)
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/uncore/intel/meson.build
rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.c (95%)
rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.h (100%)
create mode 100644 drivers/power/uncore/meson.build
--
2.25.1
* [RFC PATCH 1/2] power: refactor core power management library
2024-02-20 15:33 [RFC PATCH 0/2] power: refactor power management library Sivaprasad Tummala
2024-02-20 15:33 ` Sivaprasad Tummala
@ 2024-02-20 15:33 ` Sivaprasad Tummala
2024-02-27 16:18 ` Ferruh Yigit
` (2 more replies)
2024-02-20 15:33 ` [RFC PATCH 2/2] power: refactor uncore " Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
3 siblings, 3 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-02-20 15:33 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the core power management library. Driver
implementations move from the 'lib/power' directory into dedicated
directories under 'drivers/power/core/*', and the meson.build files
are adjusted so that individual drivers can be enabled selectively.
Each driver now registers a 'struct rte_power_ops' callback table
through RTE_POWER_REGISTER_OPS(), replacing the global function
pointers and per-environment switch statements in rte_power.c.
The result is a clearer structure for driver implementations and a
base for future work, allowing development to focus on individual
drivers and new drivers to be integrated more easily.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
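For reviewers: the registration flow this patch introduces can be
sketched in isolation as below. This is a minimal, self-contained
illustration and not the code in the patch. Every demo_* and toy_* name
is hypothetical, the callback table is cut down to three entries, and
DEMO_REGISTER_OPS() is only an assumed shape for
RTE_POWER_REGISTER_OPS(), whose definition is not shown in this part of
the diff. It relies on the GCC/Clang constructor attribute.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DPDK types; not the real definitions. */
enum demo_env { DEMO_ENV_NOT_SET = 0, DEMO_ENV_ACPI, DEMO_ENV_MAX };

struct demo_power_ops {
	enum demo_env env;
	int registered;                     /* set by the library, not the driver */
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
	int (*get_caps)(unsigned int lcore_id);
};

/* One slot per environment, filled in as drivers register themselves. */
static struct demo_power_ops demo_ops_table[DEMO_ENV_MAX];

/* Copy a driver's ops into the slot selected by op->env; 0 on success. */
int
demo_register_ops(const struct demo_power_ops *op)
{
	struct demo_power_ops *slot;

	if (op == NULL || op->env <= DEMO_ENV_NOT_SET || op->env >= DEMO_ENV_MAX)
		return -EINVAL;
	if (op->init == NULL || op->exit == NULL || op->get_caps == NULL)
		return -EINVAL;

	slot = &demo_ops_table[op->env];
	if (slot->registered)               /* test the table slot, not the caller's struct */
		return -EEXIST;

	*slot = *op;                        /* struct copy carries every callback */
	slot->registered = 1;
	return 0;
}

/* Return the registered ops for env, or NULL if env is invalid or unregistered. */
const struct demo_power_ops *
demo_get_ops(enum demo_env env)
{
	if (env <= DEMO_ENV_NOT_SET || env >= DEMO_ENV_MAX)
		return NULL;
	if (!demo_ops_table[env].registered)
		return NULL;
	return &demo_ops_table[env];
}

/* Assumed macro shape: register the ops from a constructor, before main(). */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) demo_ctor_##ops(void) \
	{ \
		(void)demo_register_ops(&(ops)); \
	}

/* A toy driver that only echoes its lcore id from get_caps. */
static int toy_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int toy_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int toy_get_caps(unsigned int lcore_id) { return (int)lcore_id; }

static struct demo_power_ops toy_acpi_ops = {
	.env = DEMO_ENV_ACPI,
	.init = toy_init,
	.exit = toy_exit,
	.get_caps = toy_get_caps,
};

DEMO_REGISTER_OPS(toy_acpi_ops)
```

Testing the table slot rather than the caller's struct for an existing
registration, and copying the whole struct instead of listing callbacks
one by one, are choices of this sketch. They keep a second registration
for the same environment from overwriting the first and make it harder
to forget a callback when the table grows.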
drivers/meson.build | 1 +
drivers/power/core/acpi/meson.build | 8 +
.../power/core/acpi}/power_acpi_cpufreq.c | 19 ++
.../power/core/acpi}/power_acpi_cpufreq.h | 0
drivers/power/core/amd-pstate/meson.build | 8 +
.../amd-pstate}/power_amd_pstate_cpufreq.c | 19 ++
.../amd-pstate}/power_amd_pstate_cpufreq.h | 0
drivers/power/core/cppc/meson.build | 8 +
.../power/core/cppc}/power_cppc_cpufreq.c | 19 ++
.../power/core/cppc}/power_cppc_cpufreq.h | 0
.../power/core/kvm-vm}/guest_channel.c | 0
.../power/core/kvm-vm}/guest_channel.h | 0
drivers/power/core/kvm-vm/meson.build | 20 ++
.../power/core/kvm-vm}/power_kvm_vm.c | 19 ++
.../power/core/kvm-vm}/power_kvm_vm.h | 0
drivers/power/core/meson.build | 12 +
drivers/power/core/pstate/meson.build | 8 +
.../power/core/pstate}/power_pstate_cpufreq.c | 19 ++
.../power/core/pstate}/power_pstate_cpufreq.h | 0
drivers/power/meson.build | 8 +
lib/power/meson.build | 6 -
lib/power/power_common.h | 11 +
lib/power/rte_power.c | 305 ++++++++----------
lib/power/rte_power.h | 207 ++++++++++--
lib/power/version.map | 12 +
25 files changed, 506 insertions(+), 203 deletions(-)
create mode 100644 drivers/power/core/acpi/meson.build
rename {lib/power => drivers/power/core/acpi}/power_acpi_cpufreq.c (95%)
rename {lib/power => drivers/power/core/acpi}/power_acpi_cpufreq.h (100%)
create mode 100644 drivers/power/core/amd-pstate/meson.build
rename {lib/power => drivers/power/core/amd-pstate}/power_amd_pstate_cpufreq.c (95%)
rename {lib/power => drivers/power/core/amd-pstate}/power_amd_pstate_cpufreq.h (100%)
create mode 100644 drivers/power/core/cppc/meson.build
rename {lib/power => drivers/power/core/cppc}/power_cppc_cpufreq.c (96%)
rename {lib/power => drivers/power/core/cppc}/power_cppc_cpufreq.h (100%)
rename {lib/power => drivers/power/core/kvm-vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/core/kvm-vm}/guest_channel.h (100%)
create mode 100644 drivers/power/core/kvm-vm/meson.build
rename {lib/power => drivers/power/core/kvm-vm}/power_kvm_vm.c (83%)
rename {lib/power => drivers/power/core/kvm-vm}/power_kvm_vm.h (100%)
create mode 100644 drivers/power/core/meson.build
create mode 100644 drivers/power/core/pstate/meson.build
rename {lib/power => drivers/power/core/pstate}/power_pstate_cpufreq.c (96%)
rename {lib/power => drivers/power/core/pstate}/power_pstate_cpufreq.h (100%)
create mode 100644 drivers/power/meson.build
diff --git a/drivers/meson.build b/drivers/meson.build
index f2be71bc05..e293c3945f 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -28,6 +28,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/drivers/power/core/acpi/meson.build b/drivers/power/core/acpi/meson.build
new file mode 100644
index 0000000000..d10ec8ee94
--- /dev/null
+++ b/drivers/power/core/acpi/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+sources = files('power_acpi_cpufreq.c')
+
+headers = files('power_acpi_cpufreq.h')
+
+deps += ['power']
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/core/acpi/power_acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/core/acpi/power_acpi_cpufreq.c
index f8d978d03d..69d80ad2ae 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/core/acpi/power_acpi_cpufreq.c
@@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_ops acpi_ops = {
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/core/acpi/power_acpi_cpufreq.h
similarity index 100%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/core/acpi/power_acpi_cpufreq.h
diff --git a/drivers/power/core/amd-pstate/meson.build b/drivers/power/core/amd-pstate/meson.build
new file mode 100644
index 0000000000..8ec4c960f5
--- /dev/null
+++ b/drivers/power/core/amd-pstate/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+sources = files('power_amd_pstate_cpufreq.c')
+
+headers = files('power_amd_pstate_cpufreq.h')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
index 028f84416b..9938de72a6 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
@@ -700,3 +700,22 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_ops amd_pstate_ops = {
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
similarity index 100%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
diff --git a/drivers/power/core/cppc/meson.build b/drivers/power/core/cppc/meson.build
new file mode 100644
index 0000000000..06f3b99bb8
--- /dev/null
+++ b/drivers/power/core/cppc/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+sources = files('power_cppc_cpufreq.c')
+
+headers = files('power_cppc_cpufreq.h')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/core/cppc/power_cppc_cpufreq.c
similarity index 96%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/core/cppc/power_cppc_cpufreq.c
index 3ddf39bd76..605f633309 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/core/cppc/power_cppc_cpufreq.c
@@ -685,3 +685,22 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_ops cppc_ops = {
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/core/cppc/power_cppc_cpufreq.h
similarity index 100%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/core/cppc/power_cppc_cpufreq.h
diff --git a/lib/power/guest_channel.c b/drivers/power/core/kvm-vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/core/kvm-vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/core/kvm-vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/core/kvm-vm/guest_channel.h
diff --git a/drivers/power/core/kvm-vm/meson.build b/drivers/power/core/kvm-vm/meson.build
new file mode 100644
index 0000000000..3150c6674b
--- /dev/null
+++ b/drivers/power/core/kvm-vm/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2024 AMD Limited.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'power_kvm_vm.c',
+)
+
+headers = files(
+ 'guest_channel.h',
+ 'power_kvm_vm.h',
+)
+deps += ['power']
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/core/kvm-vm/power_kvm_vm.c
similarity index 83%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/core/kvm-vm/power_kvm_vm.c
index f15be8fac5..a5d6984d26 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/core/kvm-vm/power_kvm_vm.c
@@ -137,3 +137,22 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_ops kvm_vm_ops = {
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/core/kvm-vm/power_kvm_vm.h
similarity index 100%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/core/kvm-vm/power_kvm_vm.h
diff --git a/drivers/power/core/meson.build b/drivers/power/core/meson.build
new file mode 100644
index 0000000000..4081dafaa0
--- /dev/null
+++ b/drivers/power/core/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+drivers = [
+ 'acpi',
+ 'amd-pstate',
+ 'cppc',
+ 'kvm-vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/core/pstate/meson.build b/drivers/power/core/pstate/meson.build
new file mode 100644
index 0000000000..1025c64e48
--- /dev/null
+++ b/drivers/power/core/pstate/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+sources = files('power_pstate_cpufreq.c')
+
+headers = files('power_pstate_cpufreq.h')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/core/pstate/power_pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/core/pstate/power_pstate_cpufreq.c
index 73138dc4e4..d4c3645ff8 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/core/pstate/power_pstate_cpufreq.c
@@ -888,3 +888,22 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_ops pstate_ops = {
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/core/pstate/power_pstate_cpufreq.h
similarity index 100%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/core/pstate/power_pstate_cpufreq.h
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..7d9034c7ac
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+drivers = [
+ 'core',
+]
+
+std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..207d96d877 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,14 +12,8 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 30966400ba..c90b611f4f 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -23,13 +23,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..70176807f4 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,64 +8,80 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
enum power_management_env global_default_env = PM_ENV_NOT_SET;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static struct rte_power_ops rte_power_ops[PM_ENV_MAX];
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+/* register the ops struct in rte_power_ops, return 0 on success. */
+int
+rte_power_register_ops(const struct rte_power_ops *op)
+{
+ struct rte_power_ops *ops;
+
+ if (op->env >= PM_ENV_MAX) {
+ POWER_LOG(ERR, "Unsupported power management environment\n");
+ return -EINVAL;
+ }
+
+ if (op->status != 0) {
+ POWER_LOG(ERR, "Power management env[%d] ops registered already\n",
+ op->env);
+ return -EINVAL;
+ }
+
+ if (!op->init || !op->exit || !op->check_env_support ||
+ !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
+ !op->freq_up || !op->freq_down || !op->freq_max ||
+ !op->freq_min || !op->turbo_status || !op->enable_turbo ||
+ !op->disable_turbo || !op->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
+ return -EINVAL;
+ }
+
+ ops = &rte_power_ops[op->env];
+ ops->env = op->env;
+ ops->init = op->init;
+ ops->exit = op->exit;
+ ops->check_env_support = op->check_env_support;
+ ops->get_avail_freqs = op->get_avail_freqs;
+ ops->get_freq = op->get_freq;
+ ops->set_freq = op->set_freq;
+ ops->freq_up = op->freq_up;
+ ops->freq_down = op->freq_down;
+ ops->freq_max = op->freq_max;
+ ops->freq_min = op->freq_min;
+ ops->turbo_status = op->turbo_status;
+ ops->enable_turbo = op->enable_turbo;
+ ops->disable_turbo = op->disable_turbo;
+ ops->status = 1; /* registered */
+
+ return 0;
+}
+
+struct rte_power_ops *
+rte_power_get_ops(int ops_index)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ RTE_VERIFY((ops_index >= PM_ENV_NOT_SET) && (ops_index < PM_ENV_MAX));
+ RTE_VERIFY(rte_power_ops[ops_index].status != 0);
+
+ return &rte_power_ops[ops_index];
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
+ struct rte_power_ops *ops;
+
+ if ((env > PM_ENV_NOT_SET) && (env < PM_ENV_MAX)) {
+ ops = rte_power_get_ops(env);
+ return ops->check_env_support();
}
+
+ rte_errno = EINVAL;
+ return -1;
}
int
@@ -80,80 +96,26 @@ rte_power_set_env(enum power_management_env env)
}
int ret = 0;
+ struct rte_power_ops *ops;
+
+ if ((env == PM_ENV_NOT_SET) || (env >= PM_ENV_MAX)) {
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d)"
+ " set\n", env);
+ ret = -1;
+ }
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
+ ops = rte_power_get_ops(env);
+ if (ops->status == 0) {
+ POWER_LOG(ERR,
+ "Power Management Environment(%d) not"
+ " registered\n", env);
ret = -1;
}
if (ret == 0)
global_default_env = env;
- else {
+ else
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -164,7 +126,6 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -177,59 +138,76 @@ int
rte_power_init(unsigned int lcore_id)
{
int ret = -1;
+ struct rte_power_ops *ops;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
+ if (global_default_env != PM_ENV_NOT_SET) {
+ ops = &rte_power_ops[global_default_env];
+ if (!ops->status) {
+ POWER_LOG(ERR, "Power management env[%d] not"
+ " supported\n", global_default_env);
+ goto out;
+ }
+ return ops->init(lcore_id);
}
+ POWER_LOG(INFO, "Env isn't set yet!\n");
/* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
+ POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
+ " power management...\n");
+ ops = &rte_power_ops[PM_ENV_ACPI_CPUFREQ];
+ if (ops->status) {
+ ret = ops->init(lcore_id);
+ if (ret == 0) {
+ rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+ goto out;
+ }
}
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
+ POWER_LOG(INFO, "Attempting to initialise PSTATE"
+ " power management...\n");
+ ops = &rte_power_ops[PM_ENV_PSTATE_CPUFREQ];
+ if (ops->status) {
+ ret = ops->init(lcore_id);
+ if (ret == 0) {
+ rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
+ goto out;
+ }
}
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ POWER_LOG(INFO, "Attempting to initialise AMD PSTATE"
+ " power management...\n");
+ ops = &rte_power_ops[PM_ENV_AMD_PSTATE_CPUFREQ];
+ if (ops->status) {
+ ret = ops->init(lcore_id);
+ if (ret == 0) {
+ rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
+ goto out;
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
+ POWER_LOG(INFO, "Attempting to initialise CPPC power"
+ " management...\n");
+ ops = &rte_power_ops[PM_ENV_CPPC_CPUFREQ];
+ if (ops->status) {
+ ret = ops->init(lcore_id);
+ if (ret == 0) {
+ rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
+ goto out;
+ }
}
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
+ POWER_LOG(INFO, "Attempting to initialise VM power"
+ " management...\n");
+ ops = &rte_power_ops[PM_ENV_KVM_VM];
+ if (ops->status) {
+ ret = ops->init(lcore_id);
+ if (ret == 0) {
+ rte_power_set_env(PM_ENV_KVM_VM);
+ goto out;
+ }
}
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
+ POWER_LOG(ERR, "Unable to set Power Management Environment"
+ " for lcore %u\n", lcore_id);
out:
return ret;
}
@@ -237,21 +215,14 @@ rte_power_init(unsigned int lcore_id)
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ struct rte_power_ops *ops;
+ if (global_default_env != PM_ENV_NOT_SET) {
+ ops = &rte_power_ops[global_default_env];
+ return ops->exit(lcore_id);
}
- return -1;
+ POWER_LOG(ERR, "Environment has not been set, unable "
+ "to exit gracefully\n");
+ return -1;
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..749bb823ab 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 AMD Limited
*/
#ifndef _RTE_POWER_H
@@ -21,7 +22,7 @@ extern "C" {
/* Power Management Environment State */
enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+ PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +67,97 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+typedef int (*rte_power_check_env_support_t)(void);
+
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_ops {
+ uint8_t status; /**< ops register status. */
+ enum power_management_env env; /**< power mgmt env. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support; /**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+} __rte_cache_aligned;
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(const struct rte_power_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+ RTE_INIT(power_hdlr_init_##ops) \
+ { \
+ rte_power_register_ops(&ops); \
+ }
+
+/**
+ * @internal Get the power ops struct from its index.
+ *
+ * @param ops_index
+ * The index of the ops struct in the ops struct table.
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_ops *
+rte_power_get_ops(int ops_index);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_ops *ops;
-extern rte_power_freqs_t rte_power_freqs;
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +220,14 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
-extern rte_power_get_freq_t rte_power_get_freq;
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,9 +245,14 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_ops *ops;
-extern rte_power_set_freq_t rte_power_set_freq;
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->set_freq(lcore_id, index);
+}
/**
* Function pointer definition for generic frequency change functions. Review
@@ -167,59 +273,95 @@ typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
+
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_ops *ops;
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +377,15 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_ops *ops;
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ ops = rte_power_get_ops(rte_power_get_env());
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/version.map b/lib/power/version.map
index ad92a65f91..2f89645ec2 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -52,3 +52,15 @@ EXPERIMENTAL {
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
+};
--
2.25.1
* [RFC PATCH 2/2] power: refactor uncore power management library
2024-02-20 15:33 [RFC PATCH 0/2] power: refactor power management library Sivaprasad Tummala
2024-02-20 15:33 ` Sivaprasad Tummala
2024-02-20 15:33 ` [RFC PATCH 1/2] power: refactor core " Sivaprasad Tummala
@ 2024-02-20 15:33 ` Sivaprasad Tummala
2024-03-01 3:33 ` lihuisong (C)
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
3 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-02-20 15:33 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
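Note for reviewers: the registration scheme this patch introduces can be
pictured with the standalone sketch below. It is only an illustration under
stated assumptions, not the code in this series: every demo_* and toy_* name
is hypothetical, the ops struct is cut down to two callbacks, and the
duplicate check looks at the table slot rather than at the ops struct being
passed in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uncore env enum; not the DPDK definitions. */
enum demo_env {
	DEMO_ENV_NOT_SET = 0,
	DEMO_ENV_INTEL,
	DEMO_ENV_AMD,
	DEMO_ENV_MAX
};

/* Trimmed-down ops struct: only two callbacks, to keep the sketch short. */
struct demo_ops {
	uint8_t status;	/* 1 once the slot is registered */
	enum demo_env env;
	int (*init)(unsigned int pkg, unsigned int die);
	uint32_t (*get_freq)(unsigned int pkg, unsigned int die);
};

/* One slot per environment, zero-initialised (status == 0). */
static struct demo_ops demo_table[DEMO_ENV_MAX];

static int
demo_env_valid(enum demo_env env)
{
	return env > DEMO_ENV_NOT_SET && env < DEMO_ENV_MAX;
}

/* Copy a driver's ops into its slot; reject bad env, missing callbacks, duplicates. */
static int
demo_register_ops(const struct demo_ops *op)
{
	if (op == NULL || !demo_env_valid(op->env))
		return -EINVAL;
	if (op->init == NULL || op->get_freq == NULL)
		return -EINVAL;
	if (demo_table[op->env].status != 0)
		return -EEXIST;

	demo_table[op->env] = *op;
	demo_table[op->env].status = 1;
	return 0;
}

/* Return the registered ops for env, or NULL if nothing is registered. */
static const struct demo_ops *
demo_get_ops(enum demo_env env)
{
	if (!demo_env_valid(env) || demo_table[env].status == 0)
		return NULL;
	return &demo_table[env];
}

/* Toy driver callbacks, standing in for a real uncore driver. */
static int
toy_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static uint32_t
toy_get_freq(unsigned int pkg, unsigned int die)
{
	return pkg * 10 + die;	/* fake frequency index, for the demo only */
}

static const struct demo_ops toy_ops = {
	.env = DEMO_ENV_INTEL,
	.init = toy_init,
	.get_freq = toy_get_freq,
};

/* Usage: register once, then dispatch through the table. */
static int
demo_selftest(void)
{
	const struct demo_ops *ops;

	assert(demo_get_ops(DEMO_ENV_INTEL) == NULL);
	assert(demo_register_ops(&toy_ops) == 0);
	assert(demo_register_ops(&toy_ops) == -EEXIST);

	ops = demo_get_ops(DEMO_ENV_INTEL);
	assert(ops != NULL && ops->init(0, 0) == 0);
	return ops->get_freq(2, 3) == 23 ? 0 : -1;
}
```

In the sketch the getter returns NULL for an unregistered environment instead
of asserting, so a caller can fall back or report an error. Whether the
library should assert or return an error here is a design question left open
for this RFC.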
drivers/power/meson.build | 1 +
drivers/power/uncore/intel/meson.build | 9 +
.../power/uncore/intel}/power_intel_uncore.c | 15 ++
.../power/uncore/intel}/power_intel_uncore.h | 0
drivers/power/uncore/meson.build | 8 +
lib/power/meson.build | 1 -
lib/power/rte_power_uncore.c | 163 +++++++-----------
lib/power/rte_power_uncore.h | 150 ++++++++++++++--
lib/power/version.map | 1 +
9 files changed, 236 insertions(+), 112 deletions(-)
create mode 100644 drivers/power/uncore/intel/meson.build
rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.c (95%)
rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.h (100%)
create mode 100644 drivers/power/uncore/meson.build
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 7d9034c7ac..0803e99027 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -3,6 +3,7 @@
drivers = [
'core',
+ 'uncore',
]
std_deps = ['power']
diff --git a/drivers/power/uncore/intel/meson.build b/drivers/power/uncore/intel/meson.build
new file mode 100644
index 0000000000..187ab15aec
--- /dev/null
+++ b/drivers/power/uncore/intel/meson.build
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 AMD Limited
+
+sources = files('power_intel_uncore.c')
+
+headers = files('power_intel_uncore.h')
+
+deps += ['power']
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/uncore/intel/power_intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/uncore/intel/power_intel_uncore.c
index 3ce8fccec2..3af4cc3bc7 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/uncore/intel/power_intel_uncore.c
@@ -476,3 +476,18 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/uncore/intel/power_intel_uncore.h
similarity index 100%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/uncore/intel/power_intel_uncore.h
diff --git a/drivers/power/uncore/meson.build b/drivers/power/uncore/meson.build
new file mode 100644
index 0000000000..005c0dc622
--- /dev/null
+++ b/drivers/power/uncore/meson.build
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 AMD Limited
+
+drivers = [
+ 'intel',
+]
+
+std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 207d96d877..459e9b6e9b 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..8feb41736b 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -15,88 +15,68 @@
enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static struct rte_power_uncore_ops rte_power_uncore_ops[PM_ENV_MAX];
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(const struct rte_power_uncore_ops *op)
{
- return 0;
-}
+ struct rte_power_uncore_ops *ops;
+
+ if ((op->env != RTE_UNCORE_PM_ENV_INTEL_UNCORE) &&
+ (op->env != RTE_UNCORE_PM_ENV_AMD_HSMP)) {
+ POWER_LOG(ERR,
+ "Unsupported uncore power management environment\n");
+ return -EINVAL;
+ }
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
+ if (rte_power_uncore_ops[op->env].status != 0) {
+ POWER_LOG(ERR,
+ "uncore Power management env[%d] ops registered already\n",
+ op->env);
+ return -EINVAL;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (!op->init || !op->exit || !op->get_num_pkgs || !op->get_num_dies ||
+ !op->get_num_freqs || !op->get_avail_freqs || !op->get_freq ||
+ !op->set_freq || !op->freq_max || !op->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
+ return -EINVAL;
+ }
+ ops = &rte_power_uncore_ops[op->env];
+ ops->env = op->env;
+ ops->init = op->init;
+ ops->exit = op->exit;
+ ops->get_num_pkgs = op->get_num_pkgs;
+ ops->get_num_dies = op->get_num_dies;
+ ops->get_num_freqs = op->get_num_freqs;
+ ops->get_avail_freqs = op->get_avail_freqs;
+ ops->get_freq = op->get_freq;
+ ops->set_freq = op->set_freq;
+ ops->freq_max = op->freq_max;
+ ops->freq_min = op->freq_min;
+ ops->status = 1; /* registered */
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
return 0;
}
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(int ops_index)
{
- return 0;
-}
+ RTE_VERIFY((ops_index == RTE_UNCORE_PM_ENV_INTEL_UNCORE) ||
+ (ops_index == RTE_UNCORE_PM_ENV_AMD_HSMP));
+ RTE_VERIFY(rte_power_uncore_ops[ops_index].status != 0);
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
+ return &rte_power_uncore_ops[ops_index];
}
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = 0;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
@@ -113,24 +93,15 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
+ }
+
+ ops = rte_power_get_uncore_ops(env);
+ if (ops->status == 0) {
POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
ret = -1;
- goto out;
- }
+ } else
+ default_uncore_env = env;
- default_uncore_env = env;
-out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -140,7 +111,6 @@ rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -154,18 +124,18 @@ int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
+ struct rte_power_uncore_ops *ops;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
+ if ((default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (default_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT)) {
+ ops = rte_power_get_uncore_ops(default_uncore_env);
+ return ops->init(pkg, die);
}
/* Auto detect Environment */
POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
+ ops = rte_power_get_uncore_ops(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
+ ret = ops->init(pkg, die);
if (ret == 0) {
rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
goto out;
@@ -183,12 +153,13 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
+ struct rte_power_uncore_ops *ops;
+
+ if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+ return -1;
}
- return -1;
+ ops = rte_power_get_uncore_ops(default_uncore_env);
+ return ops->exit(pkg, die);
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..fe14a1bbe5 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -58,6 +58,81 @@ void rte_power_unset_uncore_env(void);
__rte_experimental
enum rte_uncore_power_mgmt_env rte_power_get_uncore_env(void);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+
+/** Structure defining uncore power operations. */
+struct rte_power_uncore_ops {
+ uint8_t status; /**< ops register status. */
+ enum rte_uncore_power_mgmt_env env; /**< power mgmt env. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs; /**< Get the number of packages. */
+ rte_power_uncore_get_num_dies_t get_num_dies; /**< Get the number of dies in a package. */
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+} __rte_cache_aligned;
+
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(const struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+ RTE_INIT(power_hdlr_init_uncore_##ops) \
+ { \
+ rte_power_register_uncore_ops(&ops); \
+ }
+
+/**
+ * @internal Get the power uncore ops struct from its index.
+ *
+ * @param ops_index
+ * The index of the ops struct in the ops struct table.
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(int ops_index);
+
/**
* Initialize uncore frequency management for specific die on a package.
* It will get the available frequencies and prepare to set new die frequencies.
@@ -116,9 +191,14 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops;
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,9 +221,15 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops;
+
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->set_freq(pkg, die, index);
+}
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
/**
* Function pointer definition for generic frequency change functions.
@@ -169,7 +255,14 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops;
+
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +271,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops;
+
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +300,15 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline uint32_t
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops;
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +326,14 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops;
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +345,14 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops;
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for packages (CPUs) specified
@@ -253,9 +368,14 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops;
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ ops = rte_power_get_uncore_ops(rte_power_get_uncore_env());
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/version.map b/lib/power/version.map
index 2f89645ec2..d8a6f9436c 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -57,6 +57,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.25.1
* Re: [RFC PATCH 1/2] power: refactor core power management library
2024-02-20 15:33 ` [RFC PATCH 1/2] power: refactor core " Sivaprasad Tummala
@ 2024-02-27 16:18 ` Ferruh Yigit
2024-02-29 7:10 ` Tummala, Sivaprasad
2024-02-28 12:51 ` Ferruh Yigit
2024-03-01 2:56 ` lihuisong (C)
2 siblings, 1 reply; 139+ messages in thread
From: Ferruh Yigit @ 2024-02-27 16:18 UTC (permalink / raw)
To: Sivaprasad Tummala, david.hunt, anatoly.burakov, jerinj,
radu.nicolau, gakhil, cristian.dumitrescu, konstantin.ananyev
Cc: dev
On 2/20/2024 3:33 PM, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>
+1 to refactor, thanks for the work.
There are multiple power implementations, but all of them are managed within
the power library; it is a good idea to extract the different implementations
as drivers.
<...>
> diff --git a/drivers/power/core/acpi/meson.build b/drivers/power/core/acpi/meson.build
> new file mode 100644
> index 0000000000..d10ec8ee94
> --- /dev/null
> +++ b/drivers/power/core/acpi/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
>
It should be as follows, same for all:
Copyright (C) 2024, Advanced Micro Devices, Inc.
> +
> +sources = files('power_acpi_cpufreq.c')
> +
> +headers = files('power_acpi_cpufreq.h')
>
In meson, the 'headers' variable is used to install headers. That is
required when users need to include the header, but I guess that is not the
case for power libraries.
Can you please check whether the 'headers' variable is required here?
<...>
> @@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
> +static struct rte_power_ops acpi_ops = {
> + .init = power_acpi_cpufreq_init,
> + .exit = power_acpi_cpufreq_exit,
> + .check_env_support = power_acpi_cpufreq_check_supported,
> + .get_avail_freqs = power_acpi_cpufreq_freqs,
> + .get_freq = power_acpi_cpufreq_get_freq,
> + .set_freq = power_acpi_cpufreq_set_freq,
> + .freq_down = power_acpi_cpufreq_freq_down,
> + .freq_up = power_acpi_cpufreq_freq_up,
> + .freq_max = power_acpi_cpufreq_freq_max,
> + .freq_min = power_acpi_cpufreq_freq_min,
> + .turbo_status = power_acpi_turbo_status,
> + .enable_turbo = power_acpi_enable_turbo,
> + .disable_turbo = power_acpi_disable_turbo,
> + .get_caps = power_acpi_get_capabilities
> +};
> +
With current usage of the ops struct, I guess all can be "static const".
<...>
> diff --git a/drivers/power/core/kvm-vm/meson.build b/drivers/power/core/kvm-vm/meson.build
> new file mode 100644
> index 0000000000..3150c6674b
> --- /dev/null
> +++ b/drivers/power/core/kvm-vm/meson.build
> @@ -0,0 +1,20 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2024 AMD Limited.
> +#
> +
> +if not is_linux
> + build = false
> + reason = 'only supported on Linux'
> + subdir_done()
> +endif
>
Before the refactoring, lib/power was supported only on Linux; I assume this
means all existing power implementations are supported only on Linux.
If so, the above check can be added to all drivers.
<...>
> +/* register the ops struct in rte_power_ops, return 0 on success. */
> +int
> +rte_power_register_ops(const struct rte_power_ops *op)
> +{
> + struct rte_power_ops *ops;
> +
> + if (op->env >= PM_ENV_MAX) {
> + POWER_LOG(ERR, "Unsupported power management environment\n");
> + return -EINVAL;
> + }
> +
> + if (op->status != 0) {
> + POWER_LOG(ERR, "Power management env[%d] ops registered already\n",
> + op->env);
> + return -EINVAL;
> + }
> +
> + if (!op->init || !op->exit || !op->check_env_support ||
> + !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
> + !op->freq_up || !op->freq_down || !op->freq_max ||
> + !op->freq_min || !op->turbo_status || !op->enable_turbo ||
> + !op->disable_turbo || !op->get_caps) {
> + POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
> + return -EINVAL;
> + }
> +
> + ops = &rte_power_ops[op->env];
>
I don't see all drivers setting 'op->env'.
This 'rte_power_register_ops()' function copies the ops from the
driver-provided struct into the library's global 'rte_power_ops[]' array;
it would be possible to store the ops pointer provided by the driver instead
of copying it.
It would also be possible to link the ops together in this function instead
of putting them at a specific index. As only one ops can be active at a given
time, the active ops pointer can be stored in a global variable, which
removes the need for an index-accessible array of ops.
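A minimal sketch of the pointer-based registration described above, for illustration only. All names here (enum pm_env, struct power_ops, struct power_ops_node, power_register_ops, power_find_ops) are hypothetical stand-ins and not the patch's API; the list is singly linked and the driver's const ops table is referenced, not copied:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the types used in the patch. */
enum pm_env {
	ENV_NOT_SET,
	ENV_ACPI,
	ENV_KVM_VM,
	ENV_MAX
};

struct power_ops {
	enum pm_env env;                      /* environment served by the driver */
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

/* List node owned by the driver; it only references the const ops table. */
struct power_ops_node {
	const struct power_ops *ops;
	struct power_ops_node *next;
};

static struct power_ops_node *ops_list;   /* head of the registered drivers */

/* Look up the ops registered for an environment, NULL if none. */
static const struct power_ops *
power_find_ops(enum pm_env env)
{
	const struct power_ops_node *it;

	for (it = ops_list; it != NULL; it = it->next)
		if (it->ops->env == env)
			return it->ops;
	return NULL;
}

/* Link a driver into the list: no copy and no fixed-size array. */
static int
power_register_ops(struct power_ops_node *node)
{
	assert(node != NULL && node->ops != NULL); /* driver programming error */

	if (node->ops->env <= ENV_NOT_SET || node->ops->env >= ENV_MAX)
		return -EINVAL;
	if (node->ops->init == NULL || node->ops->exit == NULL)
		return -EINVAL;
	if (power_find_ops(node->ops->env) != NULL)
		return -EEXIST;

	node->next = ops_list;
	ops_list = node;
	return 0;
}

/* Example driver with a const ops table. */
static int acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int acpi_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static const struct power_ops acpi_ops = {
	.env = ENV_ACPI,
	.init = acpi_init,
	.exit = acpi_exit,
};

static struct power_ops_node acpi_node = { .ops = &acpi_ops };
```

With this shape the ops tables can stay "static const" in the drivers, since the library never writes to them; the mutable link lives in the separate node.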
<...>
> @@ -177,59 +138,76 @@ int
> rte_power_init(unsigned int lcore_id)
> {
> int ret = -1;
> + struct rte_power_ops *ops;
>
> - switch (global_default_env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_init(lcore_id);
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_init(lcore_id);
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_init(lcore_id);
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_init(lcore_id);
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_init(lcore_id);
> - default:
> - POWER_LOG(INFO, "Env isn't set yet!");
> + if (global_default_env != PM_ENV_NOT_SET) {
> + ops = &rte_power_ops[global_default_env];
> + if (!ops->status) {
> + POWER_LOG(ERR, "Power management env[%d] not"
> + " supported\n", global_default_env);
> + goto out;
> + }
> + return ops->init(lcore_id);
> }
> + POWER_LOG(INFO, POWER, "Env isn't set yet!\n");
>
> /* Auto detect Environment */
> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
> - ret = power_acpi_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
> + " power management...\n");
>
The log message shouldn't be broken; the line can be broken while keeping the message whole:
POWER_LOG(INFO,
"Attempting to initialise ACPI cpufreq power management...");
<...>
> @@ -21,7 +22,7 @@ extern "C" {
> /* Power Management Environment State */
> enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> - PM_ENV_AMD_PSTATE_CPUFREQ};
> + PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
>
Syntax: can we have one enum item per line?
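For illustration, the same enum laid out one item per line. The enumerator names and order are taken from the quoted patch; the env_registered[] table and mark_registered() helper are hypothetical and only show how the PM_ENV_MAX sentinel can bound a per-environment table:

```c
#include <assert.h>

/* Power Management Environment State, one enumerator per line. */
enum power_management_env {
	PM_ENV_NOT_SET,
	PM_ENV_ACPI_CPUFREQ,
	PM_ENV_KVM_VM,
	PM_ENV_PSTATE_CPUFREQ,
	PM_ENV_CPPC_CPUFREQ,
	PM_ENV_AMD_PSTATE_CPUFREQ,
	PM_ENV_MAX /* sentinel: count of values, not a real environment */
};

/* Hypothetical per-environment table sized by the sentinel. */
static int env_registered[PM_ENV_MAX];

/* Hypothetical helper: flag an environment as registered. */
static void
mark_registered(enum power_management_env env)
{
	assert(env > PM_ENV_NOT_SET && env < PM_ENV_MAX);
	env_registered[env] = 1;
}
```

Adding a new environment then touches a single line in the diff, and the sentinel stays last.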
> /**
> * Check if a specific power management environment type is supported on a
> @@ -66,6 +67,97 @@ void rte_power_unset_env(void);
> */
> enum power_management_env rte_power_get_env(void);
>
> +typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
> +typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
> +typedef int (*rte_power_check_env_support_t)(void);
> +
> +typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> + uint32_t num);
> +typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> +typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> +typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> +
>
I guess the above is not required for users. What do you think about creating
a driver header file and moving these there?
<...>
> +
> +/**
> + * Macro to statically register the ops of a cpufreq driver.
> + */
> +#define RTE_POWER_REGISTER_OPS(ops) \
> + (RTE_INIT(power_hdlr_init_##ops) \
> + { \
> + rte_power_register_ops(&ops); \
> + })
>
Is the () around RTE_INIT() required?
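A small sketch of why the outer parentheses look unnecessary, assuming the macro expands to a constructor function definition: a function definition is not an expression, so it cannot be wrapped in "( ... )". MY_INIT, MY_REGISTER_OPS, struct my_ops and my_register_ops below are hypothetical stand-ins (GCC/Clang constructor attribute), not DPDK's actual macros:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RTE_INIT(): declares a function that the
 * toolchain runs before main() (GCC/Clang attribute). */
#define MY_INIT(func) \
	static void __attribute__((constructor, used)) func(void)

/* Hypothetical, reduced ops struct. */
struct my_ops {
	int registered;
};

static int
my_register_ops(struct my_ops *ops)
{
	assert(ops != NULL);
	ops->registered = 1;
	return 0;
}

/* Registration macro without outer parentheses: the expansion is a plain
 * function definition at file scope. */
#define MY_REGISTER_OPS(ops) \
	MY_INIT(power_hdlr_init_##ops) \
	{ \
		my_register_ops(&ops); \
	}

static struct my_ops acpi_ops;

/* No trailing semicolon is needed after a function definition. */
MY_REGISTER_OPS(acpi_ops)
```

If checkpatch complains about the unparenthesized macro body, that is a style warning about macros in general and may simply not apply to a macro that defines a function.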
> +
> +/**
> + * @internal Get the power ops struct from its index.
> + *
> + * @param ops_index
> + * The index of the ops struct in the ops struct table.
> + * @return
> + * The pointer to the ops struct in the table if registered.
> + */
> +struct rte_power_ops *
> +rte_power_get_ops(int ops_index);
> +
> /**
> * Initialize power management for a specific lcore. If rte_power_set_env() has
> * not been called then an auto-detect of the environment will start and
> @@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
> * @return
> * The number of available frequencies.
> */
> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> - uint32_t num);
> +static inline uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> + struct rte_power_ops *ops;
>
> -extern rte_power_freqs_t rte_power_freqs;
> + ops = rte_power_get_ops(rte_power_get_env());
> + return ops->get_avail_freqs(lcore_id, freqs, n);
> +}
>
Why "static inline" functions and not proper functions?
>
> /**
> * Return the current index of available frequencies of a specific lcore.
> @@ -124,9 +220,14 @@ extern rte_power_freqs_t rte_power_freqs;
> * @return
> * The current index of available frequencies.
> */
> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> +static inline uint32_t
> +rte_power_get_freq(unsigned int lcore_id)
> +{
> + struct rte_power_ops *ops;
>
> -extern rte_power_get_freq_t rte_power_get_freq;
> + ops = rte_power_get_ops(rte_power_get_env());
>
As 'rte_power_get_env()' already returns a global variable, why not set
a global ops pointer and access the ops directly? Is the above abstraction
providing any benefit?
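A minimal sketch of the global-ops-pointer alternative, under the assumption that the pointer is resolved once when the environment is selected. The names (struct power_ops, global_ops, power_set_active_ops, power_get_freq, dummy_get_freq) are hypothetical and the ops struct is reduced to one callback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced ops table; the real struct has more callbacks. */
struct power_ops {
	uint32_t (*get_freq)(unsigned int lcore_id);
};

/* Set once when the environment is chosen, read by every wrapper. */
static const struct power_ops *global_ops;

static void
power_set_active_ops(const struct power_ops *ops)
{
	global_ops = ops;
}

/* Wrapper: one pointer load plus the indirect call, with no per-call
 * lookup by environment index. */
static inline uint32_t
power_get_freq(unsigned int lcore_id)
{
	assert(global_ops != NULL); /* the environment must be set first */
	return global_ops->get_freq(lcore_id);
}

/* Example driver callback returning a fixed frequency index. */
static uint32_t
dummy_get_freq(unsigned int lcore_id)
{
	(void)lcore_id;
	return 3;
}

static const struct power_ops dummy_ops = {
	.get_freq = dummy_get_freq,
};
```

The "is it registered" check then happens once at selection time and the wrappers no longer repeat it.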
* Re: [RFC PATCH 1/2] power: refactor core power management library
2024-02-20 15:33 ` [RFC PATCH 1/2] power: refactor core " Sivaprasad Tummala
2024-02-27 16:18 ` Ferruh Yigit
@ 2024-02-28 12:51 ` Ferruh Yigit
2024-03-01 2:56 ` lihuisong (C)
2 siblings, 0 replies; 139+ messages in thread
From: Ferruh Yigit @ 2024-02-28 12:51 UTC (permalink / raw)
To: Sivaprasad Tummala, david.hunt, anatoly.burakov, jerinj,
radu.nicolau, gakhil, cristian.dumitrescu, konstantin.ananyev
Cc: dev
On 2/20/2024 3:33 PM, Sivaprasad Tummala wrote:
> + ops = rte_power_get_ops(env);
> + if (ops->status == 0) {
> + POWER_LOG(ERR, WER,
> + "Power Management Environment(%d) not"
> + " registered\n", env);
'WER' seems to be a typo, causing a build error.
The CI also reports a few other build errors, FYI.
* RE: [RFC PATCH 1/2] power: refactor core power management library
2024-02-27 16:18 ` Ferruh Yigit
@ 2024-02-29 7:10 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-02-29 7:10 UTC (permalink / raw)
To: Yigit, Ferruh, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
gakhil, cristian.dumitrescu, konstantin.ananyev
Cc: dev
Hi Ferruh,
> -----Original Message-----
> From: Yigit, Ferruh <Ferruh.Yigit@amd.com>
> Sent: Tuesday, February 27, 2024 9:48 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com;
> konstantin.ananyev@huawei.com
> Cc: dev@dpdk.org
> Subject: Re: [RFC PATCH 1/2] power: refactor core power management library
>
> On 2/20/2024 3:33 PM, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> >
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> >
>
> +1 to refactor, thanks for the work.
>
> There are multiple power implementations, but all of them are managed within the
> power library; it is a good idea to extract the different implementations as drivers.
>
> <...>
>
> > diff --git a/drivers/power/core/acpi/meson.build b/drivers/power/core/acpi/meson.build
> > new file mode 100644
> > index 0000000000..d10ec8ee94
> > --- /dev/null
> > +++ b/drivers/power/core/acpi/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> >
>
> It should be as follows, same for all:
>
> Copyright (C) 2024, Advanced Micro Devices, Inc.
>
ACK
> > +
> > +sources = files('power_acpi_cpufreq.c')
> > +
> > +headers = files('power_acpi_cpufreq.h')
> >
>
> In meson, 'headers' variable is used to install the header, this is required for the
> case user needs to include the header but I guess that is not the case for power
> libraries.
> Can you please check if the 'header' variable in meson is required?
>
> <...>
>
> > @@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
> >
> > return 0;
> > }
> > +
> > +static struct rte_power_ops acpi_ops = {
> > + .init = power_acpi_cpufreq_init,
> > + .exit = power_acpi_cpufreq_exit,
> > + .check_env_support = power_acpi_cpufreq_check_supported,
> > + .get_avail_freqs = power_acpi_cpufreq_freqs,
> > + .get_freq = power_acpi_cpufreq_get_freq,
> > + .set_freq = power_acpi_cpufreq_set_freq,
> > + .freq_down = power_acpi_cpufreq_freq_down,
> > + .freq_up = power_acpi_cpufreq_freq_up,
> > + .freq_max = power_acpi_cpufreq_freq_max,
> > + .freq_min = power_acpi_cpufreq_freq_min,
> > + .turbo_status = power_acpi_turbo_status,
> > + .enable_turbo = power_acpi_enable_turbo,
> > + .disable_turbo = power_acpi_disable_turbo,
> > + .get_caps = power_acpi_get_capabilities
> > +};
> > +
>
> With current usage of the ops struct, I guess all can be "static const".
ACK
>
> <...>
>
> > diff --git a/drivers/power/core/kvm-vm/meson.build b/drivers/power/core/kvm-vm/meson.build
> > new file mode 100644
> > index 0000000000..3150c6674b
> > --- /dev/null
> > +++ b/drivers/power/core/kvm-vm/meson.build
> > @@ -0,0 +1,20 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(C) 2024 AMD Limited.
> > +#
> > +
> > +if not is_linux
> > + build = false
> > + reason = 'only supported on Linux'
> > + subdir_done()
> > +endif
> >
>
> Before refactoring, in lib/power was supported only for Linux, I assume this means
> all existing power libraries supported only for Linux.
> If so above check can be added to all drivers.
ACK
>
> <...>
>
> > +/* register the ops struct in rte_power_ops, return 0 on success. */
> > +int
> > +rte_power_register_ops(const struct rte_power_ops *op)
> > +{
> > + struct rte_power_ops *ops;
> > +
> > + if (op->env >= PM_ENV_MAX) {
> > + POWER_LOG(ERR, "Unsupported power management environment\n");
> > + return -EINVAL;
> > + }
> > +
> > + if (op->status != 0) {
> > + POWER_LOG(ERR, "Power management env[%d] ops registered already\n",
> > + op->env);
> > + return -EINVAL;
> > + }
> > +
> > + if (!op->init || !op->exit || !op->check_env_support ||
> > + !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
> > + !op->freq_up || !op->freq_down || !op->freq_max ||
> > + !op->freq_min || !op->turbo_status || !op->enable_turbo ||
> > + !op->disable_turbo || !op->get_caps) {
> > + POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
> > + return -EINVAL;
> > + }
> > +
> > + ops = &rte_power_ops[op->env];
> >
>
> I don't see all drivers set 'op->env',
>
> This 'rte_power_register_ops()' function copies ops from the driver-provided struct to
> the library global 'rte_power_ops[]' array,
>
> it can be possible to store ops pointer provided by driver, instead of copying it.
>
> And it can be possible to link the ops in this function, instead of putting them to
> specific index, as only one ops can be active in a given time, it can be possible to
> store active ops pointer in a global variable which removes the need to have index
> accessible array for ops.
Agreed. I will rework this to a new struct which can hold a reference to the respective ops struct.
>
> <...>
>
> > @@ -177,59 +138,76 @@ int
> > rte_power_init(unsigned int lcore_id)
> > {
> > int ret = -1;
> > + struct rte_power_ops *ops;
> >
> > - switch (global_default_env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_init(lcore_id);
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_init(lcore_id);
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_init(lcore_id);
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_init(lcore_id);
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_init(lcore_id);
> > - default:
> > - POWER_LOG(INFO, "Env isn't set yet!");
> > + if (global_default_env != PM_ENV_NOT_SET) {
> > + ops = &rte_power_ops[global_default_env];
> > + if (!ops->status) {
> > + POWER_LOG(ERR, "Power management env[%d] not"
> > + " supported\n", global_default_env);
> > + goto out;
> > + }
> > + return ops->init(lcore_id);
> > }
> > + POWER_LOG(INFO, POWER, "Env isn't set yet!\n");
> >
> > /* Auto detect Environment */
> > - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
> > - ret = power_acpi_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
> > + " power management...\n");
> >
>
> Shouldn't break the log, can break the line by keeping message whole:
> POWER_LOG(INFO,
> "Attempting to initialise ACPI cpufreq power management...");
>
> <...>
ACK
>
> > @@ -21,7 +22,7 @@ extern "C" {
> > /* Power Management Environment State */
> > enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> > PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> > - PM_ENV_AMD_PSTATE_CPUFREQ};
> > + PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
> >
>
> Syntax. Can we have enum item per line?
ACK
>
> > /**
> > * Check if a specific power management environment type is supported on a
> > @@ -66,6 +67,97 @@ void rte_power_unset_env(void);
> > */
> > enum power_management_env rte_power_get_env(void);
> >
> > +typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_check_env_support_t)(void);
> > +
> > +typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> > + uint32_t num);
> > +typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> > +typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> > +
> >
>
> I guess above is not required for users, what do you think to create a driver header
> file and move these to driver header file?
>
> <...>
>
> > +
> > +/**
> > + * Macro to statically register the ops of a cpufreq driver.
> > + */
> > +#define RTE_POWER_REGISTER_OPS(ops) \
> > + (RTE_INIT(power_hdlr_init_##ops) \
> > + { \
> > + rte_power_register_ops(&ops); \
> > + })
> >
>
> is () required around RTE_INIT()
This was added to address the checkpatch errors.
>
> > +
> > +/**
> > + * @internal Get the power ops struct from its index.
> > + *
> > + * @param ops_index
> > + * The index of the ops struct in the ops struct table.
> > + * @return
> > + * The pointer to the ops struct in the table if registered.
> > + */
> > +struct rte_power_ops *
> > +rte_power_get_ops(int ops_index);
> > +
> > /**
> > * Initialize power management for a specific lcore. If rte_power_set_env() has
> > * not been called then an auto-detect of the environment will start and
> > @@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
> > * @return
> > * The number of available frequencies.
> > */
> > -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> > - uint32_t num);
> > +static inline uint32_t
> > +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> > +{
> > + struct rte_power_ops *ops;
> >
> > -extern rte_power_freqs_t rte_power_freqs;
> > + ops = rte_power_get_ops(rte_power_get_env());
> > + return ops->get_avail_freqs(lcore_id, freqs, n);
> > +}
> >
>
> Why not proper functions but "static inline functions"?
These functions are expected to be called from the datapath, so they are inline to avoid additional cycles after the refactor.
>
> >
> > /**
> > * Return the current index of available frequencies of a specific lcore.
> > @@ -124,9 +220,14 @@ extern rte_power_freqs_t rte_power_freqs;
> > * @return
> > * The current index of available frequencies.
> > */
> > -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> > +static inline uint32_t
> > +rte_power_get_freq(unsigned int lcore_id)
> > +{
> > + struct rte_power_ops *ops;
> >
> > -extern rte_power_get_freq_t rte_power_get_freq;
> > + ops = rte_power_get_ops(rte_power_get_env());
> >
>
> As 'rte_power_get_env()' already returns a global variable, why not set a global ops
> pointer and directly access to them, is above abstraction providing any benefit?
rte_power_get_ops() internally checks whether the respective ops struct is registered.
I will rework it so that a global ops pointer is populated in rte_power_set_env().
>
>
* Re: [RFC PATCH 1/2] power: refactor core power management library
2024-02-20 15:33 ` [RFC PATCH 1/2] power: refactor core " Sivaprasad Tummala
2024-02-27 16:18 ` Ferruh Yigit
2024-02-28 12:51 ` Ferruh Yigit
@ 2024-03-01 2:56 ` lihuisong (C)
2024-03-01 10:39 ` Hunt, David
2024-03-05 4:35 ` Tummala, Sivaprasad
2 siblings, 2 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-03-01 2:56 UTC (permalink / raw)
To: Sivaprasad Tummala, david.hunt, anatoly.burakov, jerinj,
radu.nicolau, gakhil, cristian.dumitrescu, ferruh.yigit,
konstantin.ananyev
Cc: dev
On 2024/2/20 23:33, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
Good job. +1 to the refactor.
<...>
> diff --git a/drivers/meson.build b/drivers/meson.build
> index f2be71bc05..e293c3945f 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -28,6 +28,7 @@ subdirs = [
> 'event', # depends on common, bus, mempool and net.
> 'baseband', # depends on common and bus.
> 'gpu', # depends on common and bus.
> + 'power', # depends on common (in future).
> ]
>
> if meson.is_cross_build()
> diff --git a/drivers/power/core/acpi/meson.build b/drivers/power/core/acpi/meson.build
> new file mode 100644
> index 0000000000..d10ec8ee94
> --- /dev/null
> +++ b/drivers/power/core/acpi/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +sources = files('power_acpi_cpufreq.c')
> +
> +headers = files('power_acpi_cpufreq.h')
> +
> +deps += ['power']
> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/core/acpi/power_acpi_cpufreq.c
> similarity index 95%
> rename from lib/power/power_acpi_cpufreq.c
> rename to drivers/power/core/acpi/power_acpi_cpufreq.c
This file is in the power lib.
How about removing the 'power' prefix from the file name,
e.g. acpi_cpufreq.c, cppc_cpufreq.c?
> index f8d978d03d..69d80ad2ae 100644
> --- a/lib/power/power_acpi_cpufreq.c
> +++ b/drivers/power/core/acpi/power_acpi_cpufreq.c
> @@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
> +static struct rte_power_ops acpi_ops = {
How about using one of the following structure names?
"struct rte_power_cpufreq_ops" or "struct rte_power_core_ops"
After all, we also have other power ops, like uncore, right?
> + .init = power_acpi_cpufreq_init,
> + .exit = power_acpi_cpufreq_exit,
> + .check_env_support = power_acpi_cpufreq_check_supported,
> + .get_avail_freqs = power_acpi_cpufreq_freqs,
> + .get_freq = power_acpi_cpufreq_get_freq,
> + .set_freq = power_acpi_cpufreq_set_freq,
> + .freq_down = power_acpi_cpufreq_freq_down,
> + .freq_up = power_acpi_cpufreq_freq_up,
> + .freq_max = power_acpi_cpufreq_freq_max,
> + .freq_min = power_acpi_cpufreq_freq_min,
> + .turbo_status = power_acpi_turbo_status,
> + .enable_turbo = power_acpi_enable_turbo,
> + .disable_turbo = power_acpi_disable_turbo,
> + .get_caps = power_acpi_get_capabilities
> +};
> +
> +RTE_POWER_REGISTER_OPS(acpi_ops);
> diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/core/acpi/power_acpi_cpufreq.h
> similarity index 100%
> rename from lib/power/power_acpi_cpufreq.h
> rename to drivers/power/core/acpi/power_acpi_cpufreq.h
> diff --git a/drivers/power/core/amd-pstate/meson.build b/drivers/power/core/amd-pstate/meson.build
> new file mode 100644
> index 0000000000..8ec4c960f5
> --- /dev/null
> +++ b/drivers/power/core/amd-pstate/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +sources = files('power_amd_pstate_cpufreq.c')
> +
> +headers = files('power_amd_pstate_cpufreq.h')
> +
> +deps += ['power']
> diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> similarity index 95%
> rename from lib/power/power_amd_pstate_cpufreq.c
> rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> index 028f84416b..9938de72a6 100644
> --- a/lib/power/power_amd_pstate_cpufreq.c
> +++ b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> @@ -700,3 +700,22 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
> +static struct rte_power_ops amd_pstate_ops = {
> + .init = power_amd_pstate_cpufreq_init,
> + .exit = power_amd_pstate_cpufreq_exit,
> + .check_env_support = power_amd_pstate_cpufreq_check_supported,
> + .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
> + .get_freq = power_amd_pstate_cpufreq_get_freq,
> + .set_freq = power_amd_pstate_cpufreq_set_freq,
> + .freq_down = power_amd_pstate_cpufreq_freq_down,
> + .freq_up = power_amd_pstate_cpufreq_freq_up,
> + .freq_max = power_amd_pstate_cpufreq_freq_max,
> + .freq_min = power_amd_pstate_cpufreq_freq_min,
> + .turbo_status = power_amd_pstate_turbo_status,
> + .enable_turbo = power_amd_pstate_enable_turbo,
> + .disable_turbo = power_amd_pstate_disable_turbo,
> + .get_caps = power_amd_pstate_get_capabilities
> +};
> +
> +RTE_POWER_REGISTER_OPS(amd_pstate_ops);
> diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
> similarity index 100%
> rename from lib/power/power_amd_pstate_cpufreq.h
> rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
> diff --git a/drivers/power/core/cppc/meson.build b/drivers/power/core/cppc/meson.build
> new file mode 100644
> index 0000000000..06f3b99bb8
> --- /dev/null
> +++ b/drivers/power/core/cppc/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +sources = files('power_cppc_cpufreq.c')
> +
> +headers = files('power_cppc_cpufreq.h')
> +
> +deps += ['power']
> diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/core/cppc/power_cppc_cpufreq.c
> similarity index 96%
> rename from lib/power/power_cppc_cpufreq.c
> rename to drivers/power/core/cppc/power_cppc_cpufreq.c
> index 3ddf39bd76..605f633309 100644
> --- a/lib/power/power_cppc_cpufreq.c
> +++ b/drivers/power/core/cppc/power_cppc_cpufreq.c
> @@ -685,3 +685,22 @@ power_cppc_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
> +static struct rte_power_ops cppc_ops = {
> + .init = power_cppc_cpufreq_init,
> + .exit = power_cppc_cpufreq_exit,
> + .check_env_support = power_cppc_cpufreq_check_supported,
> + .get_avail_freqs = power_cppc_cpufreq_freqs,
> + .get_freq = power_cppc_cpufreq_get_freq,
> + .set_freq = power_cppc_cpufreq_set_freq,
> + .freq_down = power_cppc_cpufreq_freq_down,
> + .freq_up = power_cppc_cpufreq_freq_up,
> + .freq_max = power_cppc_cpufreq_freq_max,
> + .freq_min = power_cppc_cpufreq_freq_min,
> + .turbo_status = power_cppc_turbo_status,
> + .enable_turbo = power_cppc_enable_turbo,
> + .disable_turbo = power_cppc_disable_turbo,
> + .get_caps = power_cppc_get_capabilities
> +};
> +
> +RTE_POWER_REGISTER_OPS(cppc_ops);
> diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/core/cppc/power_cppc_cpufreq.h
> similarity index 100%
> rename from lib/power/power_cppc_cpufreq.h
> rename to drivers/power/core/cppc/power_cppc_cpufreq.h
> diff --git a/lib/power/guest_channel.c b/drivers/power/core/kvm-vm/guest_channel.c
> similarity index 100%
> rename from lib/power/guest_channel.c
> rename to drivers/power/core/kvm-vm/guest_channel.c
> diff --git a/lib/power/guest_channel.h b/drivers/power/core/kvm-vm/guest_channel.h
> similarity index 100%
> rename from lib/power/guest_channel.h
> rename to drivers/power/core/kvm-vm/guest_channel.h
> diff --git a/drivers/power/core/kvm-vm/meson.build b/drivers/power/core/kvm-vm/meson.build
> new file mode 100644
> index 0000000000..3150c6674b
> --- /dev/null
> +++ b/drivers/power/core/kvm-vm/meson.build
> @@ -0,0 +1,20 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2024 AMD Limited.
> +#
> +
> +if not is_linux
> + build = false
> + reason = 'only supported on Linux'
> + subdir_done()
> +endif
> +
> +sources = files(
> + 'guest_channel.c',
> + 'power_kvm_vm.c',
> +)
> +
> +headers = files(
> + 'guest_channel.h',
> + 'power_kvm_vm.h',
> +)
> +deps += ['power']
> diff --git a/lib/power/power_kvm_vm.c b/drivers/power/core/kvm-vm/power_kvm_vm.c
> similarity index 83%
> rename from lib/power/power_kvm_vm.c
> rename to drivers/power/core/kvm-vm/power_kvm_vm.c
> index f15be8fac5..a5d6984d26 100644
> --- a/lib/power/power_kvm_vm.c
> +++ b/drivers/power/core/kvm-vm/power_kvm_vm.c
> @@ -137,3 +137,22 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
> POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
> return -ENOTSUP;
> }
> +
> +static struct rte_power_ops kvm_vm_ops = {
> + .init = power_kvm_vm_init,
> + .exit = power_kvm_vm_exit,
> + .check_env_support = power_kvm_vm_check_supported,
> + .get_avail_freqs = power_kvm_vm_freqs,
> + .get_freq = power_kvm_vm_get_freq,
> + .set_freq = power_kvm_vm_set_freq,
> + .freq_down = power_kvm_vm_freq_down,
> + .freq_up = power_kvm_vm_freq_up,
> + .freq_max = power_kvm_vm_freq_max,
> + .freq_min = power_kvm_vm_freq_min,
> + .turbo_status = power_kvm_vm_turbo_status,
> + .enable_turbo = power_kvm_vm_enable_turbo,
> + .disable_turbo = power_kvm_vm_disable_turbo,
> + .get_caps = power_kvm_vm_get_capabilities
> +};
> +
> +RTE_POWER_REGISTER_OPS(kvm_vm_ops);
> diff --git a/lib/power/power_kvm_vm.h b/drivers/power/core/kvm-vm/power_kvm_vm.h
> similarity index 100%
> rename from lib/power/power_kvm_vm.h
> rename to drivers/power/core/kvm-vm/power_kvm_vm.h
> diff --git a/drivers/power/core/meson.build b/drivers/power/core/meson.build
> new file mode 100644
> index 0000000000..4081dafaa0
> --- /dev/null
> +++ b/drivers/power/core/meson.build
> @@ -0,0 +1,12 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +drivers = [
> + 'acpi',
> + 'amd-pstate',
> + 'cppc',
> + 'kvm-vm',
> + 'pstate'
> +]
> +
> +std_deps = ['power']
> diff --git a/drivers/power/core/pstate/meson.build b/drivers/power/core/pstate/meson.build
> new file mode 100644
> index 0000000000..1025c64e48
> --- /dev/null
> +++ b/drivers/power/core/pstate/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +sources = files('power_pstate_cpufreq.c')
> +
> +headers = files('power_pstate_cpufreq.h')
> +
> +deps += ['power']
> diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/core/pstate/power_pstate_cpufreq.c
> similarity index 96%
> rename from lib/power/power_pstate_cpufreq.c
> rename to drivers/power/core/pstate/power_pstate_cpufreq.c
> index 73138dc4e4..d4c3645ff8 100644
> --- a/lib/power/power_pstate_cpufreq.c
> +++ b/drivers/power/core/pstate/power_pstate_cpufreq.c
> @@ -888,3 +888,22 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
> +static struct rte_power_ops pstate_ops = {
> + .init = power_pstate_cpufreq_init,
> + .exit = power_pstate_cpufreq_exit,
> + .check_env_support = power_pstate_cpufreq_check_supported,
> + .get_avail_freqs = power_pstate_cpufreq_freqs,
> + .get_freq = power_pstate_cpufreq_get_freq,
> + .set_freq = power_pstate_cpufreq_set_freq,
> + .freq_down = power_pstate_cpufreq_freq_down,
> + .freq_up = power_pstate_cpufreq_freq_up,
> + .freq_max = power_pstate_cpufreq_freq_max,
> + .freq_min = power_pstate_cpufreq_freq_min,
> + .turbo_status = power_pstate_turbo_status,
> + .enable_turbo = power_pstate_enable_turbo,
> + .disable_turbo = power_pstate_disable_turbo,
> + .get_caps = power_pstate_get_capabilities
> +};
> +
> +RTE_POWER_REGISTER_OPS(pstate_ops);
> diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/core/pstate/power_pstate_cpufreq.h
> similarity index 100%
> rename from lib/power/power_pstate_cpufreq.h
> rename to drivers/power/core/pstate/power_pstate_cpufreq.h
> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
> new file mode 100644
> index 0000000000..7d9034c7ac
> --- /dev/null
> +++ b/drivers/power/meson.build
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2024 AMD Limited
> +
> +drivers = [
> + 'core',
> +]
> +
> +std_deps = ['power']
> diff --git a/lib/power/meson.build b/lib/power/meson.build
> index b8426589b2..207d96d877 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -12,14 +12,8 @@ if not is_linux
> reason = 'only supported on Linux'
> endif
> sources = files(
> - 'guest_channel.c',
> - 'power_acpi_cpufreq.c',
> - 'power_amd_pstate_cpufreq.c',
> 'power_common.c',
> - 'power_cppc_cpufreq.c',
> - 'power_kvm_vm.c',
> 'power_intel_uncore.c',
> - 'power_pstate_cpufreq.c',
> 'rte_power.c',
> 'rte_power_uncore.c',
> 'rte_power_pmd_mgmt.c',
> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> index 30966400ba..c90b611f4f 100644
> --- a/lib/power/power_common.h
> +++ b/lib/power/power_common.h
> @@ -23,13 +23,24 @@ extern int power_logtype;
> #endif
>
> /* check if scaling driver matches one we want */
> +__rte_internal
> int cpufreq_check_scaling_driver(const char *driver);
> +
> +__rte_internal
> int power_set_governor(unsigned int lcore_id, const char *new_governor,
> char *orig_governor, size_t orig_governor_len);
> +
> +__rte_internal
> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> __rte_format_printf(3, 4);
> +
> +__rte_internal
> int read_core_sysfs_u32(FILE *f, uint32_t *val);
> +
> +__rte_internal
> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> +
> +__rte_internal
> int write_core_sysfs_s(FILE *f, const char *str);
>
> #endif /* _POWER_COMMON_H_ */
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> index 36c3f3da98..70176807f4 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -8,64 +8,80 @@
> #include <rte_spinlock.h>
>
> #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
> #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>
> enum power_management_env global_default_env = PM_ENV_NOT_SET;
How about using a pointer to hold the current power cpufreq ops?
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> +static struct rte_power_ops rte_power_ops[PM_ENV_MAX];
>
> -/* function pointers */
> -rte_power_freqs_t rte_power_freqs = NULL;
> -rte_power_get_freq_t rte_power_get_freq = NULL;
> -rte_power_set_freq_t rte_power_set_freq = NULL;
> -rte_power_freq_change_t rte_power_freq_up = NULL;
> -rte_power_freq_change_t rte_power_freq_down = NULL;
> -rte_power_freq_change_t rte_power_freq_max = NULL;
> -rte_power_freq_change_t rte_power_freq_min = NULL;
> -rte_power_freq_change_t rte_power_turbo_status;
> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> -rte_power_get_capabilities_t rte_power_get_capabilities;
> -
> -static void
> -reset_power_function_ptrs(void)
> +/* register the ops struct in rte_power_ops, return 0 on success. */
> +int
> +rte_power_register_ops(const struct rte_power_ops *op)
> +{
> + struct rte_power_ops *ops;
> +
> + if (op->env >= PM_ENV_MAX) {
> + POWER_LOG(ERR, "Unsupported power management environment\n");
> + return -EINVAL;
> + }
> +
> + if (op->status != 0) {
> + POWER_LOG(ERR, "Power management env[%d] ops registered already\n",
> + op->env);
> + return -EINVAL;
> + }
> +
> + if (!op->init || !op->exit || !op->check_env_support ||
> + !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
> + !op->freq_up || !op->freq_down || !op->freq_max ||
> + !op->freq_min || !op->turbo_status || !op->enable_turbo ||
> + !op->disable_turbo || !op->get_caps) {
> + POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
> + return -EINVAL;
> + }
> +
> + ops = &rte_power_ops[op->env];
It would be better to use a global linked list instead of an array.
We should also define a list node structure that holds this ops structure and
its owner.
> + ops->env = op->env;
> + ops->init = op->init;
> + ops->exit = op->exit;
> + ops->check_env_support = op->check_env_support;
> + ops->get_avail_freqs = op->get_avail_freqs;
> + ops->get_freq = op->get_freq;
> + ops->set_freq = op->set_freq;
> + ops->freq_up = op->freq_up;
> + ops->freq_down = op->freq_down;
> + ops->freq_max = op->freq_max;
> + ops->freq_min = op->freq_min;
> + ops->turbo_status = op->turbo_status;
> + ops->enable_turbo = op->enable_turbo;
> + ops->disable_turbo = op->disable_turbo;
*ops = *op?
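To illustrate the `*ops = *op` idea: a plain struct assignment copies every member at once, so no callback can be forgotten (the member-by-member copy quoted above, for instance, never copies get_caps). Below is a minimal sketch only. `demo_power_ops`, `demo_copy_ops` and the two toy callbacks are hypothetical names, and the struct is a cut-down stand-in, not the real `struct rte_power_ops`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct rte_power_ops. */
struct demo_power_ops {
	uint8_t status;                         /* 1 once registered */
	int (*init)(unsigned int lcore_id);
	int (*get_caps)(unsigned int lcore_id);
};

/* Toy callbacks so that the structure can be filled in. */
static int demo_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_get_caps(unsigned int lcore_id) { (void)lcore_id; return 1; }

/* Copy the driver-provided ops into a table slot and mark it registered. */
static void
demo_copy_ops(struct demo_power_ops *slot, const struct demo_power_ops *op)
{
	assert(slot != NULL && op != NULL);
	*slot = *op;      /* copies all members, including get_caps */
	slot->status = 1; /* registered */
}
```

If a member is added to the structure later, the assignment picks it up without any change to the registration code.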
> + ops->status = 1; /* registered */
status --> registered?
But if an ops linked list is used, this flag can be removed as well.
> +
> + return 0;
> +}
> +
> +struct rte_power_ops *
> +rte_power_get_ops(int ops_index)
AFAICS, there is only one cpufreq driver on a given platform, so there is just
one power_cpufreq_ops for the user to use.
The user does not need to get the other power ops; the user only wants to know
which power ops is currently in use, right?
So using an 'index' to get this ops is not good.
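A rough sketch of what "only expose the ops currently in use" could look like. This is an assumption about one possible shape, not the library's API: `demo_cpufreq_ops`, `demo_set_current_ops`, `demo_get_current_ops` and `demo_power_init` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down ops structure (not the real rte_power_ops). */
struct demo_cpufreq_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

/* The single ops instance selected for this platform, or NULL if unset. */
static const struct demo_cpufreq_ops *demo_current_ops;

/* Called once the environment is chosen (explicitly or by auto-detect). */
static void
demo_set_current_ops(const struct demo_cpufreq_ops *ops)
{
	demo_current_ops = ops;
}

/* Callers never pass an index; they only ask for what is in use. */
static const struct demo_cpufreq_ops *
demo_get_current_ops(void)
{
	return demo_current_ops;
}

/* Wrapper in the style of rte_power_init(): fail cleanly if nothing is set. */
static int
demo_power_init(unsigned int lcore_id)
{
	const struct demo_cpufreq_ops *ops = demo_get_current_ops();

	if (ops == NULL)
		return -1;
	assert(ops->init != NULL);
	return ops->init(lcore_id);
}

/* Toy driver used to exercise the accessors. */
static int demo_drv_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static const struct demo_cpufreq_ops demo_drv_ops = {
	.name = "demo",
	.init = demo_drv_init,
};
```

With this shape the inline wrappers would not need an RTE_VERIFY on an index; a NULL check on the current pointer is enough.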
> {
> - rte_power_freqs = NULL;
> - rte_power_get_freq = NULL;
> - rte_power_set_freq = NULL;
> - rte_power_freq_up = NULL;
> - rte_power_freq_down = NULL;
> - rte_power_freq_max = NULL;
> - rte_power_freq_min = NULL;
> - rte_power_turbo_status = NULL;
> - rte_power_freq_enable_turbo = NULL;
> - rte_power_freq_disable_turbo = NULL;
> - rte_power_get_capabilities = NULL;
> + RTE_VERIFY((ops_index >= PM_ENV_NOT_SET) && (ops_index < PM_ENV_MAX));
> + RTE_VERIFY(rte_power_ops[ops_index].status != 0);
> +
> + return &rte_power_ops[ops_index];
> }
>
> int
> rte_power_check_env_supported(enum power_management_env env)
> {
> - switch (env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_check_supported();
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_check_supported();
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_check_supported();
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_check_supported();
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_check_supported();
> - default:
> - rte_errno = EINVAL;
> - return -1;
> + struct rte_power_ops *ops;
> +
> + if ((env > PM_ENV_NOT_SET) && (env < PM_ENV_MAX)) {
> + ops = rte_power_get_ops(env);
> + return ops->check_env_support();
> }
> +
> + rte_errno = EINVAL;
> + return -1;
> }
>
> int
> @@ -80,80 +96,26 @@ rte_power_set_env(enum power_management_env env)
> }
>
> int ret = 0;
> + struct rte_power_ops *ops;
> +
> + if ((env == PM_ENV_NOT_SET) || (env >= PM_ENV_MAX)) {
> + POWER_LOG(ERR, "Invalid Power Management Environment(%d)"
> + " set\n", env);
> + ret = -1;
> + }
>
<...>
> + ops = rte_power_get_ops(env);
How about finding the target ops in the global list according to the env?
> + if (ops->status == 0) {
> + POWER_LOG(ERR,
> + "Power Management Environment(%d) not"
> + " registered\n", env);
> ret = -1;
> }
>
> if (ret == 0)
> global_default_env = env;
It is more convenient to use a global variable to point to the default
power_cpufreq ops or its list node.
> - else {
> + else
> global_default_env = PM_ENV_NOT_SET;
> - reset_power_function_ptrs();
> - }
>
> rte_spinlock_unlock(&global_env_cfg_lock);
> return ret;
> @@ -164,7 +126,6 @@ rte_power_unset_env(void)
> {
> rte_spinlock_lock(&global_env_cfg_lock);
> global_default_env = PM_ENV_NOT_SET;
> - reset_power_function_ptrs();
> rte_spinlock_unlock(&global_env_cfg_lock);
> }
>
> @@ -177,59 +138,76 @@ int
> rte_power_init(unsigned int lcore_id)
> {
> int ret = -1;
> + struct rte_power_ops *ops;
>
> - switch (global_default_env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_init(lcore_id);
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_init(lcore_id);
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_init(lcore_id);
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_init(lcore_id);
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_init(lcore_id);
> - default:
> - POWER_LOG(INFO, "Env isn't set yet!");
> + if (global_default_env != PM_ENV_NOT_SET) {
> + ops = &rte_power_ops[global_default_env];
> + if (!ops->status) {
> + POWER_LOG(ERR, "Power management env[%d] not"
> + " supported\n", global_default_env);
> + goto out;
> + }
> + return ops->init(lcore_id);
> }
> + POWER_LOG(INFO, "Env isn't set yet!\n");
>
> /* Auto detect Environment */
> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
> - ret = power_acpi_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
> + " power management...\n");
> + ops = &rte_power_ops[PM_ENV_ACPI_CPUFREQ];
> + if (ops->status) {
> + ret = ops->init(lcore_id);
> + if (ret == 0) {
> + rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> + goto out;
> + }
> }
>
> - POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
> - ret = power_pstate_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise PSTAT"
> + " power management...\n");
> + ops = &rte_power_ops[PM_ENV_PSTATE_CPUFREQ];
> + if (ops->status) {
> + ret = ops->init(lcore_id);
> + if (ret == 0) {
> + rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> + goto out;
> + }
> }
>
> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
> - ret = power_amd_pstate_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise AMD PSTATE"
> + " power management...\n");
> + ops = &rte_power_ops[PM_ENV_AMD_PSTATE_CPUFREQ];
> + if (ops->status) {
> + ret = ops->init(lcore_id);
> + if (ret == 0) {
> + rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> + goto out;
> + }
> }
>
> - POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
> - ret = power_cppc_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise CPPC power"
> + " management...\n");
> + ops = &rte_power_ops[PM_ENV_CPPC_CPUFREQ];
> + if (ops->status) {
> + ret = ops->init(lcore_id);
> + if (ret == 0) {
> + rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> + goto out;
> + }
> }
>
> - POWER_LOG(INFO, "Attempting to initialise VM power management...");
> - ret = power_kvm_vm_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_KVM_VM);
> - goto out;
> + POWER_LOG(INFO, "Attempting to initialise VM power"
> + " management...\n");
> + ops = &rte_power_ops[PM_ENV_KVM_VM];
> + if (ops->status) {
> + ret = ops->init(lcore_id);
> + if (ret == 0) {
> + rte_power_set_env(PM_ENV_KVM_VM);
> + goto out;
> + }
> }
If we use a linked list, the code above can be simplified like this:
->
for_each_power_cpufreq_ops(ops, ...) {
ret = ops->init()
if (ret) {
....
}
}
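To make the loop above more concrete, here is a small sketch built on the TAILQ macros from <sys/queue.h>. Everything named `demo_*` is hypothetical, and the node layout is only an assumption about how such a list could look; it is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list node: one per registered cpufreq driver. */
struct demo_ops_node {
	const char *name;
	int (*init)(unsigned int lcore_id);
	TAILQ_ENTRY(demo_ops_node) next;
};

TAILQ_HEAD(demo_ops_list, demo_ops_node);
static struct demo_ops_list demo_ops_head =
	TAILQ_HEAD_INITIALIZER(demo_ops_head);

/* The driver that auto-detection settled on, or NULL. */
static struct demo_ops_node *demo_selected;

/* Append a driver to the global list. */
static void
demo_register(struct demo_ops_node *node)
{
	assert(node != NULL && node->init != NULL);
	TAILQ_INSERT_TAIL(&demo_ops_head, node, next);
}

/* Try each driver in registration order; stop at the first success. */
static int
demo_auto_detect_init(unsigned int lcore_id)
{
	struct demo_ops_node *node;

	TAILQ_FOREACH(node, &demo_ops_head, next) {
		if (node->init(lcore_id) == 0) {
			demo_selected = node;
			return 0;
		}
	}
	return -1; /* no driver could be initialised */
}

/* Two toy drivers: the first always fails, the second always succeeds. */
static int demo_fail_init(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int demo_ok_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_ops_node demo_fail_drv = { .name = "fails", .init = demo_fail_init };
static struct demo_ops_node demo_ok_drv = { .name = "works", .init = demo_ok_init };
```

One thing to keep in mind: if drivers register themselves from constructors, the list order follows the order in which those constructors run. If the probing order matters (ACPI first and KVM last in the quoted code), the node would probably need an explicit priority as well.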
> - POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
> - "%u", lcore_id);
> + POWER_LOG(ERR, "Unable to set Power Management Environment"
> + " for lcore %u\n", lcore_id);
> out:
> return ret;
> }
> @@ -237,21 +215,14 @@ rte_power_init(unsigned int lcore_id)
> int
> rte_power_exit(unsigned int lcore_id)
> {
> - switch (global_default_env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_exit(lcore_id);
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_exit(lcore_id);
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_exit(lcore_id);
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_exit(lcore_id);
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_exit(lcore_id);
> - default:
> - POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
> + struct rte_power_ops *ops;
>
> + if (global_default_env != PM_ENV_NOT_SET) {
> + ops = &rte_power_ops[global_default_env];
> + return ops->exit(lcore_id);
> }
> - return -1;
> + POWER_LOG(ERR, "Environment has not been set, unable "
> + "to exit gracefully\n");
>
> + return -1;
> }
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> index 4fa4afe399..749bb823ab 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power.h
> @@ -1,5 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 AMD Limited
> */
>
> #ifndef _RTE_POWER_H
> @@ -21,7 +22,7 @@ extern "C" {
> /* Power Management Environment State */
> enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> - PM_ENV_AMD_PSTATE_CPUFREQ};
> + PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
"enum power_management_env" is not a good name; maybe something like "enum
power_cpufreq_driver_type"?
With the linked list structure suggested earlier, it may be better to use a
string name directly instead of a fixed enum,
because the new "PM_ENV_MAX" will break the ABI whenever a new
cpufreq driver is added.
>
> /**
> * Check if a specific power management environment type is supported on a
> @@ -66,6 +67,97 @@ void rte_power_unset_env(void);
> */
> enum power_management_env rte_power_get_env(void);
>
> +typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
> +typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
> +typedef int (*rte_power_check_env_support_t)(void);
> +
> +typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> + uint32_t num);
> +typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> +typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> +typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> +
> +/**
> + * Function pointer definition for generic frequency change functions. Review
> + * each environments specific documentation for usage.
> + *
> + * @param lcore_id
> + * lcore id.
> + *
> + * @return
> + * - 1 on success with frequency changed.
> + * - 0 on success without frequency changed.
> + * - Negative on error.
> + */
> +
> +/**
> + * Power capabilities summary.
> + */
> +struct rte_power_core_capabilities {
> + union {
> + uint64_t capabilities;
> + struct {
> + uint64_t turbo:1; /**< Turbo can be enabled. */
> + uint64_t priority:1; /**< SST-BF high freq core */
> + };
> + };
> +};
> +
> +typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> + struct rte_power_core_capabilities *caps);
> +
> +/** Structure defining core power operations structure */
> +struct rte_power_ops {
> + uint8_t status; /**< ops register status. */
> + enum power_management_env env; /**< power mgmt env. */
> + rte_power_cpufreq_init_t init; /**< Initialize power management. */
> + rte_power_cpufreq_exit_t exit; /**< Exit power management. */
> + rte_power_check_env_support_t check_env_support; /**< verify env is supported. */
> + rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
> + rte_power_get_freq_t get_freq; /**< Get frequency index. */
> + rte_power_set_freq_t set_freq; /**< Set frequency index. */
> + rte_power_freq_change_t freq_up; /**< Scale up frequency. */
> + rte_power_freq_change_t freq_down; /**< Scale down frequency. */
> + rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
> + rte_power_freq_change_t freq_min; /**< Scale up frequency to lowest. */
> + rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
> + rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
> + rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
> + rte_power_get_capabilities_t get_caps; /**< power capabilities. */
> +} __rte_cache_aligned;
I suggest reworking this structure, for example:
struct rte_power_cpufreq_list {
char name[]; // like "cppc_cpufreq", "pstate_cpufreq"
struct rte_power_cpufreq *ops;
struct rte_power_cpufreq_list *node;
};
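A slightly fuller sketch of such a name-keyed list, again only as an illustration. `DEMO_NAME_LEN`, `demo_cpufreq_node`, `demo_find` and `demo_named_register` are hypothetical, and the fixed-size name buffer is my assumption. Rejecting duplicate names at registration is what would make a separate 'status' flag unnecessary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAME_LEN 32 /* hypothetical limit on driver name length */

/* Hypothetical, cut-down callback table. */
struct demo_cpufreq_ops {
	int (*init)(unsigned int lcore_id);
};

/* List node: a driver name plus a pointer to its callbacks. */
struct demo_cpufreq_node {
	char name[DEMO_NAME_LEN]; /* e.g. "cppc_cpufreq", "pstate_cpufreq" */
	const struct demo_cpufreq_ops *ops;
	TAILQ_ENTRY(demo_cpufreq_node) next;
};

TAILQ_HEAD(demo_cpufreq_list, demo_cpufreq_node);
static struct demo_cpufreq_list demo_head = TAILQ_HEAD_INITIALIZER(demo_head);

/* Look a driver up by name; returns NULL if it is not registered. */
static struct demo_cpufreq_node *
demo_find(const char *name)
{
	struct demo_cpufreq_node *n;

	TAILQ_FOREACH(n, &demo_head, next) {
		if (strcmp(n->name, name) == 0)
			return n;
	}
	return NULL;
}

/* Register a driver; being on the list is the "registered" state. */
static int
demo_named_register(struct demo_cpufreq_node *node)
{
	assert(node != NULL);
	if (node->ops == NULL || node->name[0] == '\0')
		return -EINVAL;
	if (demo_find(node->name) != NULL)
		return -EEXIST; /* same name registered twice */
	TAILQ_INSERT_TAIL(&demo_head, node, next);
	return 0;
}

/* Toy driver used to exercise the list. */
static int demo_cppc_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static const struct demo_cpufreq_ops demo_cppc_ops = { .init = demo_cppc_init };
static struct demo_cpufreq_node demo_cppc_node = {
	.name = "cppc_cpufreq",
	.ops = &demo_cppc_ops,
};
```

Since drivers are identified by name, adding a new one does not touch any public enum, which is the ABI concern raised about PM_ENV_MAX.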
> +
> +/**
> + * Register power cpu frequency operations.
> + *
> + * @param ops
> + * Pointer to an ops structure to register.
> + * @return
> + * - >=0: Success; return the index of the ops struct in the table.
> + * - -EINVAL - error while registering ops struct.
> + */
> +__rte_internal
> +int rte_power_register_ops(const struct rte_power_ops *ops);
> +
> +/**
> + * Macro to statically register the ops of a cpufreq driver.
> + */
> +#define RTE_POWER_REGISTER_OPS(ops) \
> + (RTE_INIT(power_hdlr_init_##ops) \
> + { \
> + rte_power_register_ops(&ops); \
> + })
> +
> +/**
> + * @internal Get the power ops struct from its index.
> + *
> + * @param ops_index
> + * The index of the ops struct in the ops struct table.
> + * @return
> + * The pointer to the ops struct in the table if registered.
> + */
> +struct rte_power_ops *
> +rte_power_get_ops(int ops_index);
> +
> /**
> * Initialize power management for a specific lcore. If rte_power_set_env() has
> * not been called then an auto-detect of the environment will start and
> @@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
> * @return
> * The number of available frequencies.
> */
> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> - uint32_t num);
> +static inline uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> + struct rte_power_ops *ops;
>
> -extern rte_power_freqs_t rte_power_freqs;
> + ops = rte_power_get_ops(rte_power_get_env());
> + return ops->get_avail_freqs(lcore_id, freqs, n);
> +}
nice.
<...>
* Re: [RFC PATCH 2/2] power: refactor uncore power management library
2024-02-20 15:33 ` [RFC PATCH 2/2] power: refactor uncore " Sivaprasad Tummala
@ 2024-03-01 3:33 ` lihuisong (C)
2024-03-01 6:06 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-03-01 3:33 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil
Hi,
On 2024/2/20 23:33, Sivaprasad Tummala wrote:
> This patch refactors the power management library, addressing uncore
> power management. The primary changes involve the creation of dedicated
> directories for each driver within 'drivers/power/uncore/*'. The
> adjustment of meson.build files enables the selective activation
> of individual drivers.
+1 for separating core and uncore.
>
> This refactor significantly improves code organization, enhances
> clarity and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> drivers/power/meson.build | 1 +
> drivers/power/uncore/intel/meson.build | 9 +
> .../power/uncore/intel}/power_intel_uncore.c | 15 ++
> .../power/uncore/intel}/power_intel_uncore.h | 0
> drivers/power/uncore/meson.build | 8 +
> lib/power/meson.build | 1 -
> lib/power/rte_power_uncore.c | 163 +++++++-----------
> lib/power/rte_power_uncore.h | 150 ++++++++++++++--
> lib/power/version.map | 1 +
> 9 files changed, 236 insertions(+), 112 deletions(-)
> create mode 100644 drivers/power/uncore/intel/meson.build
> rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.c (95%)
> rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.h (100%)
How about removing the 'power' prefix from "power_intel_uncore.c"?
> create mode 100644 drivers/power/uncore/meson.build
>
> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
> index 7d9034c7ac..0803e99027 100644
> --- a/drivers/power/meson.build
> +++ b/drivers/power/meson.build
> @@ -3,6 +3,7 @@
>
> drivers = [
> 'core',
> + 'uncore',
> ]
>
> std_deps = ['power']
> diff --git a/drivers/power/uncore/intel/meson.build b/drivers/power/uncore/intel/meson.build
> new file mode 100644
> index 0000000000..187ab15aec
> --- /dev/null
> +++ b/drivers/power/uncore/intel/meson.build
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2017 Intel Corporation
> +# Copyright(c) 2024 AMD Limited
> +
> +sources = files('power_intel_uncore.c')
> +
> +headers = files('power_intel_uncore.h')
> +
> +deps += ['power']
> diff --git a/lib/power/power_intel_uncore.c b/drivers/power/uncore/intel/power_intel_uncore.c
> similarity index 95%
> rename from lib/power/power_intel_uncore.c
> rename to drivers/power/uncore/intel/power_intel_uncore.c
> index 3ce8fccec2..3af4cc3bc7 100644
> --- a/lib/power/power_intel_uncore.c
> +++ b/drivers/power/uncore/intel/power_intel_uncore.c
> @@ -476,3 +476,18 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
>
> return count;
> }
> +
> +static struct rte_power_uncore_ops intel_uncore_ops = {
> + .init = power_intel_uncore_init,
> + .exit = power_intel_uncore_exit,
> + .get_avail_freqs = power_intel_uncore_freqs,
> + .get_num_pkgs = power_intel_uncore_get_num_pkgs,
> + .get_num_dies = power_intel_uncore_get_num_dies,
> + .get_num_freqs = power_intel_uncore_get_num_freqs,
> + .get_freq = power_get_intel_uncore_freq,
> + .set_freq = power_set_intel_uncore_freq,
> + .freq_max = power_intel_uncore_freq_max,
> + .freq_min = power_intel_uncore_freq_min,
> +};
> +
> +RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
<...>
> +
> +/** Structure defining uncore power operations structure */
> +struct rte_power_uncore_ops {
> + uint8_t status; /**< ops register status. */
> + enum rte_uncore_power_mgmt_env env; /**< power mgmt env. */
> + rte_power_uncore_init_t init; /**< Initialize power management. */
> + rte_power_uncore_exit_t exit; /**< Exit power management. */
> + rte_power_uncore_get_num_pkgs_t get_num_pkgs;
> + rte_power_uncore_get_num_dies_t get_num_dies;
> + rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
> + rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
> + rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
> + rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
> + rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
> + rte_power_uncore_freq_change_t freq_min; /**< Scale up frequency to lowest. */
> +} __rte_cache_aligned;
All core drivers (cpufreq) basically follow the ACPI specification,
so the library can extract a common ops for all core DVFS drivers.
AFAICS, there is only one uncore driver in the kernel, namely the Intel uncore
driver,
and there is no unified specification for controlling uncore frequency
scaling (UFS) in the kernel.
That is to say, each chip manufacturer can implement its uncore
driver according to its own requirements.
As a result, the system interface exposed to userspace differs between
manufacturers.
So I am not sure whether this newly extracted rte_power_uncore_ops structure is
general enough for all future uncore drivers.
> +
> +/**
> + * Register power uncore frequency operations.
> + * @param ops
> + * Pointer to an ops structure to register.
> + * @return
> + * - >=0: Success; return the index of the ops struct in the table.
> + * - -EINVAL - error while registering ops struct.
> + */
> +__rte_internal
> +int rte_power_register_uncore_ops(const struct rte_power_uncore_ops *ops);
> +
> +/**
> + * Macro to statically register the ops of an uncore driver.
> + */
> +#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
> + (RTE_INIT(power_hdlr_init_uncore_##ops) \
> + { \
> + rte_power_register_uncore_ops(&ops); \
> + })
> +
<...>
* RE: [RFC PATCH 2/2] power: refactor uncore power management library
2024-03-01 3:33 ` lihuisong (C)
@ 2024-03-01 6:06 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-03-01 6:06 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
Hi Lihuisong,
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Friday, March 1, 2024 9:04 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> radu.nicolau@intel.com; jerinj@marvell.com; cristian.dumitrescu@intel.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [RFC PATCH 2/2] power: refactor uncore power management library
>
> Hi,
>
> On 2024/2/20 23:33, Sivaprasad Tummala wrote:
> > This patch refactors the power management library, addressing uncore
> > power management. The primary changes involve the creation of
> > dedicated directories for each driver within 'drivers/power/uncore/*'.
> > The adjustment of meson.build files enables the selective activation
> > of individual drivers.
> +1 for separating core and uncore.
> >
> > This refactor significantly improves code organization, enhances
> > clarity and boosts maintainability. It lays the foundation for more
> > focused development on individual drivers and facilitates seamless
> > integration of future enhancements, particularly the AMD uncore driver.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> > drivers/power/meson.build | 1 +
> > drivers/power/uncore/intel/meson.build | 9 +
> > .../power/uncore/intel}/power_intel_uncore.c | 15 ++
> > .../power/uncore/intel}/power_intel_uncore.h | 0
> > drivers/power/uncore/meson.build | 8 +
> > lib/power/meson.build | 1 -
> > lib/power/rte_power_uncore.c | 163 +++++++-----------
> > lib/power/rte_power_uncore.h | 150 ++++++++++++++--
> > lib/power/version.map | 1 +
> > 9 files changed, 236 insertions(+), 112 deletions(-)
> > create mode 100644 drivers/power/uncore/intel/meson.build
> > rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.c (95%)
> > rename {lib/power => drivers/power/uncore/intel}/power_intel_uncore.h (100%)
> How about removing the 'power' prefix from "power_intel_uncore.c"?
ACK!
> > create mode 100644 drivers/power/uncore/meson.build
> >
> > diff --git a/drivers/power/meson.build b/drivers/power/meson.build
> > index 7d9034c7ac..0803e99027 100644
> > --- a/drivers/power/meson.build
> > +++ b/drivers/power/meson.build
> > @@ -3,6 +3,7 @@
> >
> > drivers = [
> > 'core',
> > + 'uncore',
> > ]
> >
> > std_deps = ['power']
> > diff --git a/drivers/power/uncore/intel/meson.build b/drivers/power/uncore/intel/meson.build
> > new file mode 100644
> > index 0000000000..187ab15aec
> > --- /dev/null
> > +++ b/drivers/power/uncore/intel/meson.build
> > @@ -0,0 +1,9 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2017 Intel Corporation
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +sources = files('power_intel_uncore.c')
> > +
> > +headers = files('power_intel_uncore.h')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_intel_uncore.c b/drivers/power/uncore/intel/power_intel_uncore.c
> > similarity index 95%
> > rename from lib/power/power_intel_uncore.c
> > rename to drivers/power/uncore/intel/power_intel_uncore.c
> > index 3ce8fccec2..3af4cc3bc7 100644
> > --- a/lib/power/power_intel_uncore.c
> > +++ b/drivers/power/uncore/intel/power_intel_uncore.c
> > @@ -476,3 +476,18 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
> >
> > return count;
> > }
> > +
> > +static struct rte_power_uncore_ops intel_uncore_ops = {
> > + .init = power_intel_uncore_init,
> > + .exit = power_intel_uncore_exit,
> > + .get_avail_freqs = power_intel_uncore_freqs,
> > + .get_num_pkgs = power_intel_uncore_get_num_pkgs,
> > + .get_num_dies = power_intel_uncore_get_num_dies,
> > + .get_num_freqs = power_intel_uncore_get_num_freqs,
> > + .get_freq = power_get_intel_uncore_freq,
> > + .set_freq = power_set_intel_uncore_freq,
> > + .freq_max = power_intel_uncore_freq_max,
> > + .freq_min = power_intel_uncore_freq_min,
> > +};
> > +
> > +RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
> <...>
> > +
> > +/** Structure defining uncore power operations structure */
> > +struct rte_power_uncore_ops {
> > + uint8_t status; /**< ops register status. */
> > + enum rte_uncore_power_mgmt_env env; /**< power mgmt env. */
> > + rte_power_uncore_init_t init; /**< Initialize power management. */
> > + rte_power_uncore_exit_t exit; /**< Exit power management. */
> > + rte_power_uncore_get_num_pkgs_t get_num_pkgs;
> > + rte_power_uncore_get_num_dies_t get_num_dies;
> > + rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
> > + rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
> > + rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
> > + rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
> > + rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
> > + rte_power_uncore_freq_change_t freq_min; /**< Scale up frequency to lowest. */
> > +} __rte_cache_aligned;
> The core drivers (cpufreq) all basically follow the ACPI specification,
> so the library can extract a common ops for all core DVFS drivers.
> AFAIS, there is only one uncore driver in the kernel, namely the Intel uncore driver.
> But there is no unified specification for controlling uncore frequency
> scaling (UFS) in the kernel.
> That is to say, each chip manufacturer can implement its uncore driver
> however it sees fit.
> As a result, the system interface exposed to userspace differs between
> manufacturers.
> So I am not sure whether this newly extracted rte_power_uncore_ops structure will be
> common enough for all future uncore drivers.
Agreed! The vendor-specific uncore implementations are expected to be abstracted
at the driver level. One possible approach, I think, is for the uncore library to expose
performance levels (instead of num_freqs), which each driver implementation can
interpret and implement independently (uncore/cross-socket/PCIe/UMC frequencies).
An application can query the number of performance levels (highest to lowest) and select
the performance level it needs for power savings.
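The performance-level idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `struct uncore_perf_ops`, the `toy_*` driver and the kHz values are all hypothetical and not part of the RFC, and level 0 is assumed to be the highest-performance level.

```c
#include <stdint.h>

/* Hypothetical perf-level interface; these names are not part of the RFC. */
struct uncore_perf_ops {
	/* Number of levels; level 0 is assumed to be the highest performance. */
	uint32_t (*get_num_levels)(unsigned int pkg, unsigned int die);
	/* Select a level; returns 0 on success, -1 on an invalid level. */
	int (*set_level)(unsigned int pkg, unsigned int die, uint32_t level);
};

/* Toy driver: maps each level onto a made-up uncore frequency in kHz. */
static const uint32_t toy_freqs_khz[] = { 2400000, 2000000, 1600000, 1200000 };
#define TOY_NUM_LEVELS (sizeof(toy_freqs_khz) / sizeof(toy_freqs_khz[0]))
static uint32_t toy_cur_khz;

static uint32_t
toy_get_num_levels(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return (uint32_t)TOY_NUM_LEVELS;
}

static int
toy_set_level(unsigned int pkg, unsigned int die, uint32_t level)
{
	(void)pkg;
	(void)die;
	if (level >= TOY_NUM_LEVELS)
		return -1;
	toy_cur_khz = toy_freqs_khz[level];
	return 0;
}

static const struct uncore_perf_ops toy_uncore_ops = {
	.get_num_levels = toy_get_num_levels,
	.set_level = toy_set_level,
};

/* Application side: pick the lowest level for maximum power savings. */
static int
uncore_select_lowest(const struct uncore_perf_ops *ops,
		unsigned int pkg, unsigned int die)
{
	uint32_t n = ops->get_num_levels(pkg, die);

	if (n == 0)
		return -1;
	return ops->set_level(pkg, die, n - 1);
}
```

With this shape the application never sees a frequency table; how a level maps to hardware settings stays inside each driver.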
> > +
> > +/**
> > + * Register power uncore frequency operations.
> > + * @param ops
> > + * Pointer to an ops structure to register.
> > + * @return
> > + * - >=0: Success; return the index of the ops struct in the table.
> > + * - -EINVAL - error while registering ops struct.
> > + */
> > +__rte_internal
> > +int rte_power_register_uncore_ops(const struct rte_power_uncore_ops
> > +*ops);
> > +
> > +/**
> > + * Macro to statically register the ops of an uncore driver.
> > + */
> > +#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
> > + (RTE_INIT(power_hdlr_init_uncore_##ops) \
> > + { \
> > + rte_power_register_uncore_ops(&ops); \
> > + })
> > +
> <...>
Thanks & Regards,
Sivaprasad
* Re: [RFC PATCH 1/2] power: refactor core power management library
2024-03-01 2:56 ` lihuisong (C)
@ 2024-03-01 10:39 ` Hunt, David
2024-03-05 4:35 ` Tummala, Sivaprasad
1 sibling, 0 replies; 139+ messages in thread
From: Hunt, David @ 2024-03-01 10:39 UTC (permalink / raw)
To: lihuisong (C),
Sivaprasad Tummala, anatoly.burakov, jerinj, radu.nicolau,
gakhil, cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
On 01/03/2024 02:56, lihuisong (C) wrote:
>
> On 2024/2/20 23:33, Sivaprasad Tummala wrote:
>> This patch introduces a comprehensive refactor to the core power
>> management library. The primary focus is on improving modularity
>> and organization by relocating specific driver implementations
>> from the 'lib/power' directory to dedicated directories within
>> 'drivers/power/core/*'. The adjustment of meson.build files
>> enables the selective activation of individual drivers.
>>
>> These changes contribute to a significant enhancement in code
>> organization, providing a clearer structure for driver implementations.
>> The refactor aims to improve overall code clarity and boost
>> maintainability. Additionally, it establishes a foundation for
>> future development, allowing for more focused work on individual
>> drivers and seamless integration of forthcoming enhancements.
>
> Good job. +1 to the refactor.
>
> <...>
>
Also a +1 from me, looks like a sensible re-organisation of the power code.
Regards,
Dave.
>> diff --git a/drivers/meson.build b/drivers/meson.build
>> index f2be71bc05..e293c3945f 100644
>> --- a/drivers/meson.build
>> +++ b/drivers/meson.build
>> @@ -28,6 +28,7 @@ subdirs = [
>> 'event', # depends on common, bus, mempool and net.
>> 'baseband', # depends on common and bus.
>> 'gpu', # depends on common and bus.
>> + 'power', # depends on common (in future).
>> ]
>> if meson.is_cross_build()
>> diff --git a/drivers/power/core/acpi/meson.build
>> b/drivers/power/core/acpi/meson.build
>> new file mode 100644
>> index 0000000000..d10ec8ee94
>> --- /dev/null
>> +++ b/drivers/power/core/acpi/meson.build
>> @@ -0,0 +1,8 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +sources = files('power_acpi_cpufreq.c')
>> +
>> +headers = files('power_acpi_cpufreq.h')
>> +
>> +deps += ['power']
>> diff --git a/lib/power/power_acpi_cpufreq.c
>> b/drivers/power/core/acpi/power_acpi_cpufreq.c
>> similarity index 95%
>> rename from lib/power/power_acpi_cpufreq.c
>> rename to drivers/power/core/acpi/power_acpi_cpufreq.c
> This file is in the power lib.
> How about removing the 'power' prefix from this file name?
> Like acpi_cpufreq.c, cppc_cpufreq.c.
>> index f8d978d03d..69d80ad2ae 100644
>> --- a/lib/power/power_acpi_cpufreq.c
>> +++ b/drivers/power/core/acpi/power_acpi_cpufreq.c
>> @@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int
>> lcore_id,
>> return 0;
>> }
>> +
>> +static struct rte_power_ops acpi_ops = {
> How about using one of the following structure names?
> "struct rte_power_cpufreq_ops" or "struct rte_power_core_ops"
> After all, we also have other power ops, like uncore, right?
>> + .init = power_acpi_cpufreq_init,
>> + .exit = power_acpi_cpufreq_exit,
>> + .check_env_support = power_acpi_cpufreq_check_supported,
>> + .get_avail_freqs = power_acpi_cpufreq_freqs,
>> + .get_freq = power_acpi_cpufreq_get_freq,
>> + .set_freq = power_acpi_cpufreq_set_freq,
>> + .freq_down = power_acpi_cpufreq_freq_down,
>> + .freq_up = power_acpi_cpufreq_freq_up,
>> + .freq_max = power_acpi_cpufreq_freq_max,
>> + .freq_min = power_acpi_cpufreq_freq_min,
>> + .turbo_status = power_acpi_turbo_status,
>> + .enable_turbo = power_acpi_enable_turbo,
>> + .disable_turbo = power_acpi_disable_turbo,
>> + .get_caps = power_acpi_get_capabilities
>> +};
>> +
>> +RTE_POWER_REGISTER_OPS(acpi_ops);
>> diff --git a/lib/power/power_acpi_cpufreq.h
>> b/drivers/power/core/acpi/power_acpi_cpufreq.h
>> similarity index 100%
>> rename from lib/power/power_acpi_cpufreq.h
>> rename to drivers/power/core/acpi/power_acpi_cpufreq.h
>> diff --git a/drivers/power/core/amd-pstate/meson.build
>> b/drivers/power/core/amd-pstate/meson.build
>> new file mode 100644
>> index 0000000000..8ec4c960f5
>> --- /dev/null
>> +++ b/drivers/power/core/amd-pstate/meson.build
>> @@ -0,0 +1,8 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +sources = files('power_amd_pstate_cpufreq.c')
>> +
>> +headers = files('power_amd_pstate_cpufreq.h')
>> +
>> +deps += ['power']
>> diff --git a/lib/power/power_amd_pstate_cpufreq.c
>> b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
>> similarity index 95%
>> rename from lib/power/power_amd_pstate_cpufreq.c
>> rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
>> index 028f84416b..9938de72a6 100644
>> --- a/lib/power/power_amd_pstate_cpufreq.c
>> +++ b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
>> @@ -700,3 +700,22 @@ power_amd_pstate_get_capabilities(unsigned int
>> lcore_id,
>> return 0;
>> }
>> +
>> +static struct rte_power_ops amd_pstate_ops = {
>> + .init = power_amd_pstate_cpufreq_init,
>> + .exit = power_amd_pstate_cpufreq_exit,
>> + .check_env_support = power_amd_pstate_cpufreq_check_supported,
>> + .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
>> + .get_freq = power_amd_pstate_cpufreq_get_freq,
>> + .set_freq = power_amd_pstate_cpufreq_set_freq,
>> + .freq_down = power_amd_pstate_cpufreq_freq_down,
>> + .freq_up = power_amd_pstate_cpufreq_freq_up,
>> + .freq_max = power_amd_pstate_cpufreq_freq_max,
>> + .freq_min = power_amd_pstate_cpufreq_freq_min,
>> + .turbo_status = power_amd_pstate_turbo_status,
>> + .enable_turbo = power_amd_pstate_enable_turbo,
>> + .disable_turbo = power_amd_pstate_disable_turbo,
>> + .get_caps = power_amd_pstate_get_capabilities
>> +};
>> +
>> +RTE_POWER_REGISTER_OPS(amd_pstate_ops);
>> diff --git a/lib/power/power_amd_pstate_cpufreq.h
>> b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
>> similarity index 100%
>> rename from lib/power/power_amd_pstate_cpufreq.h
>> rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
>> diff --git a/drivers/power/core/cppc/meson.build
>> b/drivers/power/core/cppc/meson.build
>> new file mode 100644
>> index 0000000000..06f3b99bb8
>> --- /dev/null
>> +++ b/drivers/power/core/cppc/meson.build
>> @@ -0,0 +1,8 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +sources = files('power_cppc_cpufreq.c')
>> +
>> +headers = files('power_cppc_cpufreq.h')
>> +
>> +deps += ['power']
>> diff --git a/lib/power/power_cppc_cpufreq.c
>> b/drivers/power/core/cppc/power_cppc_cpufreq.c
>> similarity index 96%
>> rename from lib/power/power_cppc_cpufreq.c
>> rename to drivers/power/core/cppc/power_cppc_cpufreq.c
>> index 3ddf39bd76..605f633309 100644
>> --- a/lib/power/power_cppc_cpufreq.c
>> +++ b/drivers/power/core/cppc/power_cppc_cpufreq.c
>> @@ -685,3 +685,22 @@ power_cppc_get_capabilities(unsigned int lcore_id,
>> return 0;
>> }
>> +
>> +static struct rte_power_ops cppc_ops = {
>> + .init = power_cppc_cpufreq_init,
>> + .exit = power_cppc_cpufreq_exit,
>> + .check_env_support = power_cppc_cpufreq_check_supported,
>> + .get_avail_freqs = power_cppc_cpufreq_freqs,
>> + .get_freq = power_cppc_cpufreq_get_freq,
>> + .set_freq = power_cppc_cpufreq_set_freq,
>> + .freq_down = power_cppc_cpufreq_freq_down,
>> + .freq_up = power_cppc_cpufreq_freq_up,
>> + .freq_max = power_cppc_cpufreq_freq_max,
>> + .freq_min = power_cppc_cpufreq_freq_min,
>> + .turbo_status = power_cppc_turbo_status,
>> + .enable_turbo = power_cppc_enable_turbo,
>> + .disable_turbo = power_cppc_disable_turbo,
>> + .get_caps = power_cppc_get_capabilities
>> +};
>> +
>> +RTE_POWER_REGISTER_OPS(cppc_ops);
>> diff --git a/lib/power/power_cppc_cpufreq.h
>> b/drivers/power/core/cppc/power_cppc_cpufreq.h
>> similarity index 100%
>> rename from lib/power/power_cppc_cpufreq.h
>> rename to drivers/power/core/cppc/power_cppc_cpufreq.h
>> diff --git a/lib/power/guest_channel.c
>> b/drivers/power/core/kvm-vm/guest_channel.c
>> similarity index 100%
>> rename from lib/power/guest_channel.c
>> rename to drivers/power/core/kvm-vm/guest_channel.c
>> diff --git a/lib/power/guest_channel.h
>> b/drivers/power/core/kvm-vm/guest_channel.h
>> similarity index 100%
>> rename from lib/power/guest_channel.h
>> rename to drivers/power/core/kvm-vm/guest_channel.h
>> diff --git a/drivers/power/core/kvm-vm/meson.build
>> b/drivers/power/core/kvm-vm/meson.build
>> new file mode 100644
>> index 0000000000..3150c6674b
>> --- /dev/null
>> +++ b/drivers/power/core/kvm-vm/meson.build
>> @@ -0,0 +1,20 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(C) 2024 AMD Limited.
>> +#
>> +
>> +if not is_linux
>> + build = false
>> + reason = 'only supported on Linux'
>> + subdir_done()
>> +endif
>> +
>> +sources = files(
>> + 'guest_channel.c',
>> + 'power_kvm_vm.c',
>> +)
>> +
>> +headers = files(
>> + 'guest_channel.h',
>> + 'power_kvm_vm.h',
>> +)
>> +deps += ['power']
>> diff --git a/lib/power/power_kvm_vm.c
>> b/drivers/power/core/kvm-vm/power_kvm_vm.c
>> similarity index 83%
>> rename from lib/power/power_kvm_vm.c
>> rename to drivers/power/core/kvm-vm/power_kvm_vm.c
>> index f15be8fac5..a5d6984d26 100644
>> --- a/lib/power/power_kvm_vm.c
>> +++ b/drivers/power/core/kvm-vm/power_kvm_vm.c
>> @@ -137,3 +137,22 @@ int power_kvm_vm_get_capabilities(__rte_unused
>> unsigned int lcore_id,
>> POWER_LOG(ERR, "rte_power_get_capabilities is not implemented
>> for Virtual Machine Power Management");
>> return -ENOTSUP;
>> }
>> +
>> +static struct rte_power_ops kvm_vm_ops = {
>> + .init = power_kvm_vm_init,
>> + .exit = power_kvm_vm_exit,
>> + .check_env_support = power_kvm_vm_check_supported,
>> + .get_avail_freqs = power_kvm_vm_freqs,
>> + .get_freq = power_kvm_vm_get_freq,
>> + .set_freq = power_kvm_vm_set_freq,
>> + .freq_down = power_kvm_vm_freq_down,
>> + .freq_up = power_kvm_vm_freq_up,
>> + .freq_max = power_kvm_vm_freq_max,
>> + .freq_min = power_kvm_vm_freq_min,
>> + .turbo_status = power_kvm_vm_turbo_status,
>> + .enable_turbo = power_kvm_vm_enable_turbo,
>> + .disable_turbo = power_kvm_vm_disable_turbo,
>> + .get_caps = power_kvm_vm_get_capabilities
>> +};
>> +
>> +RTE_POWER_REGISTER_OPS(kvm_vm_ops);
>> diff --git a/lib/power/power_kvm_vm.h
>> b/drivers/power/core/kvm-vm/power_kvm_vm.h
>> similarity index 100%
>> rename from lib/power/power_kvm_vm.h
>> rename to drivers/power/core/kvm-vm/power_kvm_vm.h
>> diff --git a/drivers/power/core/meson.build
>> b/drivers/power/core/meson.build
>> new file mode 100644
>> index 0000000000..4081dafaa0
>> --- /dev/null
>> +++ b/drivers/power/core/meson.build
>> @@ -0,0 +1,12 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +drivers = [
>> + 'acpi',
>> + 'amd-pstate',
>> + 'cppc',
>> + 'kvm-vm',
>> + 'pstate'
>> +]
>> +
>> +std_deps = ['power']
>> diff --git a/drivers/power/core/pstate/meson.build
>> b/drivers/power/core/pstate/meson.build
>> new file mode 100644
>> index 0000000000..1025c64e48
>> --- /dev/null
>> +++ b/drivers/power/core/pstate/meson.build
>> @@ -0,0 +1,8 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +sources = files('power_pstate_cpufreq.c')
>> +
>> +headers = files('power_pstate_cpufreq.h')
>> +
>> +deps += ['power']
>> diff --git a/lib/power/power_pstate_cpufreq.c
>> b/drivers/power/core/pstate/power_pstate_cpufreq.c
>> similarity index 96%
>> rename from lib/power/power_pstate_cpufreq.c
>> rename to drivers/power/core/pstate/power_pstate_cpufreq.c
>> index 73138dc4e4..d4c3645ff8 100644
>> --- a/lib/power/power_pstate_cpufreq.c
>> +++ b/drivers/power/core/pstate/power_pstate_cpufreq.c
>> @@ -888,3 +888,22 @@ int power_pstate_get_capabilities(unsigned int
>> lcore_id,
>> return 0;
>> }
>> +
>> +static struct rte_power_ops pstate_ops = {
>> + .init = power_pstate_cpufreq_init,
>> + .exit = power_pstate_cpufreq_exit,
>> + .check_env_support = power_pstate_cpufreq_check_supported,
>> + .get_avail_freqs = power_pstate_cpufreq_freqs,
>> + .get_freq = power_pstate_cpufreq_get_freq,
>> + .set_freq = power_pstate_cpufreq_set_freq,
>> + .freq_down = power_pstate_cpufreq_freq_down,
>> + .freq_up = power_pstate_cpufreq_freq_up,
>> + .freq_max = power_pstate_cpufreq_freq_max,
>> + .freq_min = power_pstate_cpufreq_freq_min,
>> + .turbo_status = power_pstate_turbo_status,
>> + .enable_turbo = power_pstate_enable_turbo,
>> + .disable_turbo = power_pstate_disable_turbo,
>> + .get_caps = power_pstate_get_capabilities
>> +};
>> +
>> +RTE_POWER_REGISTER_OPS(pstate_ops);
>> diff --git a/lib/power/power_pstate_cpufreq.h
>> b/drivers/power/core/pstate/power_pstate_cpufreq.h
>> similarity index 100%
>> rename from lib/power/power_pstate_cpufreq.h
>> rename to drivers/power/core/pstate/power_pstate_cpufreq.h
>> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
>> new file mode 100644
>> index 0000000000..7d9034c7ac
>> --- /dev/null
>> +++ b/drivers/power/meson.build
>> @@ -0,0 +1,8 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2024 AMD Limited
>> +
>> +drivers = [
>> + 'core',
>> +]
>> +
>> +std_deps = ['power']
>> diff --git a/lib/power/meson.build b/lib/power/meson.build
>> index b8426589b2..207d96d877 100644
>> --- a/lib/power/meson.build
>> +++ b/lib/power/meson.build
>> @@ -12,14 +12,8 @@ if not is_linux
>> reason = 'only supported on Linux'
>> endif
>> sources = files(
>> - 'guest_channel.c',
>> - 'power_acpi_cpufreq.c',
>> - 'power_amd_pstate_cpufreq.c',
>> 'power_common.c',
>> - 'power_cppc_cpufreq.c',
>> - 'power_kvm_vm.c',
>> 'power_intel_uncore.c',
>> - 'power_pstate_cpufreq.c',
>> 'rte_power.c',
>> 'rte_power_uncore.c',
>> 'rte_power_pmd_mgmt.c',
>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
>> index 30966400ba..c90b611f4f 100644
>> --- a/lib/power/power_common.h
>> +++ b/lib/power/power_common.h
>> @@ -23,13 +23,24 @@ extern int power_logtype;
>> #endif
>> /* check if scaling driver matches one we want */
>> +__rte_internal
>> int cpufreq_check_scaling_driver(const char *driver);
>> +
>> +__rte_internal
>> int power_set_governor(unsigned int lcore_id, const char
>> *new_governor,
>> char *orig_governor, size_t orig_governor_len);
>> +
>> +__rte_internal
>> int open_core_sysfs_file(FILE **f, const char *mode, const char
>> *format, ...)
>> __rte_format_printf(3, 4);
>> +
>> +__rte_internal
>> int read_core_sysfs_u32(FILE *f, uint32_t *val);
>> +
>> +__rte_internal
>> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
>> +
>> +__rte_internal
>> int write_core_sysfs_s(FILE *f, const char *str);
>> #endif /* _POWER_COMMON_H_ */
>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>> index 36c3f3da98..70176807f4 100644
>> --- a/lib/power/rte_power.c
>> +++ b/lib/power/rte_power.c
>> @@ -8,64 +8,80 @@
>> #include <rte_spinlock.h>
>> #include "rte_power.h"
>> -#include "power_acpi_cpufreq.h"
>> -#include "power_cppc_cpufreq.h"
>> #include "power_common.h"
>> -#include "power_kvm_vm.h"
>> -#include "power_pstate_cpufreq.h"
>> -#include "power_amd_pstate_cpufreq.h"
>> enum power_management_env global_default_env = PM_ENV_NOT_SET;
> Use a pointer to hold the current power cpufreq ops?
>> static rte_spinlock_t global_env_cfg_lock =
>> RTE_SPINLOCK_INITIALIZER;
>> +static struct rte_power_ops rte_power_ops[PM_ENV_MAX];
>> -/* function pointers */
>> -rte_power_freqs_t rte_power_freqs = NULL;
>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>> -rte_power_freq_change_t rte_power_turbo_status;
>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>> -
>> -static void
>> -reset_power_function_ptrs(void)
>> +/* register the ops struct in rte_power_ops, return 0 on success. */
>> +int
>> +rte_power_register_ops(const struct rte_power_ops *op)
>> +{
>> + struct rte_power_ops *ops;
>> +
>> + if (op->env >= PM_ENV_MAX) {
>> + POWER_LOG(ERR, "Unsupported power management environment\n");
>> + return -EINVAL;
>> + }
>> +
>> + if (op->status != 0) {
>> + POWER_LOG(ERR, "Power management env[%d] ops registered
>> already\n",
>> + op->env);
>> + return -EINVAL;
>> + }
>> +
>> + if (!op->init || !op->exit || !op->check_env_support ||
>> + !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
>> + !op->freq_up || !op->freq_down || !op->freq_max ||
>> + !op->freq_min || !op->turbo_status || !op->enable_turbo ||
>> + !op->disable_turbo || !op->get_caps) {
>> + POWER_LOG(ERR, "Missing callbacks while registering power
>> ops\n");
>> + return -EINVAL;
>> + }
>> +
>> + ops = &rte_power_ops[op->env];
> It is better to use a global linked list instead of an array.
> And we should extract a list structure including this ops structure
> and this ops's owner.
>> + ops->env = op->env;
>> + ops->init = op->init;
>> + ops->exit = op->exit;
>> + ops->check_env_support = op->check_env_support;
>> + ops->get_avail_freqs = op->get_avail_freqs;
>> + ops->get_freq = op->get_freq;
>> + ops->set_freq = op->set_freq;
>> + ops->freq_up = op->freq_up;
>> + ops->freq_down = op->freq_down;
>> + ops->freq_max = op->freq_max;
>> + ops->freq_min = op->freq_min;
>> + ops->turbo_status = op->turbo_status;
>> + ops->enable_turbo = op->enable_turbo;
>> + ops->disable_turbo = op->disable_turbo;
> *ops = *op?
>> + ops->status = 1; /* registered */
> status --> registered?
> But if an ops linked list is used, this flag can also be removed.
>> +
>> + return 0;
>> +}
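The `*ops = *op` suggestion above can be sketched with a cut-down ops structure. The names below (`struct power_ops`, `power_register_ops`, `ops_table`) are illustrative stand-ins, not the DPDK symbols. Note that the field-by-field copy in the quoted hunk does not appear to copy `get_caps` even though the validation requires it, which is the kind of omission a struct assignment avoids. As a further assumption, this sketch checks the "already registered" flag on the table slot rather than on the caller's structure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum pm_env { ENV_NOT_SET, ENV_ACPI, ENV_PSTATE, ENV_MAX };

/* Cut-down stand-in for the ops structure; only two callbacks are shown. */
struct power_ops {
	uint8_t registered;	/* set by the registry, not by the driver */
	enum pm_env env;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

static struct power_ops ops_table[ENV_MAX];

static int
power_register_ops(const struct power_ops *op)
{
	struct power_ops *slot;

	if (op == NULL || op->env == ENV_NOT_SET || op->env >= ENV_MAX)
		return -EINVAL;
	if (op->init == NULL || op->exit == NULL)
		return -EINVAL;

	slot = &ops_table[op->env];
	if (slot->registered)	/* check the table slot, not the caller's copy */
		return -EINVAL;

	*slot = *op;		/* struct assignment copies every callback */
	slot->registered = 1;
	return 0;
}

static int demo_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }
```

A struct assignment keeps the copy correct automatically when a new callback is added to the structure.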
>> +
>> +struct rte_power_ops *
>> +rte_power_get_ops(int ops_index)
> AFAICS, there is only one cpufreq driver per platform, and the user has just
> one power_cpufreq_ops to use.
> The user doesn't need to get other power ops; they only want to know
> which power ops is currently in use, right?
> So using an 'index' to get this ops is not good.
>> {
>> - rte_power_freqs = NULL;
>> - rte_power_get_freq = NULL;
>> - rte_power_set_freq = NULL;
>> - rte_power_freq_up = NULL;
>> - rte_power_freq_down = NULL;
>> - rte_power_freq_max = NULL;
>> - rte_power_freq_min = NULL;
>> - rte_power_turbo_status = NULL;
>> - rte_power_freq_enable_turbo = NULL;
>> - rte_power_freq_disable_turbo = NULL;
>> - rte_power_get_capabilities = NULL;
>> + RTE_VERIFY((ops_index >= PM_ENV_NOT_SET) && (ops_index <
>> PM_ENV_MAX));
>> + RTE_VERIFY(rte_power_ops[ops_index].status != 0);
>> +
>> + return &rte_power_ops[ops_index];
>> }
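One way to read the "current ops" comments above is a single pointer to the selected driver instead of an index into a table. The following is a minimal sketch under that assumption; `struct cpufreq_ops`, `power_select_ops` and `power_init_lcore` are hypothetical names, not the library's API.

```c
#include <stddef.h>

/* Cut-down, hypothetical ops structure with a single callback. */
struct cpufreq_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

/* Points at the driver in use; NULL until one has been selected. */
static const struct cpufreq_ops *current_ops;

static void
power_select_ops(const struct cpufreq_ops *ops)
{
	current_ops = ops;
}

/* Dispatch through the current driver, failing cleanly if none is set. */
static int
power_init_lcore(unsigned int lcore_id)
{
	if (current_ops == NULL)
		return -1;
	return current_ops->init(lcore_id);
}

/* Demo driver that records which lcore it was asked to initialise. */
static unsigned int demo_last_lcore;

static int
demo_init(unsigned int lcore_id)
{
	demo_last_lcore = lcore_id;
	return 0;
}

static const struct cpufreq_ops demo_ops = { .name = "demo", .init = demo_init };
```

Callers then need no index validation at all; the only state to check is whether the pointer is NULL.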
>> int
>> rte_power_check_env_supported(enum power_management_env env)
>> {
>> - switch (env) {
>> - case PM_ENV_ACPI_CPUFREQ:
>> - return power_acpi_cpufreq_check_supported();
>> - case PM_ENV_PSTATE_CPUFREQ:
>> - return power_pstate_cpufreq_check_supported();
>> - case PM_ENV_KVM_VM:
>> - return power_kvm_vm_check_supported();
>> - case PM_ENV_CPPC_CPUFREQ:
>> - return power_cppc_cpufreq_check_supported();
>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>> - return power_amd_pstate_cpufreq_check_supported();
>> - default:
>> - rte_errno = EINVAL;
>> - return -1;
>> + struct rte_power_ops *ops;
>> +
>> + if ((env > PM_ENV_NOT_SET) && (env < PM_ENV_MAX)) {
>> + ops = rte_power_get_ops(env);
>> + return ops->check_env_support();
>> }
>> +
>> + rte_errno = EINVAL;
>> + return -1;
>> }
>> int
>> @@ -80,80 +96,26 @@ rte_power_set_env(enum power_management_env env)
>> }
>> int ret = 0;
>> + struct rte_power_ops *ops;
>> +
>> + if ((env == PM_ENV_NOT_SET) || (env >= PM_ENV_MAX)) {
>> + POWER_LOG(ERR, "Invalid Power Management Environment(%d)"
>> + " set\n", env);
>> + ret = -1;
>> + }
> <...>
>> + ops = rte_power_get_ops(env);
> To find the target ops from the global list according to the env?
>> + if (ops->status == 0) {
>> + POWER_LOG(ERR, WER,
>> + "Power Management Environment(%d) not"
>> + " registered\n", env);
>> ret = -1;
>> }
>> if (ret == 0)
>> global_default_env = env;
> It is more convenient to use a global variable to point to the default
> power_cpufreq ops or its list node.
>> - else {
>> + else
>> global_default_env = PM_ENV_NOT_SET;
>> - reset_power_function_ptrs();
>> - }
>> rte_spinlock_unlock(&global_env_cfg_lock);
>> return ret;
>> @@ -164,7 +126,6 @@ rte_power_unset_env(void)
>> {
>> rte_spinlock_lock(&global_env_cfg_lock);
>> global_default_env = PM_ENV_NOT_SET;
>> - reset_power_function_ptrs();
>> rte_spinlock_unlock(&global_env_cfg_lock);
>> }
>> @@ -177,59 +138,76 @@ int
>> rte_power_init(unsigned int lcore_id)
>> {
>> int ret = -1;
>> + struct rte_power_ops *ops;
>> - switch (global_default_env) {
>> - case PM_ENV_ACPI_CPUFREQ:
>> - return power_acpi_cpufreq_init(lcore_id);
>> - case PM_ENV_KVM_VM:
>> - return power_kvm_vm_init(lcore_id);
>> - case PM_ENV_PSTATE_CPUFREQ:
>> - return power_pstate_cpufreq_init(lcore_id);
>> - case PM_ENV_CPPC_CPUFREQ:
>> - return power_cppc_cpufreq_init(lcore_id);
>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>> - return power_amd_pstate_cpufreq_init(lcore_id);
>> - default:
>> - POWER_LOG(INFO, "Env isn't set yet!");
>> + if (global_default_env != PM_ENV_NOT_SET) {
>> + ops = &rte_power_ops[global_default_env];
>> + if (!ops->status) {
>> + POWER_LOG(ERR, "Power management env[%d] not"
>> + " supported\n", global_default_env);
>> + goto out;
>> + }
>> + return ops->init(lcore_id);
>> }
>> + POWER_LOG(INFO, POWER, "Env isn't set yet!\n");
>> /* Auto detect Environment */
>> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
>> management...");
>> - ret = power_acpi_cpufreq_init(lcore_id);
>> - if (ret == 0) {
>> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
>> - goto out;
>> + POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
>> + " power management...\n");
>> + ops = &rte_power_ops[PM_ENV_ACPI_CPUFREQ];
>> + if (ops->status) {
>> + ret = ops->init(lcore_id);
>> + if (ret == 0) {
>> + rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
>> + goto out;
>> + }
>> }
>> - POWER_LOG(INFO, "Attempting to initialise PSTAT power
>> management...");
>> - ret = power_pstate_cpufreq_init(lcore_id);
>> - if (ret == 0) {
>> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
>> - goto out;
>> + POWER_LOG(INFO, "Attempting to initialise PSTAT"
>> + " power management...\n");
>> + ops = &rte_power_ops[PM_ENV_PSTATE_CPUFREQ];
>> + if (ops->status) {
>> + ret = ops->init(lcore_id);
>> + if (ret == 0) {
>> + rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
>> + goto out;
>> + }
>> }
>> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
>> management...");
>> - ret = power_amd_pstate_cpufreq_init(lcore_id);
>> - if (ret == 0) {
>> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
>> - goto out;
>> + POWER_LOG(INFO, "Attempting to initialise AMD PSTATE"
>> + " power management...\n");
>> + ops = &rte_power_ops[PM_ENV_AMD_PSTATE_CPUFREQ];
>> + if (ops->status) {
>> + ret = ops->init(lcore_id);
>> + if (ret == 0) {
>> + rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
>> + goto out;
>> + }
>> }
>> - POWER_LOG(INFO, "Attempting to initialise CPPC power
>> management...");
>> - ret = power_cppc_cpufreq_init(lcore_id);
>> - if (ret == 0) {
>> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
>> - goto out;
>> + POWER_LOG(INFO, "Attempting to initialise CPPC power"
>> + " management...\n");
>> + ops = &rte_power_ops[PM_ENV_CPPC_CPUFREQ];
>> + if (ops->status) {
>> + ret = ops->init(lcore_id);
>> + if (ret == 0) {
>> + rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
>> + goto out;
>> + }
>> }
>> - POWER_LOG(INFO, "Attempting to initialise VM power
>> management...");
>> - ret = power_kvm_vm_init(lcore_id);
>> - if (ret == 0) {
>> - rte_power_set_env(PM_ENV_KVM_VM);
>> - goto out;
>> + POWER_LOG(INFO, "Attempting to initialise VM power"
>> + " management...\n");
>> + ops = &rte_power_ops[PM_ENV_KVM_VM];
>> + if (ops->status) {
>> + ret = ops->init(lcore_id);
>> + if (ret == 0) {
>> + rte_power_set_env(PM_ENV_KVM_VM);
>> + goto out;
>> + }
>> }
> If we use a linked list, the above code can be simplified like this:
> ->
> for_each_power_cpufreq_ops(ops, ...) {
> ret = ops->init()
> if (ret) {
> ....
> }
> }
>> - POWER_LOG(ERR, "Unable to set Power Management Environment for
>> lcore "
>> - "%u", lcore_id);
>> + POWER_LOG(ERR, "Unable to set Power Management Environment"
>> + " for lcore %u\n", lcore_id);
>> out:
>> return ret;
>> }
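The linked-list auto-detect loop suggested above might look roughly like this. It is a sketch, not the reviewer's or the author's implementation: the intrusive `next` link, `FOREACH_CPUFREQ_OPS` and the demo drivers are assumptions made for illustration, and `init()` is assumed to return 0 on success.

```c
#include <stddef.h>

struct cpufreq_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);	/* 0 on success */
	struct cpufreq_ops *next;		/* intrusive list link */
};

static struct cpufreq_ops *ops_head;
static struct cpufreq_ops **ops_tail = &ops_head;

/* Append, so that probe order matches registration order. */
static void
power_register(struct cpufreq_ops *ops)
{
	ops->next = NULL;
	*ops_tail = ops;
	ops_tail = &ops->next;
}

#define FOREACH_CPUFREQ_OPS(it) \
	for ((it) = ops_head; (it) != NULL; (it) = (it)->next)

/* Return the first driver whose init() succeeds, or NULL if none does. */
static struct cpufreq_ops *
power_autodetect(unsigned int lcore_id)
{
	struct cpufreq_ops *it;

	FOREACH_CPUFREQ_OPS(it) {
		if (it->init(lcore_id) == 0)
			return it;
	}
	return NULL;
}

static int init_fail(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct cpufreq_ops acpi_demo = { .name = "acpi", .init = init_fail };
static struct cpufreq_ops pstate_demo = { .name = "pstate", .init = init_ok };
```

Compared with the five near-identical blocks in the hunk, adding a driver here requires no change to the detection loop.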
>> @@ -237,21 +215,14 @@ rte_power_init(unsigned int lcore_id)
>> int
>> rte_power_exit(unsigned int lcore_id)
>> {
>> - switch (global_default_env) {
>> - case PM_ENV_ACPI_CPUFREQ:
>> - return power_acpi_cpufreq_exit(lcore_id);
>> - case PM_ENV_KVM_VM:
>> - return power_kvm_vm_exit(lcore_id);
>> - case PM_ENV_PSTATE_CPUFREQ:
>> - return power_pstate_cpufreq_exit(lcore_id);
>> - case PM_ENV_CPPC_CPUFREQ:
>> - return power_cppc_cpufreq_exit(lcore_id);
>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>> - return power_amd_pstate_cpufreq_exit(lcore_id);
>> - default:
>> - POWER_LOG(ERR, "Environment has not been set, unable to exit
>> gracefully");
>> + struct rte_power_ops *ops;
>> + if (global_default_env != PM_ENV_NOT_SET) {
>> + ops = &rte_power_ops[global_default_env];
>> + return ops->exit(lcore_id);
>> }
>> - return -1;
>> + POWER_LOG(ERR, "Environment has not been set, unable "
>> + "to exit gracefully\n");
>> + return -1;
>> }
>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
>> index 4fa4afe399..749bb823ab 100644
>> --- a/lib/power/rte_power.h
>> +++ b/lib/power/rte_power.h
>> @@ -1,5 +1,6 @@
>> /* SPDX-License-Identifier: BSD-3-Clause
>> * Copyright(c) 2010-2014 Intel Corporation
>> + * Copyright(c) 2024 AMD Limited
>> */
>> #ifndef _RTE_POWER_H
>> @@ -21,7 +22,7 @@ extern "C" {
>> /* Power Management Environment State */
>> enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ,
>> PM_ENV_KVM_VM,
>> PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
>> - PM_ENV_AMD_PSTATE_CPUFREQ};
>> + PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
> "enum power_management_env" is not good. may be like "enum
> power_cpufreq_driver_type"?
> In the linked list structure to be defined, it may be better to directly use a
> string name instead of a fixed enum,
> because the new "PM_ENV_MAX" will break the ABI when a new
> cpufreq driver is added.
>> /**
>> * Check if a specific power management environment type is
>> supported on a
>> @@ -66,6 +67,97 @@ void rte_power_unset_env(void);
>> */
>> enum power_management_env rte_power_get_env(void);
>> +typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
>> +typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
>> +typedef int (*rte_power_check_env_support_t)(void);
>> +
>> +typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
>> uint32_t *freqs,
>> + uint32_t num);
>> +typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
>> +typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t
>> index);
>> +typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
>> +
>> +/**
>> + * Function pointer definition for generic frequency change
>> functions. Review
>> + * each environments specific documentation for usage.
>> + *
>> + * @param lcore_id
>> + * lcore id.
>> + *
>> + * @return
>> + * - 1 on success with frequency changed.
>> + * - 0 on success without frequency changed.
>> + * - Negative on error.
>> + */
>> +
>> +/**
>> + * Power capabilities summary.
>> + */
>> +struct rte_power_core_capabilities {
>> + union {
>> + uint64_t capabilities;
>> + struct {
>> + uint64_t turbo:1; /**< Turbo can be enabled. */
>> + uint64_t priority:1; /**< SST-BF high freq core */
>> + };
>> + };
>> +};
>> +
>> +typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
>> + struct rte_power_core_capabilities *caps);
>> +
>> +/** Structure defining core power operations structure */
>> +struct rte_power_ops {
>> +uint8_t status; /**< ops register status. */
>> + enum power_management_env env; /**< power mgmt env. */
>> + rte_power_cpufreq_init_t init; /**< Initialize power
>> management. */
>> + rte_power_cpufreq_exit_t exit; /**< Exit power management. */
>> + rte_power_check_env_support_t check_env_support; /**< verify env
>> is supported. */
>> + rte_power_freqs_t get_avail_freqs; /**< Get the available
>> frequencies. */
>> + rte_power_get_freq_t get_freq; /**< Get frequency index. */
>> + rte_power_set_freq_t set_freq; /**< Set frequency index. */
>> + rte_power_freq_change_t freq_up; /**< Scale up frequency. */
>> + rte_power_freq_change_t freq_down; /**< Scale down frequency. */
>> + rte_power_freq_change_t freq_max; /**< Scale up frequency to
>> highest. */
>> + rte_power_freq_change_t freq_min; /**< Scale up frequency to
>> lowest. */
>> + rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
>> + rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
>> + rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
>> + rte_power_get_capabilities_t get_caps; /**< power capabilities. */
>> +} __rte_cache_aligned;
> I suggest restructuring this, for example:
> struct rte_power_cpufreq_list {
> char name[]; // like "cppc_cpufreq", "pstate_cpufreq"
> struct rte_power_cpufreq *ops;
> struct rte_power_cpufreq_list *node;
> }
>> +
>> +/**
>> + * Register power cpu frequency operations.
>> + *
>> + * @param ops
>> + * Pointer to an ops structure to register.
>> + * @return
>> + * - >=0: Success; return the index of the ops struct in the table.
>> + * - -EINVAL - error while registering ops struct.
>> + */
>> +__rte_internal
>> +int rte_power_register_ops(const struct rte_power_ops *ops);
>> +
>> +/**
>> + * Macro to statically register the ops of a cpufreq driver.
>> + */
>> +#define RTE_POWER_REGISTER_OPS(ops) \
>> + (RTE_INIT(power_hdlr_init_##ops) \
>> + { \
>> + rte_power_register_ops(&ops); \
>> + })
>> +
>> +/**
>> + * @internal Get the power ops struct from its index.
>> + *
>> + * @param ops_index
>> + * The index of the ops struct in the ops struct table.
>> + * @return
>> + * The pointer to the ops struct in the table if registered.
>> + */
>> +struct rte_power_ops *
>> +rte_power_get_ops(int ops_index);
>> +
>> /**
>> * Initialize power management for a specific lcore. If
>> rte_power_set_env() has
>> * not been called then an auto-detect of the environment will
>> start and
>> @@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
>> * @return
>> * The number of available frequencies.
>> */
>> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
>> uint32_t *freqs,
>> - uint32_t num);
>> +static inline uint32_t
>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
>> +{
>> + struct rte_power_ops *ops;
>> -extern rte_power_freqs_t rte_power_freqs;
>> + ops = rte_power_get_ops(rte_power_get_env());
>> + return ops->get_avail_freqs(lcore_id, freqs, n);
>> +}
> nice.
> <...>
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [RFC PATCH 1/2] power: refactor core power management library
2024-03-01 2:56 ` lihuisong (C)
2024-03-01 10:39 ` Hunt, David
@ 2024-03-05 4:35 ` Tummala, Sivaprasad
1 sibling, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-03-05 4:35 UTC (permalink / raw)
To: lihuisong (C),
david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh, konstantin.ananyev
Cc: dev
Hi Lihuisong,
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Friday, March 1, 2024 8:27 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com
> Cc: dev@dpdk.org
> Subject: Re: [RFC PATCH 1/2] power: refactor core power management library
>
>
> On 2024/2/20 23:33, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> >
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
>
> Good job. +1 to the refactor.
>
> <...>
>
> > diff --git a/drivers/meson.build b/drivers/meson.build index
> > f2be71bc05..e293c3945f 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -28,6 +28,7 @@ subdirs = [
> > 'event', # depends on common, bus, mempool and net.
> > 'baseband', # depends on common and bus.
> > 'gpu', # depends on common and bus.
> > + 'power', # depends on common (in future).
> > ]
> >
> > if meson.is_cross_build()
> > diff --git a/drivers/power/core/acpi/meson.build
> > b/drivers/power/core/acpi/meson.build
> > new file mode 100644
> > index 0000000000..d10ec8ee94
> > --- /dev/null
> > +++ b/drivers/power/core/acpi/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +sources = files('power_acpi_cpufreq.c')
> > +
> > +headers = files('power_acpi_cpufreq.h')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_acpi_cpufreq.c
> > b/drivers/power/core/acpi/power_acpi_cpufreq.c
> > similarity index 95%
> > rename from lib/power/power_acpi_cpufreq.c rename to
> > drivers/power/core/acpi/power_acpi_cpufreq.c
> This file is in the power lib.
> How about removing the 'power' prefix from this file name,
> e.g. acpi_cpufreq.c, cppc_cpufreq.c?
ACK
> > index f8d978d03d..69d80ad2ae 100644
> > --- a/lib/power/power_acpi_cpufreq.c
> > +++ b/drivers/power/core/acpi/power_acpi_cpufreq.c
> > @@ -577,3 +577,22 @@ int power_acpi_get_capabilities(unsigned int
> > lcore_id,
> >
> > return 0;
> > }
> > +
> > +static struct rte_power_ops acpi_ops = {
> How about using one of the following structure names?
> "struct rte_power_cpufreq_ops" or "struct rte_power_core_ops"
> After all, we also have other power ops, like uncore, right?
Agreed.
> > + .init = power_acpi_cpufreq_init,
> > + .exit = power_acpi_cpufreq_exit,
> > + .check_env_support = power_acpi_cpufreq_check_supported,
> > + .get_avail_freqs = power_acpi_cpufreq_freqs,
> > + .get_freq = power_acpi_cpufreq_get_freq,
> > + .set_freq = power_acpi_cpufreq_set_freq,
> > + .freq_down = power_acpi_cpufreq_freq_down,
> > + .freq_up = power_acpi_cpufreq_freq_up,
> > + .freq_max = power_acpi_cpufreq_freq_max,
> > + .freq_min = power_acpi_cpufreq_freq_min,
> > + .turbo_status = power_acpi_turbo_status,
> > + .enable_turbo = power_acpi_enable_turbo,
> > + .disable_turbo = power_acpi_disable_turbo,
> > + .get_caps = power_acpi_get_capabilities };
> > +
> > +RTE_POWER_REGISTER_OPS(acpi_ops);
> > diff --git a/lib/power/power_acpi_cpufreq.h
> > b/drivers/power/core/acpi/power_acpi_cpufreq.h
> > similarity index 100%
> > rename from lib/power/power_acpi_cpufreq.h rename to
> > drivers/power/core/acpi/power_acpi_cpufreq.h
> > diff --git a/drivers/power/core/amd-pstate/meson.build
> > b/drivers/power/core/amd-pstate/meson.build
> > new file mode 100644
> > index 0000000000..8ec4c960f5
> > --- /dev/null
> > +++ b/drivers/power/core/amd-pstate/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +sources = files('power_amd_pstate_cpufreq.c')
> > +
> > +headers = files('power_amd_pstate_cpufreq.h')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_amd_pstate_cpufreq.c
> > b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> > similarity index 95%
> > rename from lib/power/power_amd_pstate_cpufreq.c
> > rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> > index 028f84416b..9938de72a6 100644
> > --- a/lib/power/power_amd_pstate_cpufreq.c
> > +++ b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.c
> > @@ -700,3 +700,22 @@ power_amd_pstate_get_capabilities(unsigned int
> > lcore_id,
> >
> > return 0;
> > }
> > +
> > +static struct rte_power_ops amd_pstate_ops = {
> > + .init = power_amd_pstate_cpufreq_init,
> > + .exit = power_amd_pstate_cpufreq_exit,
> > + .check_env_support = power_amd_pstate_cpufreq_check_supported,
> > + .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
> > + .get_freq = power_amd_pstate_cpufreq_get_freq,
> > + .set_freq = power_amd_pstate_cpufreq_set_freq,
> > + .freq_down = power_amd_pstate_cpufreq_freq_down,
> > + .freq_up = power_amd_pstate_cpufreq_freq_up,
> > + .freq_max = power_amd_pstate_cpufreq_freq_max,
> > + .freq_min = power_amd_pstate_cpufreq_freq_min,
> > + .turbo_status = power_amd_pstate_turbo_status,
> > + .enable_turbo = power_amd_pstate_enable_turbo,
> > + .disable_turbo = power_amd_pstate_disable_turbo,
> > + .get_caps = power_amd_pstate_get_capabilities };
> > +
> > +RTE_POWER_REGISTER_OPS(amd_pstate_ops);
> > diff --git a/lib/power/power_amd_pstate_cpufreq.h
> > b/drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
> > similarity index 100%
> > rename from lib/power/power_amd_pstate_cpufreq.h
> > rename to drivers/power/core/amd-pstate/power_amd_pstate_cpufreq.h
> > diff --git a/drivers/power/core/cppc/meson.build
> > b/drivers/power/core/cppc/meson.build
> > new file mode 100644
> > index 0000000000..06f3b99bb8
> > --- /dev/null
> > +++ b/drivers/power/core/cppc/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +sources = files('power_cppc_cpufreq.c')
> > +
> > +headers = files('power_cppc_cpufreq.h')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_cppc_cpufreq.c
> > b/drivers/power/core/cppc/power_cppc_cpufreq.c
> > similarity index 96%
> > rename from lib/power/power_cppc_cpufreq.c rename to
> > drivers/power/core/cppc/power_cppc_cpufreq.c
> > index 3ddf39bd76..605f633309 100644
> > --- a/lib/power/power_cppc_cpufreq.c
> > +++ b/drivers/power/core/cppc/power_cppc_cpufreq.c
> > @@ -685,3 +685,22 @@ power_cppc_get_capabilities(unsigned int
> > lcore_id,
> >
> > return 0;
> > }
> > +
> > +static struct rte_power_ops cppc_ops = {
> > + .init = power_cppc_cpufreq_init,
> > + .exit = power_cppc_cpufreq_exit,
> > + .check_env_support = power_cppc_cpufreq_check_supported,
> > + .get_avail_freqs = power_cppc_cpufreq_freqs,
> > + .get_freq = power_cppc_cpufreq_get_freq,
> > + .set_freq = power_cppc_cpufreq_set_freq,
> > + .freq_down = power_cppc_cpufreq_freq_down,
> > + .freq_up = power_cppc_cpufreq_freq_up,
> > + .freq_max = power_cppc_cpufreq_freq_max,
> > + .freq_min = power_cppc_cpufreq_freq_min,
> > + .turbo_status = power_cppc_turbo_status,
> > + .enable_turbo = power_cppc_enable_turbo,
> > + .disable_turbo = power_cppc_disable_turbo,
> > + .get_caps = power_cppc_get_capabilities };
> > +
> > +RTE_POWER_REGISTER_OPS(cppc_ops);
> > diff --git a/lib/power/power_cppc_cpufreq.h
> > b/drivers/power/core/cppc/power_cppc_cpufreq.h
> > similarity index 100%
> > rename from lib/power/power_cppc_cpufreq.h rename to
> > drivers/power/core/cppc/power_cppc_cpufreq.h
> > diff --git a/lib/power/guest_channel.c
> > b/drivers/power/core/kvm-vm/guest_channel.c
> > similarity index 100%
> > rename from lib/power/guest_channel.c
> > rename to drivers/power/core/kvm-vm/guest_channel.c
> > diff --git a/lib/power/guest_channel.h
> > b/drivers/power/core/kvm-vm/guest_channel.h
> > similarity index 100%
> > rename from lib/power/guest_channel.h
> > rename to drivers/power/core/kvm-vm/guest_channel.h
> > diff --git a/drivers/power/core/kvm-vm/meson.build
> > b/drivers/power/core/kvm-vm/meson.build
> > new file mode 100644
> > index 0000000000..3150c6674b
> > --- /dev/null
> > +++ b/drivers/power/core/kvm-vm/meson.build
> > @@ -0,0 +1,20 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(C) 2024 AMD Limited.
> > +#
> > +
> > +if not is_linux
> > + build = false
> > + reason = 'only supported on Linux'
> > + subdir_done()
> > +endif
> > +
> > +sources = files(
> > + 'guest_channel.c',
> > + 'power_kvm_vm.c',
> > +)
> > +
> > +headers = files(
> > + 'guest_channel.h',
> > + 'power_kvm_vm.h',
> > +)
> > +deps += ['power']
> > diff --git a/lib/power/power_kvm_vm.c
> > b/drivers/power/core/kvm-vm/power_kvm_vm.c
> > similarity index 83%
> > rename from lib/power/power_kvm_vm.c
> > rename to drivers/power/core/kvm-vm/power_kvm_vm.c
> > index f15be8fac5..a5d6984d26 100644
> > --- a/lib/power/power_kvm_vm.c
> > +++ b/drivers/power/core/kvm-vm/power_kvm_vm.c
> > @@ -137,3 +137,22 @@ int power_kvm_vm_get_capabilities(__rte_unused
> unsigned int lcore_id,
> > POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual
> Machine Power Management");
> > return -ENOTSUP;
> > }
> > +
> > +static struct rte_power_ops kvm_vm_ops = {
> > + .init = power_kvm_vm_init,
> > + .exit = power_kvm_vm_exit,
> > + .check_env_support = power_kvm_vm_check_supported,
> > + .get_avail_freqs = power_kvm_vm_freqs,
> > + .get_freq = power_kvm_vm_get_freq,
> > + .set_freq = power_kvm_vm_set_freq,
> > + .freq_down = power_kvm_vm_freq_down,
> > + .freq_up = power_kvm_vm_freq_up,
> > + .freq_max = power_kvm_vm_freq_max,
> > + .freq_min = power_kvm_vm_freq_min,
> > + .turbo_status = power_kvm_vm_turbo_status,
> > + .enable_turbo = power_kvm_vm_enable_turbo,
> > + .disable_turbo = power_kvm_vm_disable_turbo,
> > + .get_caps = power_kvm_vm_get_capabilities };
> > +
> > +RTE_POWER_REGISTER_OPS(kvm_vm_ops);
> > diff --git a/lib/power/power_kvm_vm.h
> > b/drivers/power/core/kvm-vm/power_kvm_vm.h
> > similarity index 100%
> > rename from lib/power/power_kvm_vm.h
> > rename to drivers/power/core/kvm-vm/power_kvm_vm.h
> > diff --git a/drivers/power/core/meson.build
> > b/drivers/power/core/meson.build new file mode 100644 index
> > 0000000000..4081dafaa0
> > --- /dev/null
> > +++ b/drivers/power/core/meson.build
> > @@ -0,0 +1,12 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +drivers = [
> > + 'acpi',
> > + 'amd-pstate',
> > + 'cppc',
> > + 'kvm-vm',
> > + 'pstate'
> > +]
> > +
> > +std_deps = ['power']
> > diff --git a/drivers/power/core/pstate/meson.build
> > b/drivers/power/core/pstate/meson.build
> > new file mode 100644
> > index 0000000000..1025c64e48
> > --- /dev/null
> > +++ b/drivers/power/core/pstate/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +sources = files('power_pstate_cpufreq.c')
> > +
> > +headers = files('power_pstate_cpufreq.h')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_pstate_cpufreq.c
> > b/drivers/power/core/pstate/power_pstate_cpufreq.c
> > similarity index 96%
> > rename from lib/power/power_pstate_cpufreq.c rename to
> > drivers/power/core/pstate/power_pstate_cpufreq.c
> > index 73138dc4e4..d4c3645ff8 100644
> > --- a/lib/power/power_pstate_cpufreq.c
> > +++ b/drivers/power/core/pstate/power_pstate_cpufreq.c
> > @@ -888,3 +888,22 @@ int power_pstate_get_capabilities(unsigned int
> > lcore_id,
> >
> > return 0;
> > }
> > +
> > +static struct rte_power_ops pstate_ops = {
> > + .init = power_pstate_cpufreq_init,
> > + .exit = power_pstate_cpufreq_exit,
> > + .check_env_support = power_pstate_cpufreq_check_supported,
> > + .get_avail_freqs = power_pstate_cpufreq_freqs,
> > + .get_freq = power_pstate_cpufreq_get_freq,
> > + .set_freq = power_pstate_cpufreq_set_freq,
> > + .freq_down = power_pstate_cpufreq_freq_down,
> > + .freq_up = power_pstate_cpufreq_freq_up,
> > + .freq_max = power_pstate_cpufreq_freq_max,
> > + .freq_min = power_pstate_cpufreq_freq_min,
> > + .turbo_status = power_pstate_turbo_status,
> > + .enable_turbo = power_pstate_enable_turbo,
> > + .disable_turbo = power_pstate_disable_turbo,
> > + .get_caps = power_pstate_get_capabilities };
> > +
> > +RTE_POWER_REGISTER_OPS(pstate_ops);
> > diff --git a/lib/power/power_pstate_cpufreq.h
> > b/drivers/power/core/pstate/power_pstate_cpufreq.h
> > similarity index 100%
> > rename from lib/power/power_pstate_cpufreq.h rename to
> > drivers/power/core/pstate/power_pstate_cpufreq.h
> > diff --git a/drivers/power/meson.build b/drivers/power/meson.build new
> > file mode 100644 index 0000000000..7d9034c7ac
> > --- /dev/null
> > +++ b/drivers/power/meson.build
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2024 AMD Limited
> > +
> > +drivers = [
> > + 'core',
> > +]
> > +
> > +std_deps = ['power']
> > diff --git a/lib/power/meson.build b/lib/power/meson.build index
> > b8426589b2..207d96d877 100644
> > --- a/lib/power/meson.build
> > +++ b/lib/power/meson.build
> > @@ -12,14 +12,8 @@ if not is_linux
> > reason = 'only supported on Linux'
> > endif
> > sources = files(
> > - 'guest_channel.c',
> > - 'power_acpi_cpufreq.c',
> > - 'power_amd_pstate_cpufreq.c',
> > 'power_common.c',
> > - 'power_cppc_cpufreq.c',
> > - 'power_kvm_vm.c',
> > 'power_intel_uncore.c',
> > - 'power_pstate_cpufreq.c',
> > 'rte_power.c',
> > 'rte_power_uncore.c',
> > 'rte_power_pmd_mgmt.c',
> > diff --git a/lib/power/power_common.h b/lib/power/power_common.h index
> > 30966400ba..c90b611f4f 100644
> > --- a/lib/power/power_common.h
> > +++ b/lib/power/power_common.h
> > @@ -23,13 +23,24 @@ extern int power_logtype;
> > #endif
> >
> > /* check if scaling driver matches one we want */
> > +__rte_internal
> > int cpufreq_check_scaling_driver(const char *driver);
> > +
> > +__rte_internal
> > int power_set_governor(unsigned int lcore_id, const char *new_governor,
> > char *orig_governor, size_t orig_governor_len);
> > +
> > +__rte_internal
> > int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> > __rte_format_printf(3, 4);
> > +
> > +__rte_internal
> > int read_core_sysfs_u32(FILE *f, uint32_t *val);
> > +
> > +__rte_internal
> > int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> > +
> > +__rte_internal
> > int write_core_sysfs_s(FILE *f, const char *str);
> >
> > #endif /* _POWER_COMMON_H_ */
> > diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c index
> > 36c3f3da98..70176807f4 100644
> > --- a/lib/power/rte_power.c
> > +++ b/lib/power/rte_power.c
> > @@ -8,64 +8,80 @@
> > #include <rte_spinlock.h>
> >
> > #include "rte_power.h"
> > -#include "power_acpi_cpufreq.h"
> > -#include "power_cppc_cpufreq.h"
> > #include "power_common.h"
> > -#include "power_kvm_vm.h"
> > -#include "power_pstate_cpufreq.h"
> > -#include "power_amd_pstate_cpufreq.h"
> >
> > enum power_management_env global_default_env = PM_ENV_NOT_SET;
> Could we use a pointer to store the current power cpufreq ops?
ACK
> >
> > static rte_spinlock_t global_env_cfg_lock =
> > RTE_SPINLOCK_INITIALIZER;
> > +static struct rte_power_ops rte_power_ops[PM_ENV_MAX];
> >
> > -/* function pointers */
> > -rte_power_freqs_t rte_power_freqs = NULL;
> > -rte_power_get_freq_t rte_power_get_freq = NULL;
> > -rte_power_set_freq_t rte_power_set_freq = NULL;
> > -rte_power_freq_change_t rte_power_freq_up = NULL;
> > -rte_power_freq_change_t rte_power_freq_down = NULL;
> > -rte_power_freq_change_t rte_power_freq_max = NULL;
> > -rte_power_freq_change_t rte_power_freq_min = NULL;
> > -rte_power_freq_change_t rte_power_turbo_status;
> > -rte_power_freq_change_t rte_power_freq_enable_turbo;
> > -rte_power_freq_change_t rte_power_freq_disable_turbo;
> > -rte_power_get_capabilities_t rte_power_get_capabilities;
> > -
> > -static void
> > -reset_power_function_ptrs(void)
> > +/* register the ops struct in rte_power_ops, return 0 on success. */
> > +int rte_power_register_ops(const struct rte_power_ops *op) {
> > + struct rte_power_ops *ops;
> > +
> > + if (op->env >= PM_ENV_MAX) {
> > + POWER_LOG(ERR, "Unsupported power management environment\n");
> > + return -EINVAL;
> > + }
> > +
> > + if (op->status != 0) {
> > + POWER_LOG(ERR, "Power management env[%d] ops registered
> already\n",
> > + op->env);
> > + return -EINVAL;
> > + }
> > +
> > + if (!op->init || !op->exit || !op->check_env_support ||
> > + !op->get_avail_freqs || !op->get_freq || !op->set_freq ||
> > + !op->freq_up || !op->freq_down || !op->freq_max ||
> > + !op->freq_min || !op->turbo_status || !op->enable_turbo ||
> > + !op->disable_turbo || !op->get_caps) {
> > + POWER_LOG(ERR, "Missing callbacks while registering power ops\n");
> > + return -EINVAL;
> > + }
> > +
> > + ops = &rte_power_ops[op->env];
> It would be better to use a global linked list instead of an array.
> We should also extract a list structure that includes this ops structure and the
> ops's owner.
> > + ops->env = op->env;
> > + ops->init = op->init;
> > + ops->exit = op->exit;
> > + ops->check_env_support = op->check_env_support;
> > + ops->get_avail_freqs = op->get_avail_freqs;
> > + ops->get_freq = op->get_freq;
> > + ops->set_freq = op->set_freq;
> > + ops->freq_up = op->freq_up;
> > + ops->freq_down = op->freq_down;
> > + ops->freq_max = op->freq_max;
> > + ops->freq_min = op->freq_min;
> > + ops->turbo_status = op->turbo_status;
> > + ops->enable_turbo = op->enable_turbo;
> > + ops->disable_turbo = op->disable_turbo;
> *ops = *op?
> > + ops->status = 1; /* registered */
> status --> registered?
> But if we use an ops linked list, this flag can also be removed.
> > +
> > + return 0;
> > +}
> > +
> > +struct rte_power_ops *
> > +rte_power_get_ops(int ops_index)
> AFAICS, there is only one cpufreq driver per platform, so there is just one
> power_cpufreq_ops for the user to use.
> Users don't need to get other power ops; they just want to know which power
> ops is currently in use, right?
> So using an 'index' to get the ops is not a good fit.
Agreed! I will rework this to make it global.
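For discussion, a minimal sketch of what "make it global" could look like: one pointer to the currently selected ops, with thin wrappers that dispatch through it instead of looking the ops up by index. The names here (`power_cpufreq_ops`, `power_select_ops()`, `power_get_freq()`, `demo_ops`) are hypothetical and only for illustration; they are not part of this patch, and the real ops struct carries many more callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down ops struct: only one callback is shown. */
struct power_cpufreq_ops {
	const char *name;
	uint32_t (*get_freq)(unsigned int lcore_id);
};

/* The ops currently in use; NULL means no environment has been selected. */
static const struct power_cpufreq_ops *current_ops;

/* Called once by set-env or auto-detect; passing NULL unsets the env. */
void power_select_ops(const struct power_cpufreq_ops *ops)
{
	current_ops = ops;
}

/* Wrapper: dispatch through the global pointer, no index lookup needed. */
int power_get_freq(unsigned int lcore_id, uint32_t *idx)
{
	assert(idx != NULL);
	if (current_ops == NULL || current_ops->get_freq == NULL)
		return -1;
	*idx = current_ops->get_freq(lcore_id);
	return 0;
}

/* Dummy driver, present only to exercise the sketch. */
static uint32_t demo_get_freq(unsigned int lcore_id)
{
	return lcore_id + 1;
}

const struct power_cpufreq_ops demo_ops = {
	.name = "demo_cpufreq",
	.get_freq = demo_get_freq,
};
```

With this shape the caller never sees an index, and an unset environment is reported as an error by the wrapper rather than by RTE_VERIFY.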
> > {
> > - rte_power_freqs = NULL;
> > - rte_power_get_freq = NULL;
> > - rte_power_set_freq = NULL;
> > - rte_power_freq_up = NULL;
> > - rte_power_freq_down = NULL;
> > - rte_power_freq_max = NULL;
> > - rte_power_freq_min = NULL;
> > - rte_power_turbo_status = NULL;
> > - rte_power_freq_enable_turbo = NULL;
> > - rte_power_freq_disable_turbo = NULL;
> > - rte_power_get_capabilities = NULL;
> > + RTE_VERIFY((ops_index >= PM_ENV_NOT_SET) && (ops_index <
> PM_ENV_MAX));
> > + RTE_VERIFY(rte_power_ops[ops_index].status != 0);
> > +
> > + return &rte_power_ops[ops_index];
> > }
> >
> > int
> > rte_power_check_env_supported(enum power_management_env env)
> > {
> > - switch (env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_check_supported();
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_check_supported();
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_check_supported();
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_check_supported();
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_check_supported();
> > - default:
> > - rte_errno = EINVAL;
> > - return -1;
> > + struct rte_power_ops *ops;
> > +
> > + if ((env > PM_ENV_NOT_SET) && (env < PM_ENV_MAX)) {
> > + ops = rte_power_get_ops(env);
> > + return ops->check_env_support();
> > }
> > +
> > + rte_errno = EINVAL;
> > + return -1;
> > }
> >
> > int
> > @@ -80,80 +96,26 @@ rte_power_set_env(enum power_management_env
> env)
> > }
> >
> > int ret = 0;
> > + struct rte_power_ops *ops;
> > +
> > + if ((env == PM_ENV_NOT_SET) || (env >= PM_ENV_MAX)) {
> > + POWER_LOG(ERR, "Invalid Power Management Environment(%d)"
> > + " set\n", env);
> > + ret = -1;
> > + }
> >
> <...>
> > + ops = rte_power_get_ops(env);
> Find the target ops in the global list according to the env?
> > + if (ops->status == 0) {
> > + POWER_LOG(ERR, WER,
> > + "Power Management Environment(%d) not"
> > + " registered\n", env);
> > ret = -1;
> > }
> >
> > if (ret == 0)
> > global_default_env = env;
> It is more convenient to use a global variable to point to the default power_cpufreq
> ops or its list node.
Agreed
> > - else {
> > + else
> > global_default_env = PM_ENV_NOT_SET;
> > - reset_power_function_ptrs();
> > - }
> >
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > return ret;
> > @@ -164,7 +126,6 @@ rte_power_unset_env(void)
> > {
> > rte_spinlock_lock(&global_env_cfg_lock);
> > global_default_env = PM_ENV_NOT_SET;
> > - reset_power_function_ptrs();
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > }
> >
> > @@ -177,59 +138,76 @@ int
> > rte_power_init(unsigned int lcore_id)
> > {
> > int ret = -1;
> > + struct rte_power_ops *ops;
> >
> > - switch (global_default_env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_init(lcore_id);
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_init(lcore_id);
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_init(lcore_id);
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_init(lcore_id);
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_init(lcore_id);
> > - default:
> > - POWER_LOG(INFO, "Env isn't set yet!");
> > + if (global_default_env != PM_ENV_NOT_SET) {
> > + ops = &rte_power_ops[global_default_env];
> > + if (!ops->status) {
> > + POWER_LOG(ERR, "Power management env[%d] not"
> > + " supported\n", global_default_env);
> > + goto out;
> > + }
> > + return ops->init(lcore_id);
> > }
> > + POWER_LOG(INFO, POWER, "Env isn't set yet!\n");
> >
> > /* Auto detect Environment */
> > - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
> management...");
> > - ret = power_acpi_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq"
> > + " power management...\n");
> > + ops = &rte_power_ops[PM_ENV_ACPI_CPUFREQ];
> > + if (ops->status) {
> > + ret = ops->init(lcore_id);
> > + if (ret == 0) {
> > + rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> > + goto out;
> > + }
> > }
> >
> > - POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
> > - ret = power_pstate_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise PSTAT"
> > + " power management...\n");
> > + ops = &rte_power_ops[PM_ENV_PSTATE_CPUFREQ];
> > + if (ops->status) {
> > + ret = ops->init(lcore_id);
> > + if (ret == 0) {
> > + rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> > + goto out;
> > + }
> > }
> >
> > - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
> management...");
> > - ret = power_amd_pstate_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise AMD PSTATE"
> > + " power management...\n");
> > + ops = &rte_power_ops[PM_ENV_AMD_PSTATE_CPUFREQ];
> > + if (ops->status) {
> > + ret = ops->init(lcore_id);
> > + if (ret == 0) {
> > + rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> > + goto out;
> > + }
> > }
> >
> > - POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
> > - ret = power_cppc_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise CPPC power"
> > + " management...\n");
> > + ops = &rte_power_ops[PM_ENV_CPPC_CPUFREQ];
> > + if (ops->status) {
> > + ret = ops->init(lcore_id);
> > + if (ret == 0) {
> > + rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> > + goto out;
> > + }
> > }
> >
> > - POWER_LOG(INFO, "Attempting to initialise VM power management...");
> > - ret = power_kvm_vm_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_KVM_VM);
> > - goto out;
> > + POWER_LOG(INFO, "Attempting to initialise VM power"
> > + " management...\n");
> > + ops = &rte_power_ops[PM_ENV_KVM_VM];
> > + if (ops->status) {
> > + ret = ops->init(lcore_id);
> > + if (ret == 0) {
> > + rte_power_set_env(PM_ENV_KVM_VM);
> > + goto out;
> > + }
> > }
> If we use a linked list, the above code can be simplified like this:
> ->
> for_each_power_cpufreq_ops(ops, ...) {
> ret = ops->init()
> if (ret) {
> ....
> }
> }
ACK
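To make the suggested loop concrete, here is a rough sketch of list-based auto-detection using the TAILQ macros from <sys/queue.h>. It assumes a hypothetical trimmed-down `power_cpufreq_ops` with only an `init` callback; `power_register_ops()`, `power_auto_detect()` and the two demo drivers are illustrative names, not the final API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical ops node; the real structure would carry every callback. */
struct power_cpufreq_ops {
	TAILQ_ENTRY(power_cpufreq_ops) next;  /* list linkage */
	const char *name;
	int (*init)(unsigned int lcore_id);   /* 0 on success, negative on error */
};

TAILQ_HEAD(power_ops_list, power_cpufreq_ops);
static struct power_ops_list ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct power_cpufreq_ops *current_ops;

/* Append a driver; registration order is the auto-detect probe order. */
void power_register_ops(struct power_cpufreq_ops *ops)
{
	assert(ops != NULL && ops->init != NULL);
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
}

/* Try each registered driver and keep the first whose init() succeeds. */
int power_auto_detect(unsigned int lcore_id)
{
	struct power_cpufreq_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (ops->init(lcore_id) == 0) {
			current_ops = ops;
			return 0;
		}
	}
	current_ops = NULL;
	return -1;
}

const char *power_current_name(void)
{
	return current_ops != NULL ? current_ops->name : NULL;
}

/* Two dummy drivers: the first always fails, the second always succeeds. */
static int failing_init(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int working_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

struct power_cpufreq_ops demo_a = { .name = "demo_a", .init = failing_init };
struct power_cpufreq_ops demo_b = { .name = "demo_b", .init = working_init };
```

One consequence of this design is that the probe order is no longer hard-coded in rte_power_init(); it follows registration order, so any required priority between drivers would have to be made explicit.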
> > - POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
> > - "%u", lcore_id);
> > + POWER_LOG(ERR, "Unable to set Power Management Environment"
> > + " for lcore %u\n", lcore_id);
> > out:
> > return ret;
> > }
> > @@ -237,21 +215,14 @@ rte_power_init(unsigned int lcore_id)
> > int
> > rte_power_exit(unsigned int lcore_id)
> > {
> > - switch (global_default_env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_exit(lcore_id);
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_exit(lcore_id);
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_exit(lcore_id);
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_exit(lcore_id);
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_exit(lcore_id);
> > - default:
> > - POWER_LOG(ERR, "Environment has not been set, unable to exit
> gracefully");
> > + struct rte_power_ops *ops;
> >
> > + if (global_default_env != PM_ENV_NOT_SET) {
> > + ops = &rte_power_ops[global_default_env];
> > + return ops->exit(lcore_id);
> > }
> > - return -1;
> > + POWER_LOG(ERR, "Environment has not been set, unable "
> > + "to exit gracefully\n");
> >
> > + return -1;
> > }
> > diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h index
> > 4fa4afe399..749bb823ab 100644
> > --- a/lib/power/rte_power.h
> > +++ b/lib/power/rte_power.h
> > @@ -1,5 +1,6 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2024 AMD Limited
> > */
> >
> > #ifndef _RTE_POWER_H
> > @@ -21,7 +22,7 @@ extern "C" {
> > /* Power Management Environment State */
> > enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ,
> PM_ENV_KVM_VM,
> > PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> > - PM_ENV_AMD_PSTATE_CPUFREQ};
> > + PM_ENV_AMD_PSTATE_CPUFREQ, PM_ENV_MAX};
> "enum power_management_env" is not a good name; maybe something like "enum
> power_cpufreq_driver_type"?
> In the linked list structure proposed above, it may be better to use a string
> name directly instead of a fixed enum,
> because the new "PM_ENV_MAX" will break the ABI whenever a new cpufreq
> driver is added.
I will rework this to remove the max macro.
However, changing the enum power_management_env requires ABI versioning.
I will consider this change in the future.
> >
> > /**
> > * Check if a specific power management environment type is
> > supported on a @@ -66,6 +67,97 @@ void rte_power_unset_env(void);
> > */
> > enum power_management_env rte_power_get_env(void);
> >
> > +typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_check_env_support_t)(void);
> > +
> > +typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> > + uint32_t num);
> > +typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> > +typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> > +typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> > +
> > +/**
> > + * Function pointer definition for generic frequency change
> > +functions. Review
> > + * each environments specific documentation for usage.
> > + *
> > + * @param lcore_id
> > + * lcore id.
> > + *
> > + * @return
> > + * - 1 on success with frequency changed.
> > + * - 0 on success without frequency changed.
> > + * - Negative on error.
> > + */
> > +
> > +/**
> > + * Power capabilities summary.
> > + */
> > +struct rte_power_core_capabilities {
> > + union {
> > + uint64_t capabilities;
> > + struct {
> > + uint64_t turbo:1; /**< Turbo can be enabled. */
> > + uint64_t priority:1; /**< SST-BF high freq core */
> > + };
> > + };
> > +};
> > +
> > +typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> > + struct rte_power_core_capabilities *caps);
> > +
> > +/** Structure defining core power operations structure */
> > +struct rte_power_ops {
> > +uint8_t status; /**< ops register status. */
> > + enum power_management_env env; /**< power mgmt env. */
> > + rte_power_cpufreq_init_t init; /**< Initialize power management. */
> > + rte_power_cpufreq_exit_t exit; /**< Exit power management. */
> > + rte_power_check_env_support_t check_env_support; /**< verify env is
> supported. */
> > + rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
> > + rte_power_get_freq_t get_freq; /**< Get frequency index. */
> > + rte_power_set_freq_t set_freq; /**< Set frequency index. */
> > + rte_power_freq_change_t freq_up; /**< Scale up frequency. */
> > + rte_power_freq_change_t freq_down; /**< Scale down frequency. */
> > + rte_power_freq_change_t freq_max; /**< Scale up frequency to highest.
> */
> > + rte_power_freq_change_t freq_min; /**< Scale up frequency to lowest. */
> > + rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
> > + rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
> > + rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
> > + rte_power_get_capabilities_t get_caps; /**< power capabilities.
> > +*/ } __rte_cache_aligned;
> Suggest fixing this structure, like:
> struct rte_power_cpufreq_list {
> char name[]; // like "cppc_cpufreq", "pstate_cpufreq"
> struct rte_power_cpufreq *ops;
> struct rte_power_cpufreq_list *node;
> }
ACK
> > +
> > +/**
> > + * Register power cpu frequency operations.
> > + *
> > + * @param ops
> > + * Pointer to an ops structure to register.
> > + * @return
> > + * - >=0: Success; return the index of the ops struct in the table.
> > + * - -EINVAL - error while registering ops struct.
> > + */
> > +__rte_internal
> > +int rte_power_register_ops(const struct rte_power_ops *ops);
> > +
> > +/**
> > + * Macro to statically register the ops of a cpufreq driver.
> > + */
> > +#define RTE_POWER_REGISTER_OPS(ops) \
> > + (RTE_INIT(power_hdlr_init_##ops) \
> > + { \
> > + rte_power_register_ops(&ops); \
> > + })
> > +
> > +/**
> > + * @internal Get the power ops struct from its index.
> > + *
> > + * @param ops_index
> > + * The index of the ops struct in the ops struct table.
> > + * @return
> > + * The pointer to the ops struct in the table if registered.
> > + */
> > +struct rte_power_ops *
> > +rte_power_get_ops(int ops_index);
> > +
> > /**
> > * Initialize power management for a specific lcore. If rte_power_set_env() has
> > * not been called then an auto-detect of the environment will start and
> > @@ -108,10 +200,14 @@ int rte_power_exit(unsigned int lcore_id);
> > * @return
> > * The number of available frequencies.
> > */
> > -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> > - uint32_t num);
> > +static inline uint32_t
> > +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> > +{
> > + struct rte_power_ops *ops;
> >
> > -extern rte_power_freqs_t rte_power_freqs;
> > + ops = rte_power_get_ops(rte_power_get_env());
> > + return ops->get_avail_freqs(lcore_id, freqs, n);
> > +}
> nice.
> <...>
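For illustration only, the name-keyed registry discussed in the review above could look roughly like the sketch below. It is not code from this series: `power_ops`, `power_register_ops`, `power_find_ops`, `power_freqs`, `DRIVER_NAMESZ` and the "demo" driver are all hypothetical names, and the ops structure is trimmed to two callbacks. It assumes a libc that ships the BSD `<sys/queue.h>` TAILQ macros. The point is that drivers are looked up by string, so adding a driver does not require a new enum value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

#define DRIVER_NAMESZ 24 /* hypothetical limit on driver name length */

/* Hypothetical, trimmed-down ops structure: a name plus two callbacks. */
struct power_ops {
	TAILQ_ENTRY(power_ops) next;      /* linkage in the registry list */
	char name[DRIVER_NAMESZ];         /* e.g. "acpi", "cppc" */
	int (*check_env_support)(void);
	uint32_t (*get_avail_freqs)(unsigned int lcore_id,
			uint32_t *freqs, uint32_t num);
};

/* Registry of all known drivers, in registration order. */
static TAILQ_HEAD(, power_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);

/* Append a driver to the registry; reject it if a callback is missing. */
int
power_register_ops(struct power_ops *ops)
{
	assert(ops != NULL);
	if (ops->check_env_support == NULL || ops->get_avail_freqs == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Look a driver up by name; returns NULL when it is not registered. */
struct power_ops *
power_find_ops(const char *name)
{
	struct power_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next)
		if (strncmp(ops->name, name, DRIVER_NAMESZ) == 0)
			return ops;
	return NULL;
}

/* Thin dispatch wrapper: forwards to the selected driver, 0 if none. */
uint32_t
power_freqs(const char *name, unsigned int lcore_id,
		uint32_t *freqs, uint32_t num)
{
	struct power_ops *ops = power_find_ops(name);

	return ops != NULL ? ops->get_avail_freqs(lcore_id, freqs, num) : 0;
}

/* A made-up driver that reports two fixed frequencies. */
static int
demo_check(void)
{
	return 1;
}

static uint32_t
demo_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num)
{
	static const uint32_t table[] = { 2000000, 1500000 };
	uint32_t i, n = num < 2 ? num : 2;

	(void)lcore_id;
	for (i = 0; i < n; i++)
		freqs[i] = table[i];
	return n;
}

struct power_ops demo_ops = {
	.name = "demo",
	.check_env_support = demo_check,
	.get_avail_freqs = demo_freqs,
};
```

A caller would register `demo_ops` once and then call `power_freqs("demo", ...)`. An unknown name simply yields 0 frequencies rather than dereferencing a NULL ops pointer.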
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v1 0/4] power: refactor power management library
2024-02-20 15:33 [RFC PATCH 0/2] power: refactor power management library Sivaprasad Tummala
` (2 preceding siblings ...)
2024-02-20 15:33 ` [RFC PATCH 2/2] power: refactor uncore " Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 1/4] power: refactor core " Sivaprasad Tummala
` (5 more replies)
3 siblings, 6 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of a dedicated directory for each driver under 'drivers/power/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more focused
development on individual drivers and facilitates seamless integration
of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (4):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore power management support for AMD EPYC
processors
app/test/test_power.c | 95 ------
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 321 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 7 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_core_ops.h | 208 ++++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 91 ++---
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
38 files changed, 1591 insertions(+), 621 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
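As a rough sketch of the self-registration pattern this series relies on, the block below shows how a driver can add its ops to a table before main() runs. It is an assumption-laden illustration rather than the DPDK implementation: it uses the GCC/Clang `constructor` function attribute directly, and `demo_ops`, `demo_register_ops`, `DEMO_REGISTER_OPS`, `demo_find_ops` and `MAX_DRIVERS` are hypothetical names. As in the RFC's description of the register call, the function returns the table index on success.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DRIVERS 8 /* hypothetical table size */

/* Hypothetical ops structure, reduced to a name for brevity. */
struct demo_ops {
	const char *name;
};

/* Zero-initialized before any constructor runs. */
static const struct demo_ops *registered[MAX_DRIVERS];
static int nb_registered;

/* Store the ops pointer; return its table index, or -1 if the table is full. */
int
demo_register_ops(const struct demo_ops *ops)
{
	assert(ops != NULL);
	if (nb_registered >= MAX_DRIVERS)
		return -1;
	registered[nb_registered] = ops;
	return nb_registered++;
}

/* Number of drivers registered so far. */
int
demo_nb_registered(void)
{
	return nb_registered;
}

/* Find a registered driver by name; NULL if absent. */
const struct demo_ops *
demo_find_ops(const char *name)
{
	int i;

	for (i = 0; i < nb_registered; i++)
		if (strcmp(registered[i]->name, name) == 0)
			return registered[i];
	return NULL;
}

/*
 * Emit a constructor that registers `ops` at load time. The relative
 * order of constructors is not something callers should depend on.
 */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor, used)) \
	demo_init_##ops(void) \
	{ \
		demo_register_ops(&ops); \
	}

/* Two made-up drivers, each registering itself. */
static const struct demo_ops acpi_like_ops = { .name = "acpi-like" };
DEMO_REGISTER_OPS(acpi_like_ops)

static const struct demo_ops pstate_like_ops = { .name = "pstate-like" };
DEMO_REGISTER_OPS(pstate_like_ops)
```

With this pattern the library core needs no compile-time knowledge of individual drivers: linking a driver object in is enough to make it discoverable, which is what allows the drivers to be built selectively.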
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v1 1/4] power: refactor core power management library
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-07-23 10:03 ` Hunt, David
2024-07-20 16:50 ` [PATCH v1 2/4] power: refactor uncore " Sivaprasad Tummala
` (4 subsequent siblings)
5 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the core power management library. The primary
focus is on improving modularity and organization by relocating
specific driver implementations from the 'lib/power' directory to
dedicated directories within 'drivers/power/*'. The adjustment of
meson.build files enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_core_ops.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 618 insertions(+), 270 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..1194a7e2a5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 090a0d96cb..f571f4184a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -700,3 +700,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..b04b2f28c0 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index 32aaacb948..775b8f4434 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -685,3 +685,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..d6e32fdd47 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..64086a67e7 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 2343121621..c32b1adabc 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..8b67b2da21 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..f3e3451cdc 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_core_ops.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..8afb5949b9 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,153 +8,86 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-static void
-reset_power_function_ptrs(void)
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +97,64 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
+ return global_power_core_ops;
+}
+
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
-
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
-
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ if (ops->init(lcore_id) == 0) {
+ for (env = 0; env < RTE_DIM(power_env_str); env++)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
+ }
+
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..5e4aacf08b 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_core_ops.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * Pointer to the active ops struct, or NULL if no environment is set.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_core_ops.h b/lib/power/rte_power_core_ops.h
new file mode 100644
index 0000000000..356a64df79
--- /dev/null
+++ b/lib/power/rte_power_core_ops.h
@@ -0,0 +1,208 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CORE_OPS_H
+#define _RTE_POWER_CORE_OPS_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations structure */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success; the ops struct was appended to the driver list.
+ * - -EINVAL: A mandatory callback is missing from the ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+ RTE_INIT(power_hdlr_init_##ops) \
+ { \
+ rte_power_register_ops(&ops); \
+ }
+
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * Pointer to the active ops struct, or NULL if no environment is set.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index ad92a65f91..c2098fd667 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.07
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
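
Note on the design: the core of the patch above is a name-keyed driver
registry. Each driver fills in an ops struct and registers it from a
constructor; rte_power_set_env() then picks the entry whose name matches
power_env_str[env], and the inline wrappers dispatch through it. The
following is only a minimal standalone sketch of that pattern, assuming
libc and <sys/queue.h>. Every demo_* identifier is hypothetical and stands
in for its rte_power_* counterpart, the ops table is cut down to two
callbacks, and the constructor attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, trimmed-down ops table (the patch has fourteen callbacks). */
struct demo_core_ops {
	TAILQ_ENTRY(demo_core_ops) next;
	char name[DEMO_NAMESZ];
	int (*init)(unsigned int lcore_id);
	int (*freq_up)(unsigned int lcore_id);
};

static TAILQ_HEAD(, demo_core_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);
static struct demo_core_ops *demo_active_ops;

/* Index 0 plays the role of PM_ENV_NOT_SET. */
static const char *demo_env_str[] = { "not set", "demo-acpi", "demo-kvm" };
#define DEMO_NUM_ENVS (sizeof(demo_env_str) / sizeof(demo_env_str[0]))

/* Reject ops structs with missing callbacks, then append to the list. */
int demo_register_ops(struct demo_core_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->freq_up == NULL)
		return -EINVAL;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Select the registered driver whose name matches the env string. */
int demo_set_env(unsigned int env)
{
	struct demo_core_ops *ops;

	/* Bounds check before indexing the name table. */
	if (env == 0 || env >= DEMO_NUM_ENVS)
		return -1;
	TAILQ_FOREACH(ops, &demo_ops_list, next) {
		if (strncmp(ops->name, demo_env_str[env], DEMO_NAMESZ) == 0) {
			demo_active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Thin wrapper: the caller never names the driver. */
int demo_freq_up(unsigned int lcore_id)
{
	assert(demo_active_ops != NULL);
	return demo_active_ops->freq_up(lcore_id);
}

/* A toy driver that registers itself before main() runs. */
static int acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int acpi_freq_up(unsigned int lcore_id) { (void)lcore_id; return 1; }

static struct demo_core_ops demo_acpi_ops = {
	.name = "demo-acpi",
	.init = acpi_init,
	.freq_up = acpi_freq_up,
};

static void __attribute__((constructor)) demo_acpi_register(void)
{
	demo_register_ops(&demo_acpi_ops);
}
```

After the constructor has run, demo_set_env(1) selects the toy driver and
demo_freq_up(0) reaches acpi_freq_up() through the ops pointer. Unlike this
sketch, rte_power_set_env() in the patch does not range-check env before
indexing power_env_str[].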
* [PATCH v1 2/4] power: refactor uncore power management library
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 1/4] power: refactor core " Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-07-23 10:26 ` Hunt, David
2024-07-20 16:50 ` [PATCH v1 3/4] test/power: removed function pointer validations Sivaprasad Tummala
` (3 subsequent siblings)
5 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the uncore part of the power management library.
Each uncore driver moves into its own directory under 'drivers/power/'
(here 'drivers/power/intel_uncore'), and the meson.build changes allow
individual drivers to be enabled or disabled at build time.
Drivers now register an ops structure with the library instead of being
wired in through hard-coded function pointers. This keeps driver code
out of lib/power and prepares for the integration of further drivers,
in particular the AMD uncore driver.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
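Note: the auto-detect path in rte_power_uncore_init() in this patch probes
the registered drivers in order and maps the first one that initialises
back to an environment value by name. The following is a rough standalone
sketch of that flow, not the patch code itself. All demo_* names, the
"vendor-a"/"vendor-b" drivers and the array-based driver table are
hypothetical; the patch uses a TAILQ and uncore_env_str[] instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_uncore_env {
	DEMO_UNCORE_ENV_NOT_SET = 0,
	DEMO_UNCORE_ENV_AUTO_DETECT,
	DEMO_UNCORE_ENV_VENDOR_A,
	DEMO_UNCORE_ENV_VENDOR_B,
};

/* Name table indexed by enum value, as uncore_env_str[] is in the patch. */
static const char *demo_uncore_env_str[] = {
	"not set", "auto-detect", "vendor-a", "vendor-b",
};
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

struct demo_uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
};

/* Valid indices are 0 .. DIM-1, so the comparison must be strict. */
const char *demo_env_name(size_t env)
{
	if (env < DEMO_DIM(demo_uncore_env_str))
		return demo_uncore_env_str[env];
	return NULL;
}

/* Map a driver name back to its enum value; NOT_SET if unknown. */
enum demo_uncore_env demo_env_from_name(const char *name)
{
	size_t env;

	for (env = 0; env < DEMO_DIM(demo_uncore_env_str); env++)
		if (strcmp(name, demo_uncore_env_str[env]) == 0)
			return (enum demo_uncore_env)env;
	return DEMO_UNCORE_ENV_NOT_SET;
}

/* Probe drivers in order; the first successful init() wins. */
enum demo_uncore_env
demo_uncore_auto_detect(const struct demo_uncore_ops *drivers, size_t n,
		unsigned int pkg, unsigned int die)
{
	size_t i;

	assert(drivers != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (drivers[i].init == NULL)
			continue;
		if (drivers[i].init(pkg, die) == 0)
			return demo_env_from_name(drivers[i].name);
	}
	return DEMO_UNCORE_ENV_NOT_SET;
}

/* Toy drivers: the first pretends its hardware is absent. */
static int vendor_a_init(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return -1;
}

static int vendor_b_init(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return 0;
}

const struct demo_uncore_ops demo_drivers[] = {
	{ "vendor-a", vendor_a_init },
	{ "vendor-b", vendor_b_init },
};
```

With both toy drivers in the table, auto-detect settles on "vendor-b"
because "vendor-a" fails to initialise. With only the first driver it
returns the not-set value, and the caller has to treat that as an error.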
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 7 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 206 ++++++---------
lib/power/rte_power_uncore.h | 91 ++++---
lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 406 insertions(+), 169 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 9c152e4ed2..6f3b347a8d 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_UNCORE_FREQS 32
@@ -476,3 +476,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..c46202fd6a
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index f3e3451cdc..9b13d98810 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..127f6ed212 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
@@ -10,100 +10,52 @@
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,25 +65,23 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
+
return ret;
}
@@ -139,42 +89,50 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
}
out:
return ret;
@@ -183,12 +141,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..5415032ff4 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -10,9 +10,7 @@
* @file
* RTE Uncore Frequency Management
*/
-
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +114,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +143,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
-
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +158,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +173,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +201,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,10 +226,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
* by parsing the uncore sysfs directory.
@@ -235,10 +243,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
* from parsing the uncore sysfs directory.
@@ -253,9 +264,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..91cb9ec518
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for packages (CPUs) specified
+ * from parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+ RTE_INIT(power_hdlr_init_uncore_##ops) \
+ { \
+ rte_power_register_uncore_ops(&ops); \
+ }
+
+/**
+ * @internal Get the currently selected power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index c2098fd667..112790df73 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v1 3/4] test/power: removed function pointer validations
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 1/4] power: refactor core " Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 2/4] power: refactor uncore " Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-07-22 10:49 ` Hunt, David
2024-07-20 16:50 ` [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
` (2 subsequent siblings)
5 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are
consistently available regardless of the operating environment, so the
function pointer checks in the test applications are unnecessary and are
removed.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
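Note: a minimal sketch of why the NULL checks no longer apply, assuming the
public calls become static inline wrappers that dispatch through an ops
table, as the uncore wrappers in rte_power_uncore.h do in patch 2/4. All
demo_* names are hypothetical and not part of the library; the assert only
stands in for RTE_ASSERT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table, loosely modelled on struct rte_power_uncore_ops. */
struct demo_ops {
	const char *name;
	uint32_t (*get_freq)(unsigned int id);
};

/* Hypothetical driver callback: returns a fake frequency index. */
static uint32_t
demo_fake_get_freq(unsigned int id)
{
	return id + 1;
}

static struct demo_ops demo_fake_ops = {
	.name = "fake",
	.get_freq = demo_fake_get_freq,
};

/* Stays NULL until an environment has been selected. */
static struct demo_ops *demo_active_ops;

static void
demo_set_env(struct demo_ops *ops)
{
	demo_active_ops = ops;
}

/*
 * Public entry point: a real function, not a pointer variable, so it can
 * never compare equal to NULL. Only the ops table behind it can be unset.
 */
static inline uint32_t
demo_get_freq(unsigned int id)
{
	assert(demo_active_ops != NULL);
	return demo_active_ops->get_freq(id);
}
```

Since demo_get_freq is a function rather than a pointer variable, comparing
it with NULL tells a test nothing; the state worth checking is whether an
environment has been selected.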
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
3 files changed, 183 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
--
2.34.1
* [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-07-20 16:50 ` [PATCH v1 3/4] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-07-23 10:33 ` Hunt, David
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor power management library Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
5 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
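Note: a minimal sketch of the index convention this driver follows, assuming
the fixed fabric frequency table from amd_uncore.c (1800, 1444, 1200, each
scaled by BUS_FREQ): index 0 is the highest frequency and the last index the
lowest. The demo_* names are hypothetical, and the ESMI call that applies
the selected index is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUS_FREQ 1000u  /* scale factor, mirrors BUS_FREQ */
#define DEMO_MAX_FREQS 8u    /* mirrors MAX_UNCORE_FREQS */

/* Fixed table in descending order, values copied from the patch. */
static const uint32_t demo_fabric_freqs[] = { 1800, 1444, 1200 };

/* Fill out[] with scaled frequencies; return the count, or 0 if cap is too small. */
static uint32_t
demo_fill_freqs(uint32_t *out, uint32_t cap)
{
	uint32_t n = sizeof(demo_fabric_freqs) / sizeof(demo_fabric_freqs[0]);
	uint32_t i;

	assert(out != NULL);
	if (cap < n || n > DEMO_MAX_FREQS)
		return 0;
	for (i = 0; i < n; i++)
		out[i] = demo_fabric_freqs[i] * DEMO_BUS_FREQ;
	return n;
}

/* Index selected by a "freq_max" request: always the first entry. */
static uint32_t
demo_max_index(void)
{
	return 0;
}

/* Index selected by a "freq_min" request: the last entry (nb_freqs > 0). */
static uint32_t
demo_min_index(uint32_t nb_freqs)
{
	return nb_freqs - 1;
}
```

With three entries, a freq_max request maps to index 0 and a freq_min
request to index 2, which matches power_amd_uncore_freq_min() passing
nb_freqs - 1 to set_uncore_freq_internal().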
drivers/power/amd_uncore/amd_uncore.c | 321 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 568 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..f15eaaa307
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,321 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_UNCORE_FREQS 8
+#define MAX_NUMA_DIE 8
+
+#define BUS_FREQ 1000
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the DF P-state index that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+/*
+ * Initialise the min/max uncore frequency limits of the die (fixed values for now).
+ */
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ /* limits are hard-coded for now, no sys files are read */
+ /* Base max and min */
+ ui->max_freq = 1800000;
+ ui->min_freq = 1200000;
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die from a fixed
+ * table.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ int ret = -1;
+ uint32_t i, num_uncore_freqs = 3;
+ uint32_t fabric_freqs[] = {
+ /* to be extended for probing support in future */
+ 1800,
+ 1444,
+ 1200
+ };
+
+ if (num_uncore_freqs >= MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ num_uncore_freqs);
+ goto out;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ for (i = 0; i < num_uncore_freqs; i++)
+ ui->freqs[i] = fabric_freqs[i] * BUS_FREQ;
+
+ ui->nb_freqs = num_uncore_freqs;
+
+ ret = 0;
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ num_uncore_freqs, ui->pkg, ui->die);
+
+out:
+ return ret;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS)
+ esmi_initialized = 1;
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ unsigned int *freqs, unsigned int num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * The current implementation reports one die per valid package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..ec1b741c3a
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+IMB_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v1 0/4] power: refactor power management library
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-07-20 16:50 ` [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
@ 2024-07-20 16:50 ` Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
5 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-07-20 16:50 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, lihuisong, david.marchand,
ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within 'drivers/power/core/*'
and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more focused
development on individual drivers and facilitates seamless integration
of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (4):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore power management support for AMD EPYC
processors
app/test/test_power.c | 95 ------
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 321 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 7 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_core_ops.h | 208 ++++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 91 ++---
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
38 files changed, 1591 insertions(+), 621 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v1 3/4] test/power: removed function pointer validations
2024-07-20 16:50 ` [PATCH v1 3/4] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-07-22 10:49 ` Hunt, David
2024-07-27 18:45 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: Hunt, David @ 2024-07-22 10:49 UTC (permalink / raw)
To: Sivaprasad Tummala, anatoly.burakov, jerinj, lihuisong,
david.marchand, ferruh.yigit, konstantin.ananyev
Cc: dev
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
> After refactoring the power library, power management operations are now
> consistently supported regardless of the operating environment, making
> function pointer checks unnecessary and thus removed from applications.
>
> Signed-off-by: Sivaprasad Tummala<sivaprasad.tummala@amd.com>
> ---
> app/test/test_power.c | 95 -----------------------------------
> app/test/test_power_cpufreq.c | 52 -------------------
> app/test/test_power_kvm_vm.c | 36 -------------
> 3 files changed, 183 deletions(-)
>
Hi Sivaprasad,
Nice work on the patch-set.
There are just four function pointer checks remaining that my compiler is
complaining about. They are in examples/l3fwd-power/main.c (lines 443,
452, 1350, 1353). It would be nice to have these removed as well, seeing
as the functions are now inlines and don't need these checks.
I'm running the patch set through some tests here, will keep you posted
on progress.
Rgds,
Dave.
---snip---
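For reference, a minimal sketch of why such checks become redundant once the
API moves from function pointer variables to inline functions. All names
ending in _sketch are hypothetical and are not part of the power library.
The old style needed a NULL test because the pointer could be unset; with an
inline function the address is never NULL, so a test such as
"if (!new_freq_up_sketch)" is always false and compilers typically warn about
it (for example GCC's -Waddress).

```c
#include <stddef.h>

/* Old style: a global function pointer that stays NULL until an
 * environment is selected, so callers had to test it before use. */
typedef int (*freq_change_sketch_t)(unsigned int lcore_id);
static freq_change_sketch_t old_freq_up_sketch = NULL;

/* New style: an ordinary inline function; its address can never be NULL. */
static inline int
new_freq_up_sketch(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1; /* pretend the frequency was changed */
}

/* Caller written against the old API: the NULL test is meaningful. */
static int
scale_up_old_sketch(unsigned int lcore_id)
{
	if (old_freq_up_sketch == NULL)
		return -1;
	return old_freq_up_sketch(lcore_id);
}

/* Caller written against the new API: no pointer test is needed,
 * the function is simply called. */
static int
scale_up_new_sketch(unsigned int lcore_id)
{
	return new_freq_up_sketch(lcore_id);
}
```

In the old form, forgetting the test risks a NULL call; in the new form, the
test itself is the thing the compiler complains about.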
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v1 1/4] power: refactor core power management library
2024-07-20 16:50 ` [PATCH v1 1/4] power: refactor core " Sivaprasad Tummala
@ 2024-07-23 10:03 ` Hunt, David
2024-07-27 18:44 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: Hunt, David @ 2024-07-23 10:03 UTC (permalink / raw)
To: Sivaprasad Tummala, anatoly.burakov, jerinj, lihuisong,
david.marchand, ferruh.yigit, konstantin.ananyev
Cc: dev
Hi Sivaprasad,
A couple of comments below:
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> Signed-off-by: Sivaprasad Tummala<sivaprasad.tummala@amd.com>
> ---
> drivers/meson.build | 1 +
> .../power/acpi/acpi_cpufreq.c | 22 +-
> .../power/acpi/acpi_cpufreq.h | 6 +-
> drivers/power/acpi/meson.build | 10 +
> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
> drivers/power/amd_pstate/meson.build | 10 +
> .../power/cppc/cppc_cpufreq.c | 22 +-
> .../power/cppc/cppc_cpufreq.h | 8 +-
> drivers/power/cppc/meson.build | 10 +
> .../power/kvm_vm}/guest_channel.c | 0
> .../power/kvm_vm}/guest_channel.h | 0
> .../power/kvm_vm/kvm_vm.c | 22 +-
> .../power/kvm_vm/kvm_vm.h | 6 +-
> drivers/power/kvm_vm/meson.build | 16 +
> drivers/power/meson.build | 12 +
> drivers/power/pstate/meson.build | 10 +
> .../power/pstate/pstate_cpufreq.c | 22 +-
> .../power/pstate/pstate_cpufreq.h | 6 +-
> lib/power/meson.build | 7 +-
> lib/power/power_common.c | 2 +-
> lib/power/power_common.h | 16 +-
> lib/power/rte_power.c | 287 ++++++------------
> lib/power/rte_power.h | 139 ++++++---
> lib/power/rte_power_core_ops.h | 208 +++++++++++++
> lib/power/version.map | 14 +
> 26 files changed, 618 insertions(+), 270 deletions(-)
> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> create mode 100644 drivers/power/acpi/meson.build
> rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> create mode 100644 drivers/power/amd_pstate/meson.build
> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> create mode 100644 drivers/power/cppc/meson.build
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> create mode 100644 drivers/power/kvm_vm/meson.build
> create mode 100644 drivers/power/meson.build
> create mode 100644 drivers/power/pstate/meson.build
> rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> create mode 100644 lib/power/rte_power_core_ops.h
--snip--
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> index 36c3f3da98..8afb5949b9 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -8,153 +8,86 @@
> #include <rte_spinlock.h>
>
> #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
> #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>
> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static struct rte_power_core_ops *global_power_core_ops;
Suggest initialising this to NULL so we can check in
rte_power_get_core_ops if it's null and throw an error.
--snip--
> +struct rte_power_core_ops *
> +rte_power_get_core_ops(void)
> +{
Need a check here to see if global_power_core_ops is NULL. If it is,
then the developer has probably called a frequency change API before the
relevant init function, so throw an error.
Also, all the functions that call this need to check if it returns NULL
so as to avoid a segfault when they attempt to call the op function.
> + return global_power_core_ops;
> +}
> +
--snip--
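A minimal sketch of the NULL handling suggested in the comments above. It is
a standalone illustration, not the library code: struct core_ops_sketch,
get_core_ops_sketch() and the other *_sketch names are hypothetical stand-ins
for rte_power_core_ops and rte_power_get_core_ops(), and returning -ENOTSUP
is an assumption about how the error might be reported.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for struct rte_power_core_ops. */
struct core_ops_sketch {
	int (*freq_up)(unsigned int lcore_id);
};

/* Explicitly NULL until an init routine selects a driver. */
static struct core_ops_sketch *global_ops_sketch = NULL;

/* Stand-in for what the init path would do once a driver is chosen. */
static void
set_core_ops_sketch(struct core_ops_sketch *ops)
{
	global_ops_sketch = ops;
}

/* Accessor: report the problem, then hand the (possibly NULL) pointer back. */
static struct core_ops_sketch *
get_core_ops_sketch(void)
{
	if (global_ops_sketch == NULL)
		fprintf(stderr, "power ops not set; was init called?\n");
	return global_ops_sketch;
}

/* Wrapper: checks the accessor result before dereferencing it. */
static inline int
freq_up_sketch(unsigned int lcore_id)
{
	struct core_ops_sketch *ops = get_core_ops_sketch();

	if (ops == NULL || ops->freq_up == NULL)
		return -ENOTSUP;
	return ops->freq_up(lcore_id);
}

/* Dummy driver used only to exercise the wrapper. */
static int
dummy_freq_up_sketch(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static struct core_ops_sketch dummy_ops_sketch = {
	.freq_up = dummy_freq_up_sketch,
};
```

Calling freq_up_sketch() before set_core_ops_sketch() returns an error
instead of dereferencing a NULL pointer; after registration it forwards to
the driver callback.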
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> index 4fa4afe399..5e4aacf08b 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power.h
> @@ -1,5 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
> #ifndef _RTE_POWER_H
> @@ -14,14 +15,21 @@
> #include <rte_log.h>
> #include <rte_power_guest_channel.h>
>
> +#include "rte_power_core_ops.h"
> +
> #ifdef __cplusplus
> extern "C" {
> #endif
>
> /* Power Management Environment State */
> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> - PM_ENV_AMD_PSTATE_CPUFREQ};
> +enum power_management_env {
> + PM_ENV_NOT_SET = 0,
> + PM_ENV_ACPI_CPUFREQ,
> + PM_ENV_KVM_VM,
> + PM_ENV_PSTATE_CPUFREQ,
> + PM_ENV_CPPC_CPUFREQ,
> + PM_ENV_AMD_PSTATE_CPUFREQ
> +};
>
> /**
> * Check if a specific power management environment type is supported on a
> @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
> */
> enum power_management_env rte_power_get_env(void);
>
> +/**
> + * @internal Get the power ops struct from its index.
> + *
> + * @return
> + * The pointer to the ops struct in the table if registered.
> + */
> +struct rte_power_core_ops *
> +rte_power_get_core_ops(void);
> +
> /**
> * Initialize power management for a specific lcore. If rte_power_set_env() has
> * not been called then an auto-detect of the environment will start and
> @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
> * @return
> * The number of available frequencies.
> */
> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> - uint32_t num);
> +static inline uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_freqs_t rte_power_freqs;
> + return ops->get_avail_freqs(lcore_id, freqs, n);
This function will segfault if it is called before the appropriate init is
performed. See comments above on global_power_core_ops.
Same for all the functions below that call rte_power_get_core_ops().
> +}
>
> /**
> * Return the current index of available frequencies of a specific lcore.
> @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
> * @return
> * The current index of available frequencies.
> */
> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> +static inline uint32_t
> +rte_power_get_freq(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_get_freq_t rte_power_get_freq;
> + return ops->get_freq(lcore_id);
> +}
>
> /**
> * Set the new frequency for a specific lcore by indicating the index of
> @@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
> * - 0 on success without frequency changed.
> * - Negative on error.
> */
> -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> -
> -extern rte_power_set_freq_t rte_power_set_freq;
> +static inline int
> +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -/**
> - * Function pointer definition for generic frequency change functions. Review
> - * each environments specific documentation for usage.
> - *
> - * @param lcore_id
> - * lcore id.
> - *
> - * @return
> - * - 1 on success with frequency changed.
> - * - 0 on success without frequency changed.
> - * - Negative on error.
> - */
> -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> + return ops->set_freq(lcore_id, index);
> +}
>
> /**
> * Scale up the frequency of a specific lcore according to the available
> * frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_up;
> +static inline int
> +rte_power_freq_up(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_up(lcore_id);
> +}
>
> /**
> * Scale down the frequency of a specific lcore according to the available
> * frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_down;
> +static inline int
> +rte_power_freq_down(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_down(lcore_id);
> +}
>
> /**
> * Scale up the frequency of a specific lcore to the highest according to the
> * available frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_max;
> +static inline int
> +rte_power_freq_max(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_max(lcore_id);
> +}
>
> /**
> * Scale down the frequency of a specific lcore to the lowest according to the
> * available frequencies.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_min;
> +static inline int
> +rte_power_freq_min(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_min(lcore_id);
> +}
>
> /**
> * Query the Turbo Boost status of a specific lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_turbo_status;
> +static inline int
> +rte_power_turbo_status(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->turbo_status(lcore_id);
> +}
>
> /**
> * Enable Turbo Boost for this lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
> +static inline int
> +rte_power_freq_enable_turbo(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->enable_turbo(lcore_id);
> +}
>
> /**
> * Disable Turbo Boost for this lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
> +static inline int
> +rte_power_freq_disable_turbo(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -/**
> - * Power capabilities summary.
> - */
> -struct rte_power_core_capabilities {
> - union {
> - uint64_t capabilities;
> - struct {
> - uint64_t turbo:1; /**< Turbo can be enabled. */
> - uint64_t priority:1; /**< SST-BF high freq core */
> - };
> - };
> -};
> + return ops->disable_turbo(lcore_id);
> +}
>
> /**
> * Returns power capabilities for a specific lcore.
> @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
> * - 0 on success.
> * - Negative on error.
> */
> -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> - struct rte_power_core_capabilities *caps);
> +static inline int
> +rte_power_get_capabilities(unsigned int lcore_id,
> + struct rte_power_core_capabilities *caps)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_get_capabilities_t rte_power_get_capabilities;
> + return ops->get_caps(lcore_id, caps);
> +}
>
> #ifdef __cplusplus
> }
--snip--
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v1 2/4] power: refactor uncore power management library
2024-07-20 16:50 ` [PATCH v1 2/4] power: refactor uncore " Sivaprasad Tummala
@ 2024-07-23 10:26 ` Hunt, David
0 siblings, 0 replies; 139+ messages in thread
From: Hunt, David @ 2024-07-23 10:26 UTC (permalink / raw)
To: Sivaprasad Tummala, anatoly.burakov, jerinj, lihuisong,
david.marchand, ferruh.yigit, konstantin.ananyev
Cc: dev
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
> This patch refactors the power management library, addressing uncore
> power management. The primary changes involve the creation of dedicated
> directories for each driver within 'drivers/power/uncore/*'. The
> adjustment of meson.build files enables the selective activation
> of individual drivers.
>
> This refactor significantly improves code organization, enhances
> clarity and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Signed-off-by: Sivaprasad Tummala<sivaprasad.tummala@amd.com>
> ---
> .../power/intel_uncore/intel_uncore.c | 18 +-
> .../power/intel_uncore/intel_uncore.h | 8 +-
> drivers/power/intel_uncore/meson.build | 7 +
> drivers/power/meson.build | 3 +-
> lib/power/meson.build | 2 +-
> lib/power/rte_power_uncore.c | 206 ++++++---------
> lib/power/rte_power_uncore.h | 91 ++++---
> lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
> lib/power/version.map | 1 +
> 9 files changed, 406 insertions(+), 169 deletions(-)
> rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
> rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
> create mode 100644 drivers/power/intel_uncore/meson.build
> create mode 100644 lib/power/rte_power_uncore_ops.h
>
--snip--
>
> diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
> index 48c75a5da0..127f6ed212 100644
> --- a/lib/power/rte_power_uncore.c
> +++ b/lib/power/rte_power_uncore.c
> @@ -1,6 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> - * Copyright(c) 2023 AMD Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
--snip--
> +struct rte_power_uncore_ops *
> +rte_power_get_uncore_ops(void)
> +{
> + RTE_ASSERT(global_uncore_ops != NULL);
I'm only seeing this now after sending the email for the first patch.
This would be a good solution for the global_power_core_ops check in
rte_power_get_core_ops() in rte_power.c, and would be the smaller
change, rather than checking everywhere rte_power_get_core_ops() is called.
> +
> + return global_uncore_ops;
> }
>
--snip--
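A rough sketch of the assert-in-the-accessor pattern discussed above, written
as standalone C. The standard assert() is used here as a stand-in for
RTE_ASSERT, and every *_sketch name is hypothetical. One caveat worth
checking: RTE_ASSERT is, as far as I know, only active when the build defines
RTE_ENABLE_ASSERT, much as assert() is compiled out under NDEBUG, so in
release builds this guard may not fire.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct rte_power_uncore_ops. */
struct uncore_ops_sketch {
	int (*freq_max)(unsigned int pkg, unsigned int die);
};

static struct uncore_ops_sketch *global_uncore_ops_sketch;

/* Stand-in for driver selection during init. */
static void
set_uncore_ops_sketch(struct uncore_ops_sketch *ops)
{
	global_uncore_ops_sketch = ops;
}

/* Single checkpoint: every wrapper goes through this accessor, so the
 * "ops must be set" invariant is asserted in exactly one place. */
static struct uncore_ops_sketch *
get_uncore_ops_sketch(void)
{
	assert(global_uncore_ops_sketch != NULL);
	return global_uncore_ops_sketch;
}

/* Wrapper: no per-call NULL check, it relies on the accessor's assert. */
static inline int
uncore_freq_max_sketch(unsigned int pkg, unsigned int die)
{
	return get_uncore_ops_sketch()->freq_max(pkg, die);
}

/* Dummy driver callback used only to exercise the wrapper. */
static int
dummy_uncore_freq_max_sketch(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 1;
}

static struct uncore_ops_sketch dummy_uncore_ops_sketch = {
	.freq_max = dummy_uncore_freq_max_sketch,
};
```

The trade-off against explicit NULL checks in each wrapper is a smaller
change and no extra branch in the wrappers, at the cost of an abort rather
than an error return when the API is used before init.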
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors
2024-07-20 16:50 ` [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
@ 2024-07-23 10:33 ` Hunt, David
2024-07-27 18:46 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: Hunt, David @ 2024-07-23 10:33 UTC (permalink / raw)
To: Sivaprasad Tummala, anatoly.burakov, jerinj, lihuisong,
david.marchand, ferruh.yigit, konstantin.ananyev
Cc: dev
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
> This patch introduces driver support for power management of uncore
> components in AMD EPYC processors.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> drivers/power/amd_uncore/amd_uncore.c | 321 ++++++++++++++++++++++++++
> drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
> drivers/power/amd_uncore/meson.build | 20 ++
> drivers/power/meson.build | 1 +
> 4 files changed, 568 insertions(+)
> create mode 100644 drivers/power/amd_uncore/amd_uncore.c
> create mode 100644 drivers/power/amd_uncore/amd_uncore.h
> create mode 100644 drivers/power/amd_uncore/meson.build
>
> diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
> new file mode 100644
> index 0000000000..f15eaaa307
> --- /dev/null
> +++ b/drivers/power/amd_uncore/amd_uncore.c
> @@ -0,0 +1,321 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> + */
> +
> +#include <errno.h>
> +#include <dirent.h>
> +#include <fnmatch.h>
> +
> +#include <rte_memcpy.h>
> +
> +#include "amd_uncore.h"
> +#include "power_common.h"
> +#include "e_smi/e_smi.h"
> +
> +#define MAX_UNCORE_FREQS 8
> +#define MAX_NUMA_DIE 8
> +
> +#define BUS_FREQ 1000
> +
> +struct __rte_cache_aligned uncore_power_info {
> + unsigned int die; /* Core die id */
> + unsigned int pkg; /* Package id */
> + uint32_t freqs[MAX_UNCORE_FREQS]; /* Frequency array */
> + uint32_t nb_freqs; /* Number of available freqs */
> + uint32_t curr_idx; /* Freq index in freqs array */
> + uint32_t max_freq; /* System max uncore freq */
> + uint32_t min_freq; /* System min uncore freq */
> +};
> +
> +static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
> +static int esmi_initialized;
> +
> +static int
> +set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
> +{
> + int ret;
> +
> + if (idx >= MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
> + POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
> + "should be less than %u", idx, ui->nb_freqs);
> + return -1;
> + }
> +
> + ret = esmi_apb_disable(ui->pkg, idx);
> + if (ret != ESMI_SUCCESS) {
> + POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
> + idx, ui->pkg);
> + return -1;
> + }
> +
> + POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
> + idx, ui->pkg, ui->die);
> +
> + /* write the minimum value first if the target freq is less than current max */
> + ui->curr_idx = idx;
> +
> + return 0;
> +}
> +
> +/*
> + * Fopen the sys file for the future setting of the uncore die frequency.
> + */
Comment may need updating, as function is not reading any sysfs files
(for the moment, at least).
> +static int
> +power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
> +{
> + /* open and read all uncore sys files */
Comment may need updating, as function is not reading any sysfs files
(for the moment, at least).
> + /* Base max */
> + ui->max_freq = 1800000;
> + ui->min_freq = 1200000;
> +
> + return 0;
> +}
> +
> +/*
> + * Get the available uncore frequencies of the specific die by reading the
> + * sys file.
> + */
Comment may need updating, as function is not reading any sysfs files. 3
uncore frequencies hard-coded for the moment, may get via esmi or sysfs
in the future.
> +static int
> +power_get_available_uncore_freqs(struct uncore_power_info *ui)
> +{
> + int ret = -1;
> + uint32_t i, num_uncore_freqs = 3;
> + uint32_t fabric_freqs[] = {
> + /* to be extended for probing support in future */
> + 1800,
> + 1444,
> + 1200
> + };
> +
> + if (num_uncore_freqs >= MAX_UNCORE_FREQS) {
> + POWER_LOG(ERR, "Too many available uncore frequencies: %d",
> + num_uncore_freqs);
> + goto out;
> + }
> +
> + /* Generate the uncore freq bucket array. */
> + for (i = 0; i < num_uncore_freqs; i++)
> + ui->freqs[i] = fabric_freqs[i] * BUS_FREQ;
> +
> + ui->nb_freqs = num_uncore_freqs;
> +
> + ret = 0;
> +
> + POWER_DEBUG_LOG("%d frequency(s) of pkg %02u die %02u are available",
> + num_uncore_freqs, ui->pkg, ui->die);
> +
> +out:
> + return ret;
> +}
> +
--snip--
>
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v1 1/4] power: refactor core power management library
2024-07-23 10:03 ` Hunt, David
@ 2024-07-27 18:44 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-07-27 18:44 UTC (permalink / raw)
To: Hunt, David, anatoly.burakov, jerinj, lihuisong, david.marchand,
Yigit, Ferruh, konstantin.ananyev
Cc: dev
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Dave,
From: Hunt, David <david.hunt@intel.com>
Sent: Tuesday, July 23, 2024 3:34 PM
To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>; anatoly.burakov@intel.com; jerinj@marvell.com; lihuisong@huawei.com; david.marchand@redhat.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com
Cc: dev@dpdk.org
Subject: Re: [PATCH v1 1/4] power: refactor core power management library
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
Hi Sivaprasad,
A couple of comments below:
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/core/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_core_ops.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 618 insertions(+), 270 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
--snip--
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..8afb5949b9 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,153 +8,86 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
Suggest initialising this to NULL so that rte_power_get_core_ops() can check for NULL and report an error.
[Siva] As a static global, global_power_core_ops is already zero-initialised to NULL, so I'm not sure an explicit NULL initialiser is still required.
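For reference, C gives objects with static storage duration a zero initial value, so a file-scope static pointer starts out as a null pointer even without an explicit initialiser. A minimal sketch of that point follows; the struct and variable names are stand-ins for illustration, not the library's definitions:

```c
#include <stddef.h>

/* Hypothetical stand-in for the real ops structure. */
struct demo_core_ops {
	const char *name;
};

/* No initialiser: static storage duration implies a null pointer here. */
static struct demo_core_ops *demo_implicit_ops;

/* Explicit initialiser: same starting value, but documents the intent. */
static struct demo_core_ops *demo_explicit_ops = NULL;

/* Returns 1 when both pointers start out as NULL. */
static int
demo_ops_start_null(void)
{
	return demo_implicit_ops == NULL && demo_explicit_ops == NULL;
}
```

Either form gives the accessor something well-defined to test against; the explicit initialiser only changes readability.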
--snip--
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
Need a check here to see if global_power_core_ops is NULL. If it is, then the developer has probably called a frequency change API before the relevant init function, so throw an error.
Also, all the functions that call this need to check whether it returns NULL, so as to avoid a segfault when they attempt to call the op function.
[Siva] ACK. Will fix this in next version.
+ return global_power_core_ops;
+}
+
--snip--
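The review point above (check for NULL in the accessor, and again in every wrapper that uses it) could look roughly like the sketch below. This is only an illustration under stated assumptions: the `demo_*` names, the trimmed-down ops table, the use of `fprintf` in place of the library's logging macro, and the `UINT32_MAX` sentinel are all hypothetical and not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down ops table; the real one has more callbacks. */
struct demo_core_ops {
	uint32_t (*get_freq)(unsigned int lcore_id);
};

static struct demo_core_ops *demo_global_ops; /* NULL until an env is set */

/* Accessor: report the misuse instead of silently returning NULL. */
static struct demo_core_ops *
demo_get_core_ops(void)
{
	if (demo_global_ops == NULL)
		fprintf(stderr, "power ops not initialised; call the init API first\n");
	return demo_global_ops;
}

/* Wrapper: bail out with a sentinel rather than dereferencing NULL. */
static uint32_t
demo_get_freq(unsigned int lcore_id)
{
	struct demo_core_ops *ops = demo_get_core_ops();

	if (ops == NULL || ops->get_freq == NULL)
		return UINT32_MAX; /* assumed "invalid index" sentinel */
	return ops->get_freq(lcore_id);
}

/* Dummy backend used only to exercise the wrapper. */
static uint32_t
demo_backend_get_freq(unsigned int lcore_id)
{
	return lcore_id + 1;
}

static struct demo_core_ops demo_backend = {
	.get_freq = demo_backend_get_freq,
};

/* Stand-in for whatever the init path does to select a backend. */
static void
demo_register(struct demo_core_ops *ops)
{
	demo_global_ops = ops;
}
```

Calling `demo_get_freq()` before `demo_register(&demo_backend)` returns the sentinel instead of crashing; after registration it forwards to the backend. Which error value each real wrapper should return is a design choice left to the patch author.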
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..5e4aacf08b 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_core_ops.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the power ops struct from its index.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
This function will segfault if it is called before the appropriate init is performed. See comments above on global_power_core_ops.
Same for all the functions below that call rte_power_get_core_ops().
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
--snip--
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v1 3/4] test/power: removed function pointer validations
2024-07-22 10:49 ` Hunt, David
@ 2024-07-27 18:45 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-07-27 18:45 UTC (permalink / raw)
To: Hunt, David, anatoly.burakov, jerinj, lihuisong, david.marchand,
Yigit, Ferruh, konstantin.ananyev
Cc: dev
Hi Dave,
Inline..
From: Hunt, David <david.hunt@intel.com>
Sent: Monday, July 22, 2024 4:20 PM
To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>; anatoly.burakov@intel.com; jerinj@marvell.com; lihuisong@huawei.com; david.marchand@redhat.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com
Cc: dev@dpdk.org
Subject: Re: [PATCH v1 3/4] test/power: removed function pointer validations
On 20/07/2024 17:50, Sivaprasad Tummala wrote:
After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment, making
function pointer checks unnecessary and thus removed from applications.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
3 files changed, 183 deletions(-)
Hi Sivaprasad,
Nice work on the patch-set.
There are just four function pointer checks remaining that my compiler is complaining about. They are in examples/l3fwd-power/main.c (lines 443, 452, 1350, 1353). It would be nice to have these removed as well, seeing as the functions are now inlines and don't need these checks.
[Siva] ACK. Will fix this in next version.
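To illustrate why such checks become redundant once the API symbols are inline functions rather than function-pointer variables, here is a small sketch. The `demo_*` names are hypothetical and the bodies are placeholders; only the shape of the old and new declarations mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Old style: the API symbol is a pointer variable that may still be NULL. */
typedef int (*demo_freq_change_t)(unsigned int lcore_id);
static demo_freq_change_t demo_old_freq_up; /* unset until an env is chosen */

/* New style: the API symbol is a real function, so its address is never NULL. */
static inline int
demo_new_freq_up(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1; /* pretend the frequency was raised */
}

/* Old caller: the NULL check is meaningful. */
static int
demo_call_old(unsigned int lcore_id)
{
	if (demo_old_freq_up == NULL)
		return -1;
	return demo_old_freq_up(lcore_id);
}

/*
 * New caller: no check. Writing "if (demo_new_freq_up == NULL)" would always
 * be false and typically draws a compiler warning (for example GCC's
 * -Waddress).
 */
static int
demo_call_new(unsigned int lcore_id)
{
	return demo_new_freq_up(lcore_id);
}
```

The remaining question after removing the checks is whether the ops table behind the inline function has been set up, which is a separate runtime check from the address comparison.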
I'm running the patch set through some tests here, will keep you posted on progress.
Rgds,
Dave.
---snip---
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors
2024-07-23 10:33 ` Hunt, David
@ 2024-07-27 18:46 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-07-27 18:46 UTC (permalink / raw)
To: Hunt, David, anatoly.burakov, jerinj, lihuisong, david.marchand,
Yigit, Ferruh, konstantin.ananyev
Cc: dev
Hi Dave,
> -----Original Message-----
> From: Hunt, David <david.hunt@intel.com>
> Sent: Tuesday, July 23, 2024 4:03 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> anatoly.burakov@intel.com; jerinj@marvell.com; lihuisong@huawei.com;
> david.marchand@redhat.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> konstantin.ananyev@huawei.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v1 4/4] power/amd_uncore: uncore power management
> support for AMD EPYC processors
>
>
> On 20/07/2024 17:50, Sivaprasad Tummala wrote:
> > This patch introduces driver support for power management of uncore
> > components in AMD EPYC processors.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> > drivers/power/amd_uncore/amd_uncore.c | 321 ++++++++++++++++++++++++++
> > drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
> > drivers/power/amd_uncore/meson.build | 20 ++
> > drivers/power/meson.build | 1 +
> > 4 files changed, 568 insertions(+)
> > create mode 100644 drivers/power/amd_uncore/amd_uncore.c
> > create mode 100644 drivers/power/amd_uncore/amd_uncore.h
> > create mode 100644 drivers/power/amd_uncore/meson.build
> >
> > diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
> > new file mode 100644
> > index 0000000000..f15eaaa307
> > --- /dev/null
> > +++ b/drivers/power/amd_uncore/amd_uncore.c
> > @@ -0,0 +1,321 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > + */
> > +
> > +#include <errno.h>
> > +#include <dirent.h>
> > +#include <fnmatch.h>
> > +
> > +#include <rte_memcpy.h>
> > +
> > +#include "amd_uncore.h"
> > +#include "power_common.h"
> > +#include "e_smi/e_smi.h"
> > +
> > +#define MAX_UNCORE_FREQS 8
> > +#define MAX_NUMA_DIE 8
> > +
> > +#define BUS_FREQ 1000
> > +
> > +struct __rte_cache_aligned uncore_power_info {
> > + unsigned int die; /* Core die id */
> > + unsigned int pkg; /* Package id */
> > + uint32_t freqs[MAX_UNCORE_FREQS]; /* Frequency array */
> > + uint32_t nb_freqs; /* Number of available freqs */
> > + uint32_t curr_idx; /* Freq index in freqs array */
> > + uint32_t max_freq; /* System max uncore freq */
> > + uint32_t min_freq; /* System min uncore freq */
> > +};
> > +
> > +static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
> > +static int esmi_initialized;
> > +
> > +static int
> > +set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
> > +{
> > + int ret;
> > +
> > + if (idx >= MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
> > + POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
> > + "should be less than %u", idx, ui->nb_freqs);
> > + return -1;
> > + }
> > +
> > + ret = esmi_apb_disable(ui->pkg, idx);
> > + if (ret != ESMI_SUCCESS) {
> > + POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
> > + idx, ui->pkg);
> > + return -1;
> > + }
> > +
> > + POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
> > + idx, ui->pkg, ui->die);
> > +
> > + /* write the minimum value first if the target freq is less than current max */
> > + ui->curr_idx = idx;
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * Fopen the sys file for the future setting of the uncore die frequency.
> > + */
>
>
> Comment may need updating, as function is not reading any sysfs files (for the
> moment, at least).
ACK! Will address this in next version.
>
>
> > +static int
> > +power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
> > +{
> > + /* open and read all uncore sys files */
>
>
> Comment may need updating, as function is not reading any sysfs files (for the
> moment, at least).
ACK! Will address this in next version.
>
>
>
> > + /* Base max */
> > + ui->max_freq = 1800000;
> > + ui->min_freq = 1200000;
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * Get the available uncore frequencies of the specific die by reading the
> > + * sys file.
> > + */
>
>
> Comment may need updating, as function is not reading any sysfs files. 3
> uncore frequencies hard-coded for the moment, may get via esmi or sysfs in
> the future.
ACK! Will address this in next version.
>
>
> > +static int
> > +power_get_available_uncore_freqs(struct uncore_power_info *ui)
> > +{
> > + int ret = -1;
> > + uint32_t i, num_uncore_freqs = 3;
> > + uint32_t fabric_freqs[] = {
> > + /* to be extended for probing support in future */
> > + 1800,
> > + 1444,
> > + 1200
> > + };
> > +
> > + if (num_uncore_freqs >= MAX_UNCORE_FREQS) {
> > + POWER_LOG(ERR, "Too many available uncore frequencies: %d",
> > + num_uncore_freqs);
> > + goto out;
> > + }
> > +
> > + /* Generate the uncore freq bucket array. */
> > + for (i = 0; i < num_uncore_freqs; i++)
> > + ui->freqs[i] = fabric_freqs[i] * BUS_FREQ;
> > +
> > + ui->nb_freqs = num_uncore_freqs;
> > +
> > + ret = 0;
> > +
> > + POWER_DEBUG_LOG("%d frequency(s) of pkg %02u die %02u are available",
> > + num_uncore_freqs, ui->pkg, ui->die);
> > +
> > +out:
> > + return ret;
> > +}
> > +
>
>
> --snip--
>
> >
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v2 0/4] power: refactor power management library
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor power management library Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
` (6 more replies)
5 siblings, 7 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of a dedicated directory for each driver under
'drivers/power/*' (for example 'drivers/power/acpi' and
'drivers/power/intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless integration
of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (4):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore power management support for AMD EPYC
processors
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 291 ++++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_core_ops.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 205 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
39 files changed, 1604 insertions(+), 625 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v2 1/4] power: refactor core power management library
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-08-26 15:26 ` Stephen Hemminger
2024-08-27 8:21 ` lihuisong (C)
2024-08-26 13:06 ` [PATCH v2 2/4] power: refactor uncore " Sivaprasad Tummala
` (5 subsequent siblings)
6 siblings, 2 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
v2:
- added NULL check for global_power_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 291 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_core_ops.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 621 insertions(+), 271 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
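Review note: the definition of RTE_POWER_REGISTER_OPS is not visible in this part of the patch, so the following is only a minimal standalone sketch of the general pattern, assuming the macro expands to a constructor function that calls the register routine before main() runs. Every demo_* name is hypothetical and not part of the DPDK API, and the constructor attribute assumes GCC or Clang.

```c
/* Standalone sketch of constructor-based driver registration.
 * All demo_* names are hypothetical; this is not DPDK code.
 */
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

struct demo_power_ops {
	TAILQ_ENTRY(demo_power_ops) next;   /* list linkage */
	const char *name;                   /* driver name, e.g. "acpi" */
	int (*init)(unsigned int lcore_id); /* one representative callback */
};

/* Statically initialised, so it is valid before any constructor runs. */
static TAILQ_HEAD(, demo_power_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops structs, then append to the list. */
int
demo_register_ops(struct demo_power_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;

	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Assumed shape of a registration macro: emit a constructor per driver. */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) demo_ctor_##ops(void) \
	{ \
		(void)demo_register_ops(&ops); \
	}

static int
demo_acpi_init(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}

static struct demo_power_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
};

DEMO_REGISTER_OPS(demo_acpi_ops)

/* Look a driver up by name; returns NULL when nothing matches. */
struct demo_power_ops *
demo_find_ops(const char *name)
{
	struct demo_power_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strcmp(ops->name, name) == 0)
			return ops;

	return NULL;
}
```

With this shape the library never names a driver directly: a driver becomes visible simply by being linked in, which is what lets the per-environment switch statements in rte_power.c go away.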
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..1194a7e2a5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 090a0d96cb..f571f4184a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -700,3 +700,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..b04b2f28c0 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index 32aaacb948..775b8f4434 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -685,3 +685,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..d6e32fdd47 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..64086a67e7 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 2343121621..c32b1adabc 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..8b67b2da21 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_core_ops.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..f3e3451cdc 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_core_ops.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..2bf6d40517 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,153 +8,86 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* Register the ops struct in the core ops list; return 0 on success, -EINVAL if a callback is missing. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +97,66 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
-int
-rte_power_init(unsigned int lcore_id)
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
{
- int ret = -1;
+ RTE_ASSERT(global_power_core_ops != NULL);
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ return global_power_core_ops;
+}
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+int
+rte_power_init(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ if (ops->init(lcore_id) == 0) {
+ for (env = 0; env < RTE_DIM(power_env_str); env++)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
+ }
+
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
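Review note: in the hunk above, rte_power_check_env_supported() checks env against RTE_DIM(power_env_str) before indexing the table, while rte_power_set_env() indexes power_env_str[env] without that guard. A minimal standalone sketch of a bounded lookup in both directions follows. The demo_* names are hypothetical; only the strings are copied from power_env_str in the patch.

```c
/* Standalone sketch of a bounds-checked enum <-> name mapping.
 * All demo_* names are hypothetical; this is not DPDK code.
 */
#include <stddef.h>
#include <string.h>

enum demo_env {
	DEMO_ENV_NOT_SET = 0,
	DEMO_ENV_ACPI,
	DEMO_ENV_KVM_VM,
	DEMO_ENV_PSTATE,
	DEMO_ENV_CPPC,
	DEMO_ENV_AMD_PSTATE
};

/* Same order as the enum, so the enum value is the table index. */
static const char *const demo_env_str[] = {
	"not set",
	"acpi",
	"kvm-vm",
	"pstate",
	"cppc",
	"amd-pstate"
};

#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Map an environment value to its driver name; NULL if out of range. */
const char *
demo_env_name(unsigned int env)
{
	if (env >= DEMO_DIM(demo_env_str))
		return NULL;

	return demo_env_str[env];
}

/* Map a driver name back to its environment value; -1 if unknown.
 * Index 0 ("not set") is skipped because it names no driver.
 */
int
demo_env_from_name(const char *name)
{
	unsigned int i;

	if (name == NULL)
		return -1;

	for (i = 1; i < DEMO_DIM(demo_env_str); i++)
		if (strcmp(name, demo_env_str[i]) == 0)
			return (int)i;

	return -1;
}
```

Routing every table access through one helper like demo_env_name() would keep the range check in a single place instead of repeating it at each call site.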
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..5e4aacf08b 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_core_ops.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * The pointer to the registered ops struct; the environment must be set first.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_core_ops.h b/lib/power/rte_power_core_ops.h
new file mode 100644
index 0000000000..356a64df79
--- /dev/null
+++ b/lib/power/rte_power_core_ops.h
@@ -0,0 +1,208 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CORE_OPS_H
+#define _RTE_POWER_CORE_OPS_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+ RTE_INIT(power_hdlr_init_##ops) \
+ { \
+ rte_power_register_ops(&ops); \
+ }
+
+/**
+ * @internal Get the core power ops struct that is currently selected.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..bd64e0828f 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.07
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
* [PATCH v2 2/4] power: refactor uncore power management library
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-08-27 13:02 ` lihuisong (C)
2024-08-26 13:06 ` [PATCH v2 3/4] test/power: removed function pointer validations Sivaprasad Tummala
` (4 subsequent siblings)
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the uncore part of the power management library.
Each uncore driver moves into its own directory under 'drivers/power/'
(here 'drivers/power/intel_uncore/') and registers its callbacks with
the library through an ops structure. The accompanying meson.build
changes allow individual drivers to be enabled selectively.
Separating the drivers from the library keeps driver-specific code out
of lib/power and makes it easier to add new drivers later, in
particular the AMD uncore driver.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
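Note (not part of the patch): the registration and dispatch scheme
described in the commit message can be illustrated with the minimal,
self-contained sketch below. It is only an approximation under stated
assumptions: every demo_* and fake_* name is hypothetical, the ops struct
is cut down to two callbacks, plain <sys/queue.h> TAILQ macros stand in
for the RTE_TAILQ_* wrappers, and the spinlock taken by
rte_power_set_uncore_env() is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, cut-down stand-in for struct rte_power_uncore_ops. */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;   /* link in the driver list */
	char name[DEMO_NAMESZ];              /* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	uint32_t (*get_freq)(unsigned int pkg, unsigned int die);
};

/* List of registered drivers and the one currently selected. */
static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);
static struct demo_uncore_ops *demo_active_ops;

/* Register a driver; reject it if a mandatory callback is missing. */
static int
demo_register_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->get_freq == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Select the registered driver whose name matches; -1 if none does. */
static int
demo_select_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next) {
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0) {
			demo_active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Public wrapper: forwards to the selected driver's callback. */
static inline uint32_t
demo_get_uncore_freq(unsigned int pkg, unsigned int die)
{
	assert(demo_active_ops != NULL);
	return demo_active_ops->get_freq(pkg, die);
}

/* A fake driver, used only to exercise the sketch. */
static int
fake_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static uint32_t
fake_get_freq(unsigned int pkg, unsigned int die)
{
	return pkg * 10 + die;
}

static struct demo_uncore_ops fake_ops = {
	.name = "fake-uncore",
	.init = fake_init,
	.get_freq = fake_get_freq,
};
```

In this sketch a driver must be registered before it can be selected by
name, and the wrapper asserts that a driver has been selected before it
dereferences the ops pointer, in the same spirit as the RTE_ASSERT in
rte_power_get_uncore_ops().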
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 205 ++++++---------
lib/power/rte_power_uncore.h | 87 ++++---
lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 405 insertions(+), 164 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index f3e3451cdc..9b13d98810 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..9f8771224f 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
@@ -12,98 +13,50 @@
#include "rte_power_uncore.h"
#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+static const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +66,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,15 +89,22 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
@@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +142,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..c9fba02568 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +12,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +116,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +145,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +160,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +175,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +203,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +228,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +246,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +268,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..623d63800c
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * Callers must provide their own locking around this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * Callers must provide their own locking around this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the package (CPU) specified
+ * from parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+ RTE_INIT(power_hdlr_init_uncore_##ops) \
+ { \
+ rte_power_register_uncore_ops(&ops); \
+ }
+
+/**
+ * @internal Get the currently registered power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index bd64e0828f..f1eabd7c9a 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v2 3/4] test/power: removed function pointer validations
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 2/4] power: refactor uncore " Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
` (3 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment, so the
function pointer checks are unnecessary and are removed from the tests
and from the l3fwd-power application.
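For illustration only, here is a minimal sketch of why a NULL check at the
call site stops being needed once callers go through a wrapper function
instead of a global function pointer. All demo_* names are hypothetical
stand-ins, not DPDK API, and the actual dispatch in lib/power may differ.

```c
#include <stddef.h>

/* Hypothetical per-driver operation type; not the DPDK typedef. */
typedef int (*demo_freq_change_t)(unsigned int lcore_id);

/* Hypothetical ops table holding the driver callback. */
struct demo_power_ops {
	demo_freq_change_t freq_down;
};

/* Ops selected at environment setup time; NULL until a driver is chosen. */
static const struct demo_power_ops *demo_active_ops;

/* Example driver callback: pretends the frequency was lowered one step. */
static int
demo_driver_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static const struct demo_power_ops demo_driver_ops = {
	.freq_down = demo_driver_freq_down,
};

/* Select the active driver ops (or clear them by passing NULL). */
void
demo_power_set_ops(const struct demo_power_ops *ops)
{
	demo_active_ops = ops;
}

/*
 * Wrapper that is always a valid function: the "is a driver present"
 * decision lives here instead of at every call site.
 */
int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_active_ops == NULL || demo_active_ops->freq_down == NULL)
		return -1;
	return demo_active_ops->freq_down(lcore_id);
}
```

With this shape a caller can invoke the wrapper unconditionally and only
needs to look at the return value.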
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v2 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-08-26 13:06 ` [PATCH v2 3/4] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 0/4] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
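As a rough sketch of the frequency-bucket selection the driver performs,
assuming the three-entry tables used in this patch: the demo_* names and the
plain integer protocol version are illustrative only and are not E-SMI or
DPDK identifiers.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_UNCORE_FREQS 3u

/* Illustrative protocol versions; a real driver would query the platform. */
enum demo_hsmp_proto {
	DEMO_HSMP_PROTO_VER2 = 2,
	DEMO_HSMP_PROTO_VER5 = 5,
};

/*
 * Fill freqs[] (highest first) for the given protocol version.
 * Returns the number of entries written, or -1 if the buffer is
 * missing or too small.
 */
int
demo_fill_uncore_freqs(int proto_ver, uint32_t *freqs, uint32_t num)
{
	if (freqs == NULL || num < DEMO_NB_UNCORE_FREQS)
		return -1;

	switch (proto_ver) {
	case DEMO_HSMP_PROTO_VER5:
		freqs[0] = 1800000;
		freqs[1] = 1440000;
		freqs[2] = 1200000;
		break; /* stop here so the legacy table does not overwrite it */
	case DEMO_HSMP_PROTO_VER2:
	default:
		freqs[0] = 1600000;
		freqs[1] = 1333000;
		freqs[2] = 1200000;
		break;
	}

	return (int)DEMO_NB_UNCORE_FREQS;
}
```

Because the table is ordered highest first, index 0 corresponds to the
maximum frequency and the last index to the minimum, which is how the
freq_max and freq_min callbacks pick their targets.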
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 575 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..e667a783cd
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,328 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the DF P-state index that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* Hz */
+ ui->min_freq = 1200000; /* Hz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* Hz */
+ ui->min_freq = 1200000; /* Hz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ default: /* HSMP_PROTO_VER2 and any other version */
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * The caller must provide its own locking for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * The caller must provide its own locking for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * The caller must provide its own locking for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * The caller must provide its own locking for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ unsigned int *freqs, unsigned int num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * This driver currently reports a single die per package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi64"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
* [PATCH v2 0/4] power: refactor power management library
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-08-26 13:06 ` [PATCH v2 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
@ 2024-08-26 13:06 ` Sivaprasad Tummala
2024-10-07 18:01 ` Stephen Hemminger
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-08-26 13:06 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary change is the creation of
a dedicated directory for each driver under 'drivers/power/*'.
This refactor improves code organization and maintainability. It lays
the foundation for more focused development on individual drivers and
eases integration of future enhancements, particularly the AMD uncore
driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
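To make the driver split more concrete, here is a minimal sketch of
constructor-based ops registration, the pattern that
RTE_POWER_REGISTER_UNCORE_OPS wraps with RTE_INIT in this series. It assumes
a GCC/Clang-style constructor attribute; every demo_* name is hypothetical
and the real list handling in lib/power may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced ops struct: a name plus one callback. */
struct demo_uncore_ops {
	struct demo_uncore_ops *next;
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
};

/* Head of the singly linked list of registered drivers. */
static struct demo_uncore_ops *demo_ops_head;

/* Register a driver; rejects incomplete ops with -1. */
int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;
	ops->next = demo_ops_head;
	demo_ops_head = ops;
	return 0;
}

/* Look a driver up by name; returns NULL when it is not registered. */
struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	for (ops = demo_ops_head; ops != NULL; ops = ops->next) {
		if (strcmp(ops->name, name) == 0)
			return ops;
	}
	return NULL;
}

/* Example driver living in its own file/directory in a real tree. */
static int
demo_amd_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_amd_ops = {
	.name = "demo-amd",
	.init = demo_amd_init,
};

/* Runs before main(), so the library core needs no reference to the driver. */
__attribute__((constructor))
static void
demo_amd_register(void)
{
	demo_register_uncore_ops(&demo_amd_ops);
}
```

The point of the pattern is that the library only walks the list, so adding
a driver means adding a directory and a registration call rather than
editing the core dispatch code.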
Sivaprasad Tummala (4):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore power management support for AMD EPYC
processors
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 291 ++++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_core_ops.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 205 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
39 files changed, 1604 insertions(+), 625 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 1/4] power: refactor core power management library
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
@ 2024-08-26 15:26 ` Stephen Hemminger
2024-10-07 19:25 ` Tummala, Sivaprasad
2024-08-27 8:21 ` lihuisong (C)
1 sibling, 1 reply; 139+ messages in thread
From: Stephen Hemminger @ 2024-08-26 15:26 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, dev
On Mon, 26 Aug 2024 13:06:46 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> +static struct rte_power_core_ops acpi_ops = {
> + .name = "acpi",
> + .init = power_acpi_cpufreq_init,
> + .exit = power_acpi_cpufreq_exit,
> + .check_env_support = power_acpi_cpufreq_check_supported,
> + .get_avail_freqs = power_acpi_cpufreq_freqs,
> + .get_freq = power_acpi_cpufreq_get_freq,
> + .set_freq = power_acpi_cpufreq_set_freq,
> + .freq_down = power_acpi_cpufreq_freq_down,
> + .freq_up = power_acpi_cpufreq_freq_up,
> + .freq_max = power_acpi_cpufreq_freq_max,
> + .freq_min = power_acpi_cpufreq_freq_min,
> + .turbo_status = power_acpi_turbo_status,
> + .enable_turbo = power_acpi_enable_turbo,
> + .disable_turbo = power_acpi_disable_turbo,
> + .get_caps = power_acpi_get_capabilities
> +};
> +
Can this be made const?
It is good for security and overall safety to have structures with
function pointers marked const.
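
As an illustration only, here is a minimal sketch of one way to keep the ops table const. In v2, rte_power_register_ops() links the ops struct itself into a TAILQ through its `next` member, so the struct is written to at registration time and cannot be const as it stands. The sketch assumes the list link is moved into a separate node; all `demo_*` names are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down ops table; the real struct has more callbacks. */
struct demo_core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

/* Writable list node kept apart from the table, so the table can be const. */
struct demo_ops_node {
	const struct demo_core_ops *ops;
	struct demo_ops_node *next;
};

static struct demo_ops_node *demo_ops_head;

static int demo_acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_acpi_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

/* const: the function pointers can be placed in read-only data. */
static const struct demo_core_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
	.exit = demo_acpi_exit,
};

static struct demo_ops_node demo_acpi_node = { .ops = &demo_acpi_ops };

/* Register a node; returns 0 on success, -1 if a callback is missing. */
static int
demo_register_ops(struct demo_ops_node *node)
{
	if (node == NULL || node->ops == NULL ||
	    node->ops->init == NULL || node->ops->exit == NULL)
		return -1;
	node->next = demo_ops_head;
	demo_ops_head = node;
	return 0;
}

/* Look up a registered table by name; returns NULL if not found. */
static const struct demo_core_ops *
demo_find_ops(const char *name)
{
	const struct demo_ops_node *n;

	for (n = demo_ops_head; n != NULL; n = n->next)
		if (strcmp(n->ops->name, name) == 0)
			return n->ops;
	return NULL;
}
```

Only the small node is writable here; whether that split is worth the extra indirection is a design choice for the patch author.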
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 1/4] power: refactor core power management library
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
2024-08-26 15:26 ` Stephen Hemminger
@ 2024-08-27 8:21 ` lihuisong (C)
2024-09-12 11:17 ` Tummala, Sivaprasad
1 sibling, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-08-27 8:21 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau,
cristian.dumitrescu, jerinj, konstantin.ananyev, ferruh.yigit,
gakhil
Hi Sivaprasad,
Some comments inline.
/Huisong
On 2024/8/26 21:06, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> v2:
> - added NULL check for global_core_ops in rte_power_get_core_ops
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> drivers/meson.build | 1 +
> .../power/acpi/acpi_cpufreq.c | 22 +-
> .../power/acpi/acpi_cpufreq.h | 6 +-
> drivers/power/acpi/meson.build | 10 +
> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
> drivers/power/amd_pstate/meson.build | 10 +
> .../power/cppc/cppc_cpufreq.c | 22 +-
> .../power/cppc/cppc_cpufreq.h | 8 +-
> drivers/power/cppc/meson.build | 10 +
> .../power/kvm_vm}/guest_channel.c | 0
> .../power/kvm_vm}/guest_channel.h | 0
> .../power/kvm_vm/kvm_vm.c | 22 +-
> .../power/kvm_vm/kvm_vm.h | 6 +-
> drivers/power/kvm_vm/meson.build | 16 +
> drivers/power/meson.build | 12 +
> drivers/power/pstate/meson.build | 10 +
> .../power/pstate/pstate_cpufreq.c | 22 +-
> .../power/pstate/pstate_cpufreq.h | 6 +-
> lib/power/meson.build | 7 +-
> lib/power/power_common.c | 2 +-
> lib/power/power_common.h | 16 +-
> lib/power/rte_power.c | 291 ++++++------------
> lib/power/rte_power.h | 139 ++++++---
> lib/power/rte_power_core_ops.h | 208 +++++++++++++
> lib/power/version.map | 14 +
> 26 files changed, 621 insertions(+), 271 deletions(-)
> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> create mode 100644 drivers/power/acpi/meson.build
> rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> create mode 100644 drivers/power/amd_pstate/meson.build
> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> create mode 100644 drivers/power/cppc/meson.build
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> create mode 100644 drivers/power/kvm_vm/meson.build
> create mode 100644 drivers/power/meson.build
> create mode 100644 drivers/power/pstate/meson.build
> rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> create mode 100644 lib/power/rte_power_core_ops.h
How about using the following directory structure?
*For power libs*
lib/power/power_common.*
lib/power/rte_power_pmd_mgmt.*
lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be
simple for us, but I'm not sure if we can put the init of core, uncore
and pmd mgmt into rte_power_init.c in rte_power.c.)
lib/power/rte_power_uncore_freq_api.*
*And the following directories under drivers/power:*
1> For core dvfs driver:
drivers/power/cpufreq/acpi_cpufreq.c
drivers/power/cpufreq/cppc_cpufreq.c
drivers/power/cpufreq/amd_pstate_cpufreq.c
drivers/power/cpufreq/intel_pstate_cpufreq.c
drivers/power/cpufreq/kvm_cpufreq.c
Each cpufreq driver has little code and probably won't grow much,
so there is no need for a separate directory for each one.
2> For uncore dvfs driver:
drivers/power/uncorefreq/intel_uncore.*
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 66931d4241..9d77e0deab 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -29,6 +29,7 @@ subdirs = [
> 'event', # depends on common, bus, mempool and net.
> 'baseband', # depends on common and bus.
> 'gpu', # depends on common and bus.
> + 'power', # depends on common (in future).
> ]
>
> if meson.is_cross_build()
> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
> similarity index 95%
> rename from lib/power/power_acpi_cpufreq.c
> rename to drivers/power/acpi/acpi_cpufreq.c
I do not suggest creating one directory for each cpufreq driver,
because the pstate drivers also comply with the ACPI spec, right?
In addition, each cpufreq driver has little code.
Having just one file under each directory is not good.
> index 81996e1c13..8637c69703 100644
> --- a/lib/power/power_acpi_cpufreq.c
> +++ b/drivers/power/acpi/acpi_cpufreq.c
> @@ -10,7 +10,7 @@
> #include <rte_stdatomic.h>
> #include <rte_string_fns.h>
>
> -#include "power_acpi_cpufreq.h"
> +#include "acpi_cpufreq.h"
> #include "power_common.h"
>
<...>
> +if not is_linux
> + build = false
> + reason = 'only supported on Linux'
> +endif
> +sources = files('pstate_cpufreq.c')
> +
> +deps += ['power']
> diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
> similarity index 96%
> rename from lib/power/power_pstate_cpufreq.c
> rename to drivers/power/pstate/pstate_cpufreq.c
pstate_cpufreq.c is actually the intel_pstate cpufreq driver, right?
So how about renaming this file to intel_pstate_cpufreq.c?
> index 2343121621..c32b1adabc 100644
> --- a/lib/power/power_pstate_cpufreq.c
> +++ b/drivers/power/pstate/pstate_cpufreq.c
> @@ -15,7 +15,7 @@
> #include <rte_stdatomic.h>
>
> #include "rte_power_pmd_mgmt.h"
> -#include "power_pstate_cpufreq.h"
> +#include "pstate_cpufreq.h"
> #include "power_common.h"
>
> /* macros used for rounding frequency to nearest 100000 */
> @@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
>
> return 0;
> }
> +
<...>
> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> index 590986d5ef..6c06411e8b 100644
> --- a/lib/power/power_common.c
> +++ b/lib/power/power_common.c
> @@ -12,7 +12,7 @@
>
> #include "power_common.h"
>
> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>
> #define POWER_SYSFILE_SCALING_DRIVER \
> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> index 83f742f42a..767686ee12 100644
> --- a/lib/power/power_common.h
> +++ b/lib/power/power_common.h
> @@ -6,12 +6,13 @@
> #define _POWER_COMMON_H_
>
> #include <rte_common.h>
> +#include <rte_compat.h>
> #include <rte_log.h>
>
> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>
> -extern int power_logtype;
> -#define RTE_LOGTYPE_POWER power_logtype
> +extern int rte_power_logtype;
> +#define RTE_LOGTYPE_POWER rte_power_logtype
> #define POWER_LOG(level, ...) \
> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>
> @@ -23,13 +24,24 @@ extern int power_logtype;
> #endif
>
> /* check if scaling driver matches one we want */
> +__rte_internal
> int cpufreq_check_scaling_driver(const char *driver);
> +
> +__rte_internal
> int power_set_governor(unsigned int lcore_id, const char *new_governor,
> char *orig_governor, size_t orig_governor_len);
I suggest moving cpufreq interfaces like this to the
rte_power_cpufreq_api.* I proposed above.
The interfaces in power_common.* can be used by all power modules, like
core/uncore/pmd mgmt.
> +
> +__rte_internal
> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> __rte_format_printf(3, 4);
> +
> +__rte_internal
> int read_core_sysfs_u32(FILE *f, uint32_t *val);
> +
> +__rte_internal
> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> +
> +__rte_internal
> int write_core_sysfs_s(FILE *f, const char *str);
>
> #endif /* _POWER_COMMON_H_ */
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
The name of the rte_power.c file is inappropriate now. The content of this
file is just for cpufreq, right?
So I suggest renaming this file to rte_power_cpufreq_api.c.
> index 36c3f3da98..2bf6d40517 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -8,153 +8,86 @@
> #include <rte_spinlock.h>
>
> #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
> #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>
> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static struct rte_power_core_ops *global_power_core_ops;
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> + TAILQ_HEAD_INITIALIZER(core_ops_list);
>
> -/* function pointers */
> -rte_power_freqs_t rte_power_freqs = NULL;
> -rte_power_get_freq_t rte_power_get_freq = NULL;
> -rte_power_set_freq_t rte_power_set_freq = NULL;
> -rte_power_freq_change_t rte_power_freq_up = NULL;
> -rte_power_freq_change_t rte_power_freq_down = NULL;
> -rte_power_freq_change_t rte_power_freq_max = NULL;
> -rte_power_freq_change_t rte_power_freq_min = NULL;
> -rte_power_freq_change_t rte_power_turbo_status;
> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> -rte_power_get_capabilities_t rte_power_get_capabilities;
> -
> -static void
> -reset_power_function_ptrs(void)
> +
> +const char *power_env_str[] = {
> + "not set",
> + "acpi",
> + "kvm-vm",
> + "pstate",
> + "cppc",
> + "amd-pstate"
> +};
> +
> +/* register the ops struct in rte_power_core_ops, return 0 on success. */
> +int
> +rte_power_register_ops(struct rte_power_core_ops *driver_ops)
> {
> - rte_power_freqs = NULL;
> - rte_power_get_freq = NULL;
> - rte_power_set_freq = NULL;
> - rte_power_freq_up = NULL;
> - rte_power_freq_down = NULL;
> - rte_power_freq_max = NULL;
> - rte_power_freq_min = NULL;
> - rte_power_turbo_status = NULL;
> - rte_power_freq_enable_turbo = NULL;
> - rte_power_freq_disable_turbo = NULL;
> - rte_power_get_capabilities = NULL;
> + if (!driver_ops->init || !driver_ops->exit ||
> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
> + !driver_ops->get_freq || !driver_ops->set_freq ||
> + !driver_ops->freq_up || !driver_ops->freq_down ||
> + !driver_ops->freq_max || !driver_ops->freq_min ||
> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
> + POWER_LOG(ERR, "Missing callbacks while registering power ops");
turbo_status(), enable_turbo() and disable_turbo() are not necessary,
right?
These depend on the capabilities from get_caps().
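
A minimal sketch of that idea, for illustration only: registration checks just the mandatory callbacks, and the turbo wrapper consults get_caps() before calling an optional callback. The `demo_*` and `stub_*` names, the reduced struct layouts and the choice of -ENOTSUP as the error code are assumptions, not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced capabilities struct. */
struct demo_caps {
	unsigned int turbo; /* non-zero if turbo can be enabled */
};

/* Hypothetical, reduced ops table. */
struct demo_core_ops {
	int (*init)(unsigned int lcore_id);                             /* mandatory */
	int (*get_caps)(unsigned int lcore_id, struct demo_caps *caps); /* mandatory */
	int (*enable_turbo)(unsigned int lcore_id);                     /* optional */
};

/* Validate only the mandatory callbacks; returns 0 or -EINVAL. */
static int
demo_ops_valid(const struct demo_core_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->get_caps == NULL)
		return -EINVAL;
	return 0;
}

/* Call the optional callback only when the driver and the core support it. */
static int
demo_enable_turbo(const struct demo_core_ops *ops, unsigned int lcore_id)
{
	struct demo_caps caps = { 0 };

	if (ops->enable_turbo == NULL)
		return -ENOTSUP;
	if (ops->get_caps(lcore_id, &caps) != 0 || !caps.turbo)
		return -ENOTSUP;
	return ops->enable_turbo(lcore_id);
}

/* Stub drivers used to exercise the two paths. */
static int stub_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static int
stub_caps_no_turbo(unsigned int lcore_id, struct demo_caps *caps)
{
	(void)lcore_id;
	caps->turbo = 0;
	return 0;
}

static int
stub_caps_turbo(unsigned int lcore_id, struct demo_caps *caps)
{
	(void)lcore_id;
	caps->turbo = 1;
	return 0;
}

static int stub_enable_turbo(unsigned int lcore_id) { (void)lcore_id; return 1; }

static const struct demo_core_ops demo_no_turbo_ops = {
	.init = stub_init,
	.get_caps = stub_caps_no_turbo,
};

static const struct demo_core_ops demo_turbo_ops = {
	.init = stub_init,
	.get_caps = stub_caps_turbo,
	.enable_turbo = stub_enable_turbo,
};
```

With this shape, a driver without turbo support simply leaves the turbo callbacks unset instead of providing dummy functions.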
> + return -EINVAL;
> + }
> +
> + TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
> +
> + return 0;
> }
>
> int
> rte_power_check_env_supported(enum power_management_env env)
> {
> - switch (env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_check_supported();
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_check_supported();
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_check_supported();
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_check_supported();
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_check_supported();
> - default:
> - rte_errno = EINVAL;
> - return -1;
> - }
> + struct rte_power_core_ops *ops;
> +
> + if (env >= RTE_DIM(power_env_str))
> + return 0;
> +
> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> + if (strncmp(ops->name, power_env_str[env],
> + RTE_POWER_DRIVER_NAMESZ) == 0)
> + return ops->check_env_support();
> +
> + return 0;
> }
>
> int
> rte_power_set_env(enum power_management_env env)
> {
> + struct rte_power_core_ops *ops;
> + int ret = -1;
> +
> rte_spinlock_lock(&global_env_cfg_lock);
>
> if (global_default_env != PM_ENV_NOT_SET) {
> POWER_LOG(ERR, "Power Management Environment already set.");
> - rte_spinlock_unlock(&global_env_cfg_lock);
> - return -1;
> - }
> -
<...>
> - if (ret == 0)
> - global_default_env = env;
> - else {
> - global_default_env = PM_ENV_NOT_SET;
> - reset_power_function_ptrs();
> - }
> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> + if (strncmp(ops->name, power_env_str[env],
> + RTE_POWER_DRIVER_NAMESZ) == 0) {
> + global_power_core_ops = ops;
> + global_default_env = env;
> + ret = 0;
> + goto out;
> + }
> + POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
> + env);
>
> +out:
> rte_spinlock_unlock(&global_env_cfg_lock);
> return ret;
> }
> @@ -164,94 +97,66 @@ rte_power_unset_env(void)
> {
> rte_spinlock_lock(&global_env_cfg_lock);
> global_default_env = PM_ENV_NOT_SET;
> - reset_power_function_ptrs();
> + global_power_core_ops = NULL;
> rte_spinlock_unlock(&global_env_cfg_lock);
> }
>
> enum power_management_env
> -rte_power_get_env(void) {
> +rte_power_get_env(void)
> +{
> return global_default_env;
> }
>
> -int
> -rte_power_init(unsigned int lcore_id)
> +struct rte_power_core_ops *
> +rte_power_get_core_ops(void)
> {
> - int ret = -1;
> + RTE_ASSERT(global_power_core_ops != NULL);
>
> - switch (global_default_env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_init(lcore_id);
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_init(lcore_id);
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_init(lcore_id);
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_init(lcore_id);
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_init(lcore_id);
> - default:
> - POWER_LOG(INFO, "Env isn't set yet!");
> - }
> + return global_power_core_ops;
> +}
>
> - /* Auto detect Environment */
> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
> - ret = power_acpi_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> - goto out;
> - }
> +int
> +rte_power_init(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops;
> + uint8_t env;
>
> - POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
> - ret = power_pstate_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> - goto out;
> - }
> + if (global_default_env != PM_ENV_NOT_SET)
> + return global_power_core_ops->init(lcore_id);
>
> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
> - ret = power_amd_pstate_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> - goto out;
> - }
> + POWER_LOG(INFO, "Env isn't set yet!");
remove this log?
>
> - POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
> - ret = power_cppc_cpufreq_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> - goto out;
> - }
> + /* Auto detect Environment */
> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> + if (ops) {
> + POWER_LOG(INFO,
> + "Attempting to initialise %s cpufreq power management...",
> + ops->name);
> + if (ops->init(lcore_id) == 0) {
> + for (env = 0; env < RTE_DIM(power_env_str); env++)
> + if (strncmp(ops->name, power_env_str[env],
> + RTE_POWER_DRIVER_NAMESZ) == 0) {
> + rte_power_set_env(env);
> + return 0;
> + }
> + }
> + }
Can we change the logic of rte_power_set_env()? Like:
RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
    for (env = 0; env < RTE_DIM(power_env_str); env++) {
        if (strncmp(ops->name, power_env_str[env],
                RTE_POWER_DRIVER_NAMESZ) == 0 &&
                ops->init(lcore_id) == 0) {
            global_power_core_ops = ops;
            global_default_env = env;
        }
    }
}
That code is easier to follow.
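
For illustration, a self-contained sketch of the same auto-detect idea with the name-to-env lookup split into a helper. It assumes detection should stop at the first driver whose init succeeds; the `demo_*` names and the stub drivers are hypothetical, while the env strings mirror the power_env_str[] table from the patch.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Same order as power_env_str[] in the patch; index 0 means "not set". */
static const char *const demo_env_str[] = {
	"not set", "acpi", "kvm-vm", "pstate", "cppc", "amd-pstate"
};

/* Hypothetical, reduced ops table. */
struct demo_core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

static const struct demo_core_ops *demo_active_ops;
static unsigned int demo_active_env; /* 0 == not set */

static int demo_init_fail(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int demo_init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }

/* Stand-in for the registered driver list. */
static const struct demo_core_ops demo_drivers[] = {
	{ "acpi", demo_init_fail },
	{ "cppc", demo_init_ok },
	{ "amd-pstate", demo_init_ok },
};

/* Map a driver name to its env index; returns -1 if unknown. */
static int
demo_env_from_name(const char *name)
{
	unsigned int env;

	for (env = 1; env < DEMO_DIM(demo_env_str); env++)
		if (strcmp(name, demo_env_str[env]) == 0)
			return (int)env;
	return -1;
}

/* Try each driver in order and keep the first one whose init succeeds. */
static int
demo_auto_detect(unsigned int lcore_id)
{
	size_t i;

	for (i = 0; i < DEMO_DIM(demo_drivers); i++) {
		int env = demo_env_from_name(demo_drivers[i].name);

		if (env < 0 || demo_drivers[i].init(lcore_id) != 0)
			continue;
		demo_active_ops = &demo_drivers[i];
		demo_active_env = (unsigned int)env;
		return 0;
	}
	return -1;
}
```

Returning on the first success avoids initialising later drivers once one has been selected; if every driver should be probed instead, the early return would have to go.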
> +
> + POWER_LOG(ERR,
> + "Unable to set Power Management Environment for lcore %u",
> + lcore_id);
>
> - POWER_LOG(INFO, "Attempting to initialise VM power management...");
> - ret = power_kvm_vm_init(lcore_id);
> - if (ret == 0) {
> - rte_power_set_env(PM_ENV_KVM_VM);
> - goto out;
> - }
> - POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
> - "%u", lcore_id);
> -out:
> - return ret;
> + return -1;
> }
>
> int
> rte_power_exit(unsigned int lcore_id)
> {
> - switch (global_default_env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_exit(lcore_id);
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_exit(lcore_id);
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_exit(lcore_id);
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_exit(lcore_id);
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_exit(lcore_id);
> - default:
> - POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
> + if (global_default_env != PM_ENV_NOT_SET)
> + return global_power_core_ops->exit(lcore_id);
>
> - }
> - return -1;
> + POWER_LOG(ERR,
> + "Environment has not been set, unable to exit gracefully");
>
> + return -1;
> }
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> index 4fa4afe399..5e4aacf08b 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power.h
> @@ -1,5 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
> #ifndef _RTE_POWER_H
> @@ -14,14 +15,21 @@
> #include <rte_log.h>
> #include <rte_power_guest_channel.h>
>
> +#include "rte_power_core_ops.h"
> +
> #ifdef __cplusplus
> extern "C" {
> #endif
>
> /* Power Management Environment State */
> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> - PM_ENV_AMD_PSTATE_CPUFREQ};
> +enum power_management_env {
> + PM_ENV_NOT_SET = 0,
> + PM_ENV_ACPI_CPUFREQ,
> + PM_ENV_KVM_VM,
> + PM_ENV_PSTATE_CPUFREQ,
> + PM_ENV_CPPC_CPUFREQ,
> + PM_ENV_AMD_PSTATE_CPUFREQ
> +};
>
> /**
> * Check if a specific power management environment type is supported on a
> @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
> */
> enum power_management_env rte_power_get_env(void);
I'd prefer that the user not need to know which cpufreq driver is in use,
which is friendlier to the user.
So we could rethink whether this API is necessary.
>
> +/**
> + * @internal Get the power ops struct from its index.
> + *
> + * @return
> + * The pointer to the ops struct in the table if registered.
> + */
> +struct rte_power_core_ops *
> +rte_power_get_core_ops(void);
> +
> /**
> * Initialize power management for a specific lcore. If rte_power_set_env() has
> * not been called then an auto-detect of the environment will start and
> @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
> * @return
> * The number of available frequencies.
> */
> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> - uint32_t num);
> +static inline uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_freqs_t rte_power_freqs;
> + return ops->get_avail_freqs(lcore_id, freqs, n);
> +}
>
> /**
> * Return the current index of available frequencies of a specific lcore.
> @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
> * @return
> * The current index of available frequencies.
> */
> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> +static inline uint32_t
> +rte_power_get_freq(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_get_freq_t rte_power_get_freq;
> + return ops->get_freq(lcore_id);
> +}
>
> /**
> * Set the new frequency for a specific lcore by indicating the index of
> @@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
> * - 0 on success without frequency changed.
> * - Negative on error.
> */
> -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
> -
> -extern rte_power_set_freq_t rte_power_set_freq;
> +static inline uint32_t
> +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -/**
> - * Function pointer definition for generic frequency change functions. Review
> - * each environments specific documentation for usage.
> - *
> - * @param lcore_id
> - * lcore id.
> - *
> - * @return
> - * - 1 on success with frequency changed.
> - * - 0 on success without frequency changed.
> - * - Negative on error.
> - */
> -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> + return ops->set_freq(lcore_id, index);
> +}
>
> /**
> * Scale up the frequency of a specific lcore according to the available
> * frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_up;
> +static inline int
> +rte_power_freq_up(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_up(lcore_id);
> +}
>
> /**
> * Scale down the frequency of a specific lcore according to the available
> * frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_down;
> +static inline int
> +rte_power_freq_down(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_down(lcore_id);
> +}
>
> /**
> * Scale up the frequency of a specific lcore to the highest according to the
> * available frequencies.
> * Review each environments specific documentation for usage.
> */
> -extern rte_power_freq_change_t rte_power_freq_max;
> +static inline int
> +rte_power_freq_max(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_max(lcore_id);
> +}
>
> /**
> * Scale down the frequency of a specific lcore to the lowest according to the
> * available frequencies.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_min;
> +static inline int
> +rte_power_freq_min(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->freq_min(lcore_id);
> +}
>
> /**
> * Query the Turbo Boost status of a specific lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_turbo_status;
> +static inline int
> +rte_power_turbo_status(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->turbo_status(lcore_id);
> +}
>
> /**
> * Enable Turbo Boost for this lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
> +static inline int
> +rte_power_freq_enable_turbo(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> +
> + return ops->enable_turbo(lcore_id);
> +}
>
> /**
> * Disable Turbo Boost for this lcore.
> * Review each environments specific documentation for usage..
> */
> -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
> +static inline int
> +rte_power_freq_disable_turbo(unsigned int lcore_id)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -/**
> - * Power capabilities summary.
> - */
> -struct rte_power_core_capabilities {
> - union {
> - uint64_t capabilities;
> - struct {
> - uint64_t turbo:1; /**< Turbo can be enabled. */
> - uint64_t priority:1; /**< SST-BF high freq core */
> - };
> - };
> -};
> + return ops->disable_turbo(lcore_id);
> +}
>
> /**
> * Returns power capabilities for a specific lcore.
> @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
> * - 0 on success.
> * - Negative on error.
> */
> -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> - struct rte_power_core_capabilities *caps);
> +static inline int
> +rte_power_get_capabilities(unsigned int lcore_id,
> + struct rte_power_core_capabilities *caps)
> +{
> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>
> -extern rte_power_get_capabilities_t rte_power_get_capabilities;
> + return ops->get_caps(lcore_id, caps);
> +}
>
> #ifdef __cplusplus
> }
> diff --git a/lib/power/rte_power_core_ops.h b/lib/power/rte_power_core_ops.h
> new file mode 100644
> index 0000000000..356a64df79
> --- /dev/null
> +++ b/lib/power/rte_power_core_ops.h
> @@ -0,0 +1,208 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> + */
> +
> +#ifndef _RTE_POWER_CORE_OPS_H
> +#define _RTE_POWER_CORE_OPS_H
> +
I suggest renaming the file to rte_power_cpufreq_api.h.
That would make the role of this file clearer.
> +__rte_internal
> +int rte_power_register_ops(struct rte_power_core_ops *ops);
> +
> +/**
> + * Macro to statically register the ops of a cpufreq driver.
> + */
> +#define RTE_POWER_REGISTER_OPS(ops) \
> + RTE_INIT(power_hdlr_init_##ops) \
> + { \
> + rte_power_register_ops(&ops); \
> + }
> +
> +/**
> + * @internal Get the power ops struct from its index.
> + *
> + * @return
> + * The pointer to the ops struct in the table if registered.
> + */
> +struct rte_power_core_ops *
> +rte_power_get_core_ops(void);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/power/version.map b/lib/power/version.map
> index c9a226614e..bd64e0828f 100644
> --- a/lib/power/version.map
> +++ b/lib/power/version.map
> @@ -51,4 +51,18 @@ EXPERIMENTAL {
> rte_power_set_uncore_env;
> rte_power_uncore_freqs;
> rte_power_unset_uncore_env;
> + # added in 24.07
24.07-->24.11?
> + rte_power_logtype;
> +};
> +
> +INTERNAL {
> + global:
> +
> + rte_power_register_ops;
> + cpufreq_check_scaling_driver;
> + power_set_governor;
> + open_core_sysfs_file;
> + read_core_sysfs_u32;
> + read_core_sysfs_s;
> + write_core_sysfs_s;
> };
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 2/4] power: refactor uncore power management library
2024-08-26 13:06 ` [PATCH v2 2/4] power: refactor uncore " Sivaprasad Tummala
@ 2024-08-27 13:02 ` lihuisong (C)
2024-10-08 6:19 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-08-27 13:02 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil
Hi Sivaprasad,
I suggest splitting this patch into two patches to make it easier to review:
patch-1: abstract a file for uncore dvfs core level, namely, the
rte_power_uncore_ops.c you did.
patch-2: move and rename, lib/power/power_intel_uncore.c =>
drivers/power/intel_uncore/intel_uncore.c
patch[1/4] is also too big and hard to review.
In addition, I have some questions and am not sure whether we can adjust
the uncore init process.
/Huisong
On 2024/8/26 21:06, Sivaprasad Tummala wrote:
> This patch refactors the power management library, addressing uncore
> power management. The primary changes involve the creation of dedicated
> directories for each driver within 'drivers/power/uncore/*'. The
> adjustment of meson.build files enables the selective activation
> of individual drivers.
>
> This refactor significantly improves code organization, enhances
> clarity and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> .../power/intel_uncore/intel_uncore.c | 18 +-
> .../power/intel_uncore/intel_uncore.h | 8 +-
> drivers/power/intel_uncore/meson.build | 6 +
> drivers/power/meson.build | 3 +-
> lib/power/meson.build | 2 +-
> lib/power/rte_power_uncore.c | 205 ++++++---------
> lib/power/rte_power_uncore.h | 87 ++++---
> lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
> lib/power/version.map | 1 +
> 9 files changed, 405 insertions(+), 164 deletions(-)
> rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
> rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
> create mode 100644 drivers/power/intel_uncore/meson.build
> create mode 100644 lib/power/rte_power_uncore_ops.h
>
> diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
> similarity index 95%
> rename from lib/power/power_intel_uncore.c
> rename to drivers/power/intel_uncore/intel_uncore.c
> index 4eb9c5900a..804ad5d755 100644
> --- a/lib/power/power_intel_uncore.c
> +++ b/drivers/power/intel_uncore/intel_uncore.c
> @@ -8,7 +8,7 @@
>
> #include <rte_memcpy.h>
>
> -#include "power_intel_uncore.h"
> +#include "intel_uncore.h"
> #include "power_common.h"
>
> #define MAX_NUMA_DIE 8
> @@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
>
> return count;
> }
<...>
>
> -#endif /* POWER_INTEL_UNCORE_H */
> +#endif /* INTEL_UNCORE_H */
> diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
> new file mode 100644
> index 0000000000..876df8ad14
> --- /dev/null
> +++ b/drivers/power/intel_uncore/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2017 Intel Corporation
> +# Copyright(c) 2024 Advanced Micro Devices, Inc.
> +
> +sources = files('intel_uncore.c')
> +deps += ['power']
> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
> index 8c7215c639..c83047af94 100644
> --- a/drivers/power/meson.build
> +++ b/drivers/power/meson.build
> @@ -6,7 +6,8 @@ drivers = [
> 'amd_pstate',
> 'cppc',
> 'kvm_vm',
> - 'pstate'
> + 'pstate',
> + 'intel_uncore'
cppc, amd_pstate and so on belong to the cpufreq scope,
while intel_uncore belongs to the uncore DVFS scope.
They are not at the same level, so I propose creating a
directory named something like cpufreq or core.
The name 'intel_uncore' doesn't seem appropriate. What do you think of the
following directory structure:
drivers/power/uncore/intel_uncore.c
drivers/power/uncore/amd_uncore.c (according to patch[4/4]).
> ]
> std_deps = ['power']
> diff --git a/lib/power/meson.build b/lib/power/meson.build
> index f3e3451cdc..9b13d98810 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -13,7 +13,6 @@ if not is_linux
> endif
> sources = files(
> 'power_common.c',
> - 'power_intel_uncore.c',
> 'rte_power.c',
> 'rte_power_uncore.c',
> 'rte_power_pmd_mgmt.c',
> @@ -24,6 +23,7 @@ headers = files(
> 'rte_power_guest_channel.h',
> 'rte_power_pmd_mgmt.h',
> 'rte_power_uncore.h',
> + 'rte_power_uncore_ops.h',
> )
> if cc.has_argument('-Wno-cast-qual')
> cflags += '-Wno-cast-qual'
> diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
> index 48c75a5da0..9f8771224f 100644
> --- a/lib/power/rte_power_uncore.c
> +++ b/lib/power/rte_power_uncore.c
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> * Copyright(c) 2023 AMD Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
> #include <errno.h>
> @@ -12,98 +13,50 @@
> #include "rte_power_uncore.h"
> #include "power_intel_uncore.h"
>
> -enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> +static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> +static struct rte_power_uncore_ops *global_uncore_ops;
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> +static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
> + TAILQ_HEAD_INITIALIZER(uncore_ops_list);
>
> -static uint32_t
> -power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
> - unsigned int die __rte_unused)
> -{
> - return 0;
> -}
> -
> -static int
> -power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
> - unsigned int die __rte_unused, uint32_t index __rte_unused)
> -{
> - return 0;
> -}
> +const char *uncore_env_str[] = {
> + "not set",
> + "auto-detect",
> + "intel-uncore",
> + "amd-hsmp"
> +};
Why expose the "auto-detect" mode to the user?
Why not set this automatically at framework initialization?
After all, the uncore driver is fixed for a given platform.
>
> -static int
> -power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
> - unsigned int die __rte_unused)
> -{
> - return 0;
> -}
> -
<...>
> -static int
> -power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
> - unsigned int die __rte_unused)
> +/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
> +int
> +rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
> {
> - return 0;
> -}
> + if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
> + !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
> + !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
> + !driver_ops->set_freq || !driver_ops->freq_max ||
> + !driver_ops->freq_min) {
> + POWER_LOG(ERR, "Missing callbacks while registering power ops");
> + return -1;
> + }
> + if (driver_ops->cb)
> + driver_ops->cb();
>
> -static unsigned int
> -power_dummy_uncore_get_num_pkgs(void)
> -{
> - return 0;
> -}
> + TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
>
> -static unsigned int
> -power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
> -{
> return 0;
> }
> -
> -/* function pointers */
> -rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
> -rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
> -rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
> -rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
> -rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
> -rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
> -rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
> -rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
> -
> -static void
> -reset_power_uncore_function_ptrs(void)
> -{
> - rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
> - rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
> - rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
> - rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
> - rte_power_uncore_freqs = power_dummy_uncore_freqs;
> - rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
> - rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
> - rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
> -}
> -
> int
> rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
> {
> - int ret;
> + int ret = -1;
> + struct rte_power_uncore_ops *ops;
>
> rte_spinlock_lock(&global_env_cfg_lock);
>
> - if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
> + if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
> POWER_LOG(ERR, "Uncore Power Management Env already set.");
> - rte_spinlock_unlock(&global_env_cfg_lock);
> - return -1;
> + goto out;
> }
>
<...>
> + if (env <= RTE_DIM(uncore_env_str)) {
> + RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
> + if (strncmp(ops->name, uncore_env_str[env],
> + RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
> + global_uncore_env = env;
> + global_uncore_ops = ops;
> + ret = 0;
> + goto out;
> + }
> + POWER_LOG(ERR, "Power Management (%s) not supported",
> + uncore_env_str[env]);
> + } else
> + POWER_LOG(ERR, "Invalid Power Management Environment");
>
> - default_uncore_env = env;
> out:
> rte_spinlock_unlock(&global_env_cfg_lock);
> return ret;
> @@ -139,15 +89,22 @@ void
> rte_power_unset_uncore_env(void)
> {
> rte_spinlock_lock(&global_env_cfg_lock);
> - default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> - reset_power_uncore_function_ptrs();
> + global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> rte_spinlock_unlock(&global_env_cfg_lock);
> }
>
How about abstracting an ABI interface that initializes or sets the uncore
driver for the platform automatically,
and later calling power_intel_uncore_init_on_die() for each die on each
package?
> enum rte_uncore_power_mgmt_env
> rte_power_get_uncore_env(void)
> {
> - return default_uncore_env;
> + return global_uncore_env;
> +}
> +
> +struct rte_power_uncore_ops *
> +rte_power_get_uncore_ops(void)
> +{
> + RTE_ASSERT(global_uncore_ops != NULL);
> +
> + return global_uncore_ops;
> }
>
> int
> @@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
This pkg means the socket ID on the platform, right?
If so, I am not sure that the
uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE] used in the uncore lib is
universal for all uncore drivers.
For example, an uncore driver may only support uncore DVFS at socket
granularity.
What should we do about this? We may need to think twice.
> {
> int ret = -1;
>
<...>
* RE: [PATCH v2 1/4] power: refactor core power management library
2024-08-27 8:21 ` lihuisong (C)
@ 2024-09-12 11:17 ` Tummala, Sivaprasad
2024-09-13 7:34 ` lihuisong (C)
0 siblings, 1 reply; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-09-12 11:17 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau,
cristian.dumitrescu, jerinj, konstantin.ananyev, Yigit, Ferruh,
gakhil
Hi Huisong,
Please find my response inline.
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Tuesday, August 27, 2024 1:51 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> radu.nicolau@intel.com; cristian.dumitrescu@intel.com; jerinj@marvell.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>
> Hi Sivaprasad,
>
> Some comments inline.
>
> /Huisong
>
> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
> >
> > v2:
> > - added NULL check for global_core_ops in rte_power_get_core_ops
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> > drivers/meson.build | 1 +
> > .../power/acpi/acpi_cpufreq.c | 22 +-
> > .../power/acpi/acpi_cpufreq.h | 6 +-
> > drivers/power/acpi/meson.build | 10 +
> > .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> > .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
> > drivers/power/amd_pstate/meson.build | 10 +
> > .../power/cppc/cppc_cpufreq.c | 22 +-
> > .../power/cppc/cppc_cpufreq.h | 8 +-
> > drivers/power/cppc/meson.build | 10 +
> > .../power/kvm_vm}/guest_channel.c | 0
> > .../power/kvm_vm}/guest_channel.h | 0
> > .../power/kvm_vm/kvm_vm.c | 22 +-
> > .../power/kvm_vm/kvm_vm.h | 6 +-
> > drivers/power/kvm_vm/meson.build | 16 +
> > drivers/power/meson.build | 12 +
> > drivers/power/pstate/meson.build | 10 +
> > .../power/pstate/pstate_cpufreq.c | 22 +-
> > .../power/pstate/pstate_cpufreq.h | 6 +-
> > lib/power/meson.build | 7 +-
> > lib/power/power_common.c | 2 +-
> > lib/power/power_common.h | 16 +-
> > lib/power/rte_power.c | 291 ++++++------------
> > lib/power/rte_power.h | 139 ++++++---
> > lib/power/rte_power_core_ops.h | 208 +++++++++++++
> > lib/power/version.map | 14 +
> > 26 files changed, 621 insertions(+), 271 deletions(-)
> > rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c
> (95%)
> > rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h
> (98%)
> > create mode 100644 drivers/power/acpi/meson.build
> > rename lib/power/power_amd_pstate_cpufreq.c =>
> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> > rename lib/power/power_amd_pstate_cpufreq.h =>
> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> > create mode 100644 drivers/power/amd_pstate/meson.build
> > rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c
> (95%)
> > rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h
> (97%)
> > create mode 100644 drivers/power/cppc/meson.build
> > rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> > rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> > rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> > rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> > create mode 100644 drivers/power/kvm_vm/meson.build
> > create mode 100644 drivers/power/meson.build
> > create mode 100644 drivers/power/pstate/meson.build
> > rename lib/power/power_pstate_cpufreq.c =>
> drivers/power/pstate/pstate_cpufreq.c (96%)
> > rename lib/power/power_pstate_cpufreq.h =>
> drivers/power/pstate/pstate_cpufreq.h (98%)
> > create mode 100644 lib/power/rte_power_core_ops.h
> How about using the following directory structure?
> *For power libs*
> lib/power/power_common.*
> lib/power/rte_power_pmd_mgmt.*
> lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be simple for us,
> but I'm not sure whether we can put the init of core, uncore and pmd mgmt into
> rte_power_init.c in rte_power.c.)
> lib/power/rte_power_uncore_freq_api.*
Yes, renaming rte_power.c is definitely a possible incremental change that could be considered later.
However, for the time being, our focus will be on refactoring the cpufreq drivers only.
>
> *And the following directories under drivers/power:*
> 1> For core dvfs driver:
> drivers/power/cpufreq/acpi_cpufreq.c
> drivers/power/cpufreq/cppc_cpufreq.c
> drivers/power/cpufreq/amd_pstate_cpufreq.c
> drivers/power/cpufreq/intel_pstate_cpufreq.c
> drivers/power/cpufreq/kvm_cpufreq.c
> Each cpufreq driver has little code and probably won't grow much, so
> it doesn't need a directory of its own.
>
> 2> For uncore dvfs driver:
> drivers/power/uncorefreq/intel_uncore.*
> > diff --git a/drivers/meson.build b/drivers/meson.build index
> > 66931d4241..9d77e0deab 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -29,6 +29,7 @@ subdirs = [
> > 'event', # depends on common, bus, mempool and net.
> > 'baseband', # depends on common and bus.
> > 'gpu', # depends on common and bus.
> > + 'power', # depends on common (in future).
> > ]
> >
> > if meson.is_cross_build()
> > diff --git a/lib/power/power_acpi_cpufreq.c
> > b/drivers/power/acpi/acpi_cpufreq.c
> > similarity index 95%
> > rename from lib/power/power_acpi_cpufreq.c rename to
> > drivers/power/acpi/acpi_cpufreq.c
> I don't suggest creating one directory for each cpufreq driver,
> because the pstate drivers also comply with the ACPI spec, right?
> In addition, each cpufreq driver has little code.
> Having just one file per directory is not good.
One of our objectives for the refactoring is to selectively disable non-essential drivers using Meson build options.
However, by rearranging the driver structure, we risk disrupting this capability.
> > index 81996e1c13..8637c69703 100644
> > --- a/lib/power/power_acpi_cpufreq.c
> > +++ b/drivers/power/acpi/acpi_cpufreq.c
> > @@ -10,7 +10,7 @@
> > #include <rte_stdatomic.h>
> > #include <rte_string_fns.h>
> >
> > -#include "power_acpi_cpufreq.h"
> > +#include "acpi_cpufreq.h"
> > #include "power_common.h"
> >
> <...>
> > +if not is_linux
> > + build = false
> > + reason = 'only supported on Linux'
> > +endif
> > +sources = files('pstate_cpufreq.c')
> > +
> > +deps += ['power']
> > diff --git a/lib/power/power_pstate_cpufreq.c
> > b/drivers/power/pstate/pstate_cpufreq.c
> > similarity index 96%
> > rename from lib/power/power_pstate_cpufreq.c rename to
> > drivers/power/pstate/pstate_cpufreq.c
> pstate_cpufreq.c is actually the intel_pstate cpufreq driver, right?
> So how about renaming this file to intel_pstate_cpufreq.c?
Yes, will fix this in the next version.
> > index 2343121621..c32b1adabc 100644
> > --- a/lib/power/power_pstate_cpufreq.c
> > +++ b/drivers/power/pstate/pstate_cpufreq.c
> > @@ -15,7 +15,7 @@
> > #include <rte_stdatomic.h>
> >
> > #include "rte_power_pmd_mgmt.h"
> > -#include "power_pstate_cpufreq.h"
> > +#include "pstate_cpufreq.h"
> > #include "power_common.h"
> >
> > /* macros used for rounding frequency to nearest 100000 */
> > @@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
> >
> > return 0;
> > }
> > +
> <...>
> > diff --git a/lib/power/power_common.c b/lib/power/power_common.c index
> > 590986d5ef..6c06411e8b 100644
> > --- a/lib/power/power_common.c
> > +++ b/lib/power/power_common.c
> > @@ -12,7 +12,7 @@
> >
> > #include "power_common.h"
> >
> > -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> > +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
> >
> > #define POWER_SYSFILE_SCALING_DRIVER \
> > "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> > diff --git a/lib/power/power_common.h b/lib/power/power_common.h index
> > 83f742f42a..767686ee12 100644
> > --- a/lib/power/power_common.h
> > +++ b/lib/power/power_common.h
> > @@ -6,12 +6,13 @@
> > #define _POWER_COMMON_H_
> >
> > #include <rte_common.h>
> > +#include <rte_compat.h>
> > #include <rte_log.h>
> >
> > #define RTE_POWER_INVALID_FREQ_INDEX (~0)
> >
> > -extern int power_logtype;
> > -#define RTE_LOGTYPE_POWER power_logtype
> > +extern int rte_power_logtype;
> > +#define RTE_LOGTYPE_POWER rte_power_logtype
> > #define POWER_LOG(level, ...) \
> > RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
> >
> > @@ -23,13 +24,24 @@ extern int power_logtype;
> > #endif
> >
> > /* check if scaling driver matches one we want */
> > +__rte_internal
> > int cpufreq_check_scaling_driver(const char *driver);
> > +
> > +__rte_internal
> > int power_set_governor(unsigned int lcore_id, const char *new_governor,
> > char *orig_governor, size_t orig_governor_len);
> I suggest moving cpufreq interfaces like this to the
> rte_power_cpufreq_api.* I proposed above.
This is an internal API and isn’t intended for direct use by applications.
By moving it to rte_power_*, we risk exposing it inadvertently.
> The interfaces in power_common.* can be used by all power modules, like
> core/uncore/pmd mgmt.
> > +
> > +__rte_internal
> > int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> > __rte_format_printf(3, 4);
> > +
> > +__rte_internal
> > int read_core_sysfs_u32(FILE *f, uint32_t *val);
> > +
> > +__rte_internal
> > int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> > +
> > +__rte_internal
> > int write_core_sysfs_s(FILE *f, const char *str);
> >
> > #endif /* _POWER_COMMON_H_ */
> > diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> The name of the rte_power.c file is inappropriate now. The content of this file is just for
> cpufreq, right?
> So I suggest renaming this file to rte_power_cpufreq_api.c
Yes, renaming rte_power.c to rte_power_cpufreq.c is definitely a possible incremental change,
and I will address this in a separate patch.
> > index 36c3f3da98..2bf6d40517 100644
> > --- a/lib/power/rte_power.c
> > +++ b/lib/power/rte_power.c
> > @@ -8,153 +8,86 @@
> > #include <rte_spinlock.h>
> >
> > #include "rte_power.h"
> > -#include "power_acpi_cpufreq.h"
> > -#include "power_cppc_cpufreq.h"
> > #include "power_common.h"
> > -#include "power_kvm_vm.h"
> > -#include "power_pstate_cpufreq.h"
> > -#include "power_amd_pstate_cpufreq.h"
> >
> > -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static enum power_management_env global_default_env =
> PM_ENV_NOT_SET;
> > +static struct rte_power_core_ops *global_power_core_ops;
> >
> > static rte_spinlock_t global_env_cfg_lock =
> > RTE_SPINLOCK_INITIALIZER;
> > +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> > + TAILQ_HEAD_INITIALIZER(core_ops_list);
> >
> > -/* function pointers */
> > -rte_power_freqs_t rte_power_freqs = NULL;
> > -rte_power_get_freq_t rte_power_get_freq = NULL;
> > -rte_power_set_freq_t rte_power_set_freq = NULL;
> > -rte_power_freq_change_t rte_power_freq_up = NULL;
> > -rte_power_freq_change_t rte_power_freq_down = NULL;
> > -rte_power_freq_change_t rte_power_freq_max = NULL;
> > -rte_power_freq_change_t rte_power_freq_min = NULL;
> > -rte_power_freq_change_t rte_power_turbo_status;
> > -rte_power_freq_change_t rte_power_freq_enable_turbo;
> > -rte_power_freq_change_t rte_power_freq_disable_turbo;
> > -rte_power_get_capabilities_t rte_power_get_capabilities;
> > -
> > -static void
> > -reset_power_function_ptrs(void)
> > +
> > +const char *power_env_str[] = {
> > + "not set",
> > + "acpi",
> > + "kvm-vm",
> > + "pstate",
> > + "cppc",
> > + "amd-pstate"
> > +};
> > +
> > +/* register the ops struct in rte_power_core_ops, return 0 on success. */
> > +int
> > +rte_power_register_ops(struct rte_power_core_ops *driver_ops)
> > {
> > - rte_power_freqs = NULL;
> > - rte_power_get_freq = NULL;
> > - rte_power_set_freq = NULL;
> > - rte_power_freq_up = NULL;
> > - rte_power_freq_down = NULL;
> > - rte_power_freq_max = NULL;
> > - rte_power_freq_min = NULL;
> > - rte_power_turbo_status = NULL;
> > - rte_power_freq_enable_turbo = NULL;
> > - rte_power_freq_disable_turbo = NULL;
> > - rte_power_get_capabilities = NULL;
> > + if (!driver_ops->init || !driver_ops->exit ||
> > + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
> > + !driver_ops->get_freq || !driver_ops->set_freq ||
> > + !driver_ops->freq_up || !driver_ops->freq_down ||
> > + !driver_ops->freq_max || !driver_ops->freq_min ||
> > + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
> > + !driver_ops->disable_turbo || !driver_ops->get_caps) {
> > + POWER_LOG(ERR, "Missing callbacks while registering power ops");
> turbo_status(), enable_turbo() and disable_turbo() are not necessary, right?
No, these are required to get the current status, unlike the capability API (get_caps()).
> These depend on the capabilities from get_caps().
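To illustrate the distinction between a capability and a status, here is a minimal sketch. It is not the driver code: `struct lcore_turbo` and the helper functions are hypothetical, simplified stand-ins for the real callbacks, and they assume that a capability is a static hardware property while the status is runtime state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one lcore's turbo state. */
struct lcore_turbo {
	bool hw_turbo;      /* static: does the hardware support turbo? */
	bool turbo_enabled; /* dynamic: is turbo currently switched on? */
};

/* Capability query: reports what the core is able to do. */
static bool
get_turbo_cap(const struct lcore_turbo *lc)
{
	return lc->hw_turbo;
}

/* Status query: reports the current state (1 enabled, 0 disabled). */
static int
turbo_status(const struct lcore_turbo *lc)
{
	return lc->turbo_enabled ? 1 : 0;
}

/* Enabling depends on the capability; it fails if turbo is unsupported. */
static int
enable_turbo(struct lcore_turbo *lc)
{
	assert(lc != NULL);
	if (!get_turbo_cap(lc))
		return -1;
	lc->turbo_enabled = true;
	return 0;
}

/* Disabling always succeeds in this sketch. */
static int
disable_turbo(struct lcore_turbo *lc)
{
	assert(lc != NULL);
	lc->turbo_enabled = false;
	return 0;
}
```

In this model the capability never changes after probing, whereas the status changes with every enable or disable call, so the capability query alone cannot tell a caller the current state.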
> > + return -EINVAL;
> > + }
> > +
> > + TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
> > +
> > + return 0;
> > }
> >
> > int
> > rte_power_check_env_supported(enum power_management_env env)
> > {
> > - switch (env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_check_supported();
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_check_supported();
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_check_supported();
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_check_supported();
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_check_supported();
> > - default:
> > - rte_errno = EINVAL;
> > - return -1;
> > - }
> > + struct rte_power_core_ops *ops;
> > +
> > + if (env >= RTE_DIM(power_env_str))
> > + return 0;
> > +
> > + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> > + if (strncmp(ops->name, power_env_str[env],
> > + RTE_POWER_DRIVER_NAMESZ) == 0)
> > + return ops->check_env_support();
> > +
> > + return 0;
> > }
> >
> > int
> > rte_power_set_env(enum power_management_env env)
> > {
> > + struct rte_power_core_ops *ops;
> > + int ret = -1;
> > +
> > rte_spinlock_lock(&global_env_cfg_lock);
> >
> > if (global_default_env != PM_ENV_NOT_SET) {
> > POWER_LOG(ERR, "Power Management Environment already set.");
> > - rte_spinlock_unlock(&global_env_cfg_lock);
> > - return -1;
> > - }
> > -
> <...>
> > - if (ret == 0)
> > - global_default_env = env;
> > - else {
> > - global_default_env = PM_ENV_NOT_SET;
> > - reset_power_function_ptrs();
> > - }
> > + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> > + if (strncmp(ops->name, power_env_str[env],
> > + RTE_POWER_DRIVER_NAMESZ) == 0) {
> > + global_power_core_ops = ops;
> > + global_default_env = env;
> > + ret = 0;
> > + goto out;
> > + }
> > + POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
> > + env);
> >
> > +out:
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > return ret;
> > }
> > @@ -164,94 +97,66 @@ rte_power_unset_env(void)
> > {
> > rte_spinlock_lock(&global_env_cfg_lock);
> > global_default_env = PM_ENV_NOT_SET;
> > - reset_power_function_ptrs();
> > + global_power_core_ops = NULL;
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > }
> >
> > enum power_management_env
> > -rte_power_get_env(void) {
> > +rte_power_get_env(void)
> > +{
> > return global_default_env;
> > }
> >
> > -int
> > -rte_power_init(unsigned int lcore_id)
> > +struct rte_power_core_ops *
> > +rte_power_get_core_ops(void)
> > {
> > - int ret = -1;
> > + RTE_ASSERT(global_power_core_ops != NULL);
> >
> > - switch (global_default_env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_init(lcore_id);
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_init(lcore_id);
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_init(lcore_id);
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_init(lcore_id);
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_init(lcore_id);
> > - default:
> > - POWER_LOG(INFO, "Env isn't set yet!");
> > - }
> > + return global_power_core_ops;
> > +}
> >
> > - /* Auto detect Environment */
> > - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
> management...");
> > - ret = power_acpi_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> > - goto out;
> > - }
> > +int
> > +rte_power_init(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops;
> > + uint8_t env;
> >
> > - POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
> > - ret = power_pstate_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> > - goto out;
> > - }
> > + if (global_default_env != PM_ENV_NOT_SET)
> > + return global_power_core_ops->init(lcore_id);
> >
> > - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
> management...");
> > - ret = power_amd_pstate_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> > - goto out;
> > - }
> > + POWER_LOG(INFO, "Env isn't set yet!");
> remove this log?
> >
> > - POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
> > - ret = power_cppc_cpufreq_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> > - goto out;
> > - }
> > + /* Auto detect Environment */
> > + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> > + if (ops) {
> > + POWER_LOG(INFO,
> > + "Attempting to initialise %s cpufreq power management...",
> > + ops->name);
> > + if (ops->init(lcore_id) == 0) {
> > + for (env = 0; env < RTE_DIM(power_env_str); env++)
> > + if (strncmp(ops->name, power_env_str[env],
> > + RTE_POWER_DRIVER_NAMESZ) == 0) {
> > + rte_power_set_env(env);
> > + return 0;
> > + }
> > + }
> > + }
> Can we change the logic of rte_power_set_env()? like:
> RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
> for (env = 0; env < RTE_DIM(power_env_str); env++) {
> if (strncmp(ops->name, power_env_str[env],
> RTE_POWER_DRIVER_NAMESZ) == 0 &&
> ops->init(lcore_id) == 0) {
> global_power_core_ops = ops;
> global_default_env = env;
> }
> }
> }
> That code is easier to follow.
Yes, will fix in the next version.
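For reference, a self-contained sketch of the auto-detect loop discussed above, under stated assumptions. `struct core_ops`, `register_ops()`, `env_from_name()`, `auto_detect()` and the stub drivers are simplified, hypothetical stand-ins, not the names or signatures used in the patch, and a hand-rolled linked list replaces RTE_TAILQ. Unlike the snippet quoted above, this version stops at the first driver whose init succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced environment enum and name table. */
enum pm_env { ENV_NOT_SET = 0, ENV_ACPI, ENV_KVM_VM, ENV_PSTATE, ENV_COUNT };

static const char *const env_str[ENV_COUNT] = {
	"not set", "acpi", "kvm-vm", "pstate"
};

/* Hypothetical ops struct: only the fields this sketch needs. */
struct core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	struct core_ops *next;
};

static struct core_ops *ops_head;
static struct core_ops **ops_tail = &ops_head;
static struct core_ops *active_ops;
static enum pm_env active_env = ENV_NOT_SET;

/* Append a driver to the registry; reject incomplete entries. */
static int
register_ops(struct core_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;
	ops->next = NULL;
	*ops_tail = ops;
	ops_tail = &ops->next;
	return 0;
}

/* Map a driver name to its env value; ENV_NOT_SET if unknown. */
static enum pm_env
env_from_name(const char *name)
{
	int env;

	for (env = ENV_NOT_SET + 1; env < ENV_COUNT; env++)
		if (strcmp(name, env_str[env]) == 0)
			return (enum pm_env)env;
	return ENV_NOT_SET;
}

/* Try each registered driver in order; keep the first whose init succeeds. */
static int
auto_detect(unsigned int lcore_id)
{
	struct core_ops *ops;

	for (ops = ops_head; ops != NULL; ops = ops->next) {
		enum pm_env env = env_from_name(ops->name);

		if (env == ENV_NOT_SET || ops->init(lcore_id) != 0)
			continue;
		active_ops = ops;
		active_env = env;
		return 0;
	}
	return -1;
}

/* Accessor that asserts an environment has already been selected. */
static struct core_ops *
get_active_ops(void)
{
	assert(active_ops != NULL);
	return active_ops;
}

/* Stub drivers for illustration: one fails to init, one succeeds. */
static int init_fail(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }
static struct core_ops acpi_ops = { "acpi", init_fail, NULL };
static struct core_ops pstate_ops = { "pstate", init_ok, NULL };
```

Resolving the name to an env value before calling init keeps the global state untouched when a registered driver has no matching entry in the name table.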
> > +
> > + POWER_LOG(ERR,
> > + "Unable to set Power Management Environment for lcore %u",
> > + lcore_id);
> >
> > - POWER_LOG(INFO, "Attempting to initialise VM power management...");
> > - ret = power_kvm_vm_init(lcore_id);
> > - if (ret == 0) {
> > - rte_power_set_env(PM_ENV_KVM_VM);
> > - goto out;
> > - }
> > - POWER_LOG(ERR, "Unable to set Power Management Environment for lcore
> "
> > - "%u", lcore_id);
> > -out:
> > - return ret;
> > + return -1;
> > }
> >
> > int
> > rte_power_exit(unsigned int lcore_id)
> > {
> > - switch (global_default_env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_exit(lcore_id);
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_exit(lcore_id);
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_exit(lcore_id);
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_exit(lcore_id);
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_exit(lcore_id);
> > - default:
> > - POWER_LOG(ERR, "Environment has not been set, unable to exit
> gracefully");
> > + if (global_default_env != PM_ENV_NOT_SET)
> > + return global_power_core_ops->exit(lcore_id);
> >
> > - }
> > - return -1;
> > + POWER_LOG(ERR,
> > + "Environment has not been set, unable to exit
> > + gracefully");
> >
> > + return -1;
> > }
> > diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h index
> > 4fa4afe399..5e4aacf08b 100644
> > --- a/lib/power/rte_power.h
> > +++ b/lib/power/rte_power.h
> > @@ -1,5 +1,6 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > */
> >
> > #ifndef _RTE_POWER_H
> > @@ -14,14 +15,21 @@
> > #include <rte_log.h>
> > #include <rte_power_guest_channel.h>
> >
> > +#include "rte_power_core_ops.h"
> > +
> > #ifdef __cplusplus
> > extern "C" {
> > #endif
> >
> > /* Power Management Environment State */
> > -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> > - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> > - PM_ENV_AMD_PSTATE_CPUFREQ};
> > +enum power_management_env {
> > + PM_ENV_NOT_SET = 0,
> > + PM_ENV_ACPI_CPUFREQ,
> > + PM_ENV_KVM_VM,
> > + PM_ENV_PSTATE_CPUFREQ,
> > + PM_ENV_CPPC_CPUFREQ,
> > + PM_ENV_AMD_PSTATE_CPUFREQ
> > +};
> >
> > /**
> > * Check if a specific power management environment type is
> > supported on a @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
> > */
> > enum power_management_env rte_power_get_env(void);
>
> I'd prefer that users not need to know which cpufreq driver is in use, which is friendlier for them.
>
> So we can rethink whether this API is necessary.
For any API changes, could we handle this as a separate RFC for discussion?
It’s important that these changes are not included within the scope of this patch.
>
> >
> > +/**
> > + * @internal Get the power ops struct from its index.
> > + *
> > + * @return
> > + * The pointer to the ops struct in the table if registered.
> > + */
> > +struct rte_power_core_ops *
> > +rte_power_get_core_ops(void);
> > +
> > /**
> > * Initialize power management for a specific lcore. If rte_power_set_env() has
> > * not been called then an auto-detect of the environment will start
> > and @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
> > * @return
> > * The number of available frequencies.
> > */
> > -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> > - uint32_t num);
> > +static inline uint32_t
> > +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >
> > -extern rte_power_freqs_t rte_power_freqs;
> > + return ops->get_avail_freqs(lcore_id, freqs, n); }
> >
> > /**
> > * Return the current index of available frequencies of a specific lcore.
> > @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
> > * @return
> > * The current index of available frequencies.
> > */
> > -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> > +static inline uint32_t
> > +rte_power_get_freq(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >
> > -extern rte_power_get_freq_t rte_power_get_freq;
> > + return ops->get_freq(lcore_id);
> > +}
> >
> > /**
> > * Set the new frequency for a specific lcore by indicating the
> > index of @@ -144,82 +168,101 @@ extern rte_power_get_freq_t
> rte_power_get_freq;
> > * - 0 on success without frequency changed.
> > * - Negative on error.
> > */
> > -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t
> > index);
> > -
> > -extern rte_power_set_freq_t rte_power_set_freq;
> > +static inline uint32_t
> > +rte_power_set_freq(unsigned int lcore_id, uint32_t index) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >
> > -/**
> > - * Function pointer definition for generic frequency change
> > functions. Review
> > - * each environments specific documentation for usage.
> > - *
> > - * @param lcore_id
> > - * lcore id.
> > - *
> > - * @return
> > - * - 1 on success with frequency changed.
> > - * - 0 on success without frequency changed.
> > - * - Negative on error.
> > - */
> > -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> > + return ops->set_freq(lcore_id, index); }
> >
> > /**
> > * Scale up the frequency of a specific lcore according to the available
> > * frequencies.
> > * Review each environments specific documentation for usage.
> > */
> > -extern rte_power_freq_change_t rte_power_freq_up;
> > +static inline int
> > +rte_power_freq_up(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->freq_up(lcore_id);
> > +}
> >
> > /**
> > * Scale down the frequency of a specific lcore according to the available
> > * frequencies.
> > * Review each environments specific documentation for usage.
> > */
> > -extern rte_power_freq_change_t rte_power_freq_down;
> > +static inline int
> > +rte_power_freq_down(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->freq_down(lcore_id); }
> >
> > /**
> > * Scale up the frequency of a specific lcore to the highest according to the
> > * available frequencies.
> > * Review each environments specific documentation for usage.
> > */
> > -extern rte_power_freq_change_t rte_power_freq_max;
> > +static inline int
> > +rte_power_freq_max(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->freq_max(lcore_id);
> > +}
> >
> > /**
> > * Scale down the frequency of a specific lcore to the lowest according to the
> > * available frequencies.
> > * Review each environments specific documentation for usage..
> > */
> > -extern rte_power_freq_change_t rte_power_freq_min;
> > +static inline int
> > +rte_power_freq_min(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->freq_min(lcore_id);
> > +}
> >
> > /**
> > * Query the Turbo Boost status of a specific lcore.
> > * Review each environments specific documentation for usage..
> > */
> > -extern rte_power_freq_change_t rte_power_turbo_status;
> > +static inline int
> > +rte_power_turbo_status(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->turbo_status(lcore_id); }
> >
> > /**
> > * Enable Turbo Boost for this lcore.
> > * Review each environments specific documentation for usage..
> > */
> > -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
> > +static inline int
> > +rte_power_freq_enable_turbo(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> > +
> > + return ops->enable_turbo(lcore_id); }
> >
> > /**
> > * Disable Turbo Boost for this lcore.
> > * Review each environments specific documentation for usage..
> > */
> > -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
> > +static inline int
> > +rte_power_freq_disable_turbo(unsigned int lcore_id) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >
> > -/**
> > - * Power capabilities summary.
> > - */
> > -struct rte_power_core_capabilities {
> > - union {
> > - uint64_t capabilities;
> > - struct {
> > - uint64_t turbo:1; /**< Turbo can be enabled. */
> > - uint64_t priority:1; /**< SST-BF high freq core */
> > - };
> > - };
> > -};
> > + return ops->disable_turbo(lcore_id); }
> >
> > /**
> > * Returns power capabilities for a specific lcore.
> > @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
> > * - 0 on success.
> > * - Negative on error.
> > */
> > -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> > - struct rte_power_core_capabilities *caps);
> > +static inline int
> > +rte_power_get_capabilities(unsigned int lcore_id,
> > + struct rte_power_core_capabilities *caps) {
> > + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >
> > -extern rte_power_get_capabilities_t rte_power_get_capabilities;
> > + return ops->get_caps(lcore_id, caps); }
> >
> > #ifdef __cplusplus
> > }
> > diff --git a/lib/power/rte_power_core_ops.h
> > b/lib/power/rte_power_core_ops.h new file mode 100644 index
> > 0000000000..356a64df79
> > --- /dev/null
> > +++ b/lib/power/rte_power_core_ops.h
> > @@ -0,0 +1,208 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > + */
> > +
> > +#ifndef _RTE_POWER_CORE_OPS_H
> > +#define _RTE_POWER_CORE_OPS_H
> > +
> I suggest renaming the file to rte_power_cpufreq_api.h.
> That would make the role of this file clearer.
> > +__rte_internal
> > +int rte_power_register_ops(struct rte_power_core_ops *ops);
> > +
> > +/**
> > + * Macro to statically register the ops of a cpufreq driver.
> > + */
> > +#define RTE_POWER_REGISTER_OPS(ops) \
> > + RTE_INIT(power_hdlr_init_##ops) \
> > + { \
> > + rte_power_register_ops(&ops); \
> > + }
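The static registration done by the macro above can be sketched in isolation. This is only an illustration, assuming a GCC/Clang-style constructor attribute; the names `REGISTER_OPS`, `register_ops`, `registered` and the reduced `struct core_ops` are hypothetical stand-ins, not the DPDK definitions.

```c
#include <stddef.h>

#define MAX_DRIVERS 8	/* hypothetical table size for this sketch */

/* Reduced, hypothetical ops struct: only the driver name is kept. */
struct core_ops {
	const char *name;
};

static struct core_ops *registered[MAX_DRIVERS];
static size_t n_registered;

/* Store the ops pointer; return 0 on success, -1 if the table is full. */
static int
register_ops(struct core_ops *ops)
{
	if (n_registered >= MAX_DRIVERS)
		return -1;
	registered[n_registered++] = ops;
	return 0;
}

/*
 * Hypothetical equivalent of a static registration macro: the generated
 * function runs before main() and registers the given ops struct.
 */
#define REGISTER_OPS(ops) \
	static void __attribute__((constructor, used)) \
	power_hdlr_init_##ops(void) \
	{ \
		(void)register_ops(&ops); \
	}

/* Two stub drivers registering themselves at load time. */
static struct core_ops acpi_ops = { .name = "acpi" };
REGISTER_OPS(acpi_ops)

static struct core_ops cppc_ops = { .name = "cppc" };
REGISTER_OPS(cppc_ops)
```

With this pattern a driver only needs to define its ops struct and invoke the macro once; the library never has to name the driver explicitly.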
> > +
> > +/**
> > + * @internal Get the power ops struct from its index.
> > + *
> > + * @return
> > + * The pointer to the ops struct in the table if registered.
> > + */
> > +struct rte_power_core_ops *
> > +rte_power_get_core_ops(void);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif
> > diff --git a/lib/power/version.map b/lib/power/version.map index
> > c9a226614e..bd64e0828f 100644
> > --- a/lib/power/version.map
> > +++ b/lib/power/version.map
> > @@ -51,4 +51,18 @@ EXPERIMENTAL {
> > rte_power_set_uncore_env;
> > rte_power_uncore_freqs;
> > rte_power_unset_uncore_env;
> > + # added in 24.07
> 24.07-->24.11?
> > + rte_power_logtype;
> > +};
> > +
> > +INTERNAL {
> > + global:
> > +
> > + rte_power_register_ops;
> > + cpufreq_check_scaling_driver;
> > + power_set_governor;
> > + open_core_sysfs_file;
> > + read_core_sysfs_u32;
> > + read_core_sysfs_s;
> > + write_core_sysfs_s;
> > };
* Re: [PATCH v2 1/4] power: refactor core power management library
2024-09-12 11:17 ` Tummala, Sivaprasad
@ 2024-09-13 7:34 ` lihuisong (C)
2024-09-18 8:37 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-09-13 7:34 UTC (permalink / raw)
To: Tummala, Sivaprasad
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau,
cristian.dumitrescu, jerinj, konstantin.ananyev, Yigit, Ferruh,
gakhil
On 2024/9/12 19:17, Tummala, Sivaprasad wrote:
> Hi Huisong,
>
> Please find my response inline.
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Tuesday, August 27, 2024 1:51 PM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>> radu.nicolau@intel.com; cristian.dumitrescu@intel.com; jerinj@marvell.com;
>> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
>> gakhil@marvell.com
>> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>>
>> Hi Sivaprasad,
>>
>> Some comments inline.
>>
>> /Huisong
>>
>> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
>>> This patch introduces a comprehensive refactor to the core power
>>> management library. The primary focus is on improving modularity and
>>> organization by relocating specific driver implementations from the
>>> 'lib/power' directory to dedicated directories within
>>> 'drivers/power/core/*'. The adjustment of meson.build files enables
>>> the selective activation of individual drivers.
>>> These changes contribute to a significant enhancement in code
>>> organization, providing a clearer structure for driver implementations.
>>> The refactor aims to improve overall code clarity and boost
>>> maintainability. Additionally, it establishes a foundation for future
>>> development, allowing for more focused work on individual drivers and
>>> seamless integration of forthcoming enhancements.
>>>
>>> v2:
>>> - added NULL check for global_core_ops in rte_power_get_core_ops
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>>> drivers/meson.build | 1 +
>>> .../power/acpi/acpi_cpufreq.c | 22 +-
>>> .../power/acpi/acpi_cpufreq.h | 6 +-
>>> drivers/power/acpi/meson.build | 10 +
>>> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
>>> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
>>> drivers/power/amd_pstate/meson.build | 10 +
>>> .../power/cppc/cppc_cpufreq.c | 22 +-
>>> .../power/cppc/cppc_cpufreq.h | 8 +-
>>> drivers/power/cppc/meson.build | 10 +
>>> .../power/kvm_vm}/guest_channel.c | 0
>>> .../power/kvm_vm}/guest_channel.h | 0
>>> .../power/kvm_vm/kvm_vm.c | 22 +-
>>> .../power/kvm_vm/kvm_vm.h | 6 +-
>>> drivers/power/kvm_vm/meson.build | 16 +
>>> drivers/power/meson.build | 12 +
>>> drivers/power/pstate/meson.build | 10 +
>>> .../power/pstate/pstate_cpufreq.c | 22 +-
>>> .../power/pstate/pstate_cpufreq.h | 6 +-
>>> lib/power/meson.build | 7 +-
>>> lib/power/power_common.c | 2 +-
>>> lib/power/power_common.h | 16 +-
>>> lib/power/rte_power.c | 291 ++++++------------
>>> lib/power/rte_power.h | 139 ++++++---
>>> lib/power/rte_power_core_ops.h | 208 +++++++++++++
>>> lib/power/version.map | 14 +
>>> 26 files changed, 621 insertions(+), 271 deletions(-)
>>> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c
>> (95%)
>>> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h
>> (98%)
>>> create mode 100644 drivers/power/acpi/meson.build
>>> rename lib/power/power_amd_pstate_cpufreq.c =>
>> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>>> rename lib/power/power_amd_pstate_cpufreq.h =>
>> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
>>> create mode 100644 drivers/power/amd_pstate/meson.build
>>> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c
>> (95%)
>>> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h
>> (97%)
>>> create mode 100644 drivers/power/cppc/meson.build
>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>>> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
>>> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
>>> create mode 100644 drivers/power/kvm_vm/meson.build
>>> create mode 100644 drivers/power/meson.build
>>> create mode 100644 drivers/power/pstate/meson.build
>>> rename lib/power/power_pstate_cpufreq.c =>
>> drivers/power/pstate/pstate_cpufreq.c (96%)
>>> rename lib/power/power_pstate_cpufreq.h =>
>> drivers/power/pstate/pstate_cpufreq.h (98%)
>>> create mode 100644 lib/power/rte_power_core_ops.h
>> How about using the following directory structure?
>> *For power libs*
>> lib/power/power_common.*
>> lib/power/rte_power_pmd_mgmt.*
>> lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be simple for us,
>> but I'm not sure if we can put the init of core, uncore and pmd mgmt to
>> rte_power_init.c in rte_power.c.)
>> lib/power/rte_power_uncore_freq_api.*
> Yes, renaming rte_power.c is definitely a possible incremental change that could be considered later.
> However, for the time being, our focus will be on refactoring the cpufreq drivers only.
rte_power.c only handles the initialization of the cpufreq driver. Since
you are reworking the core and uncore power libraries and rearranging the
directories under power, I think renaming this file in this series would
be more appropriate.
>> *And the following layout under drivers/power:*
>> 1> For core dvfs driver:
>> drivers/power/cpufreq/acpi_cpufreq.c
>> drivers/power/cpufreq/cppc_cpufreq.c
>> drivers/power/cpufreq/amd_pstate_cpufreq.c
>> drivers/power/cpufreq/intel_pstate_cpufreq.c
>> drivers/power/cpufreq/kvm_cpufreq.c
>> Each cpufreq driver has little code and is unlikely to grow much, so
>> there is no need for a separate directory per driver.
>>
>> 2> For uncore dvfs driver:
>> drivers/power/uncorefreq/intel_uncore.*
>>> diff --git a/drivers/meson.build b/drivers/meson.build
>>> index 66931d4241..9d77e0deab 100644
>>> --- a/drivers/meson.build
>>> +++ b/drivers/meson.build
>>> @@ -29,6 +29,7 @@ subdirs = [
>>> 'event', # depends on common, bus, mempool and net.
>>> 'baseband', # depends on common and bus.
>>> 'gpu', # depends on common and bus.
>>> + 'power', # depends on common (in future).
>>> ]
>>>
>>> if meson.is_cross_build()
>>> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
>>> similarity index 95%
>>> rename from lib/power/power_acpi_cpufreq.c
>>> rename to drivers/power/acpi/acpi_cpufreq.c
>> I don't suggest creating one directory for each cpufreq driver,
>> because the pstate drivers also comply with the ACPI spec, right?
>> In addition, each cpufreq driver has little code.
>> A directory containing just one file is not good.
> One of our objectives for the refactoring is to selectively disable non-essential drivers using Meson build options.
> However, by rearranging the driver structure, we risk disrupting this capability.
I understand your purpose.
The cpufreq library already has the feature and interface to detect which
driver to use, right?
So it is not necessary for the cpufreq library to introduce Meson build
options, which would probably make it more complicated.
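The run-time detection referred to here can be sketched as follows. This is a minimal illustration, not the library code: the helpers `scaling_driver_matches()` and `lcore_uses_driver()` are hypothetical names; only the sysfs path pattern is taken from the power_common.c hunk quoted in this thread.

```c
#include <stdio.h>
#include <string.h>

/* sysfs path pattern as quoted from lib/power/power_common.c */
#define SCALING_DRIVER_FMT \
	"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"

/*
 * Hypothetical helper: return 1 if the first line of the file at `path`
 * equals `driver`, 0 if it differs, -1 if the file cannot be read.
 */
static int
scaling_driver_matches(const char *path, const char *driver)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return strcmp(buf, driver) == 0;
}

/* Hypothetical wrapper: build the per-lcore sysfs path and check it. */
static int
lcore_uses_driver(unsigned int lcore_id, const char *driver)
{
	char path[128];

	snprintf(path, sizeof(path), SCALING_DRIVER_FMT, lcore_id);
	return scaling_driver_matches(path, driver);
}
```

A driver's support check could then be a single call such as lcore_uses_driver(0, "acpi-cpufreq"), which is why a build-time switch is not strictly needed for selection.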
>>> index 81996e1c13..8637c69703 100644
>>> --- a/lib/power/power_acpi_cpufreq.c
>>> +++ b/drivers/power/acpi/acpi_cpufreq.c
>>> @@ -10,7 +10,7 @@
>>> #include <rte_stdatomic.h>
>>> #include <rte_string_fns.h>
>>>
>>> -#include "power_acpi_cpufreq.h"
>>> +#include "acpi_cpufreq.h"
>>> #include "power_common.h"
>>>
>> <...>
>>> +if not is_linux
>>> + build = false
>>> + reason = 'only supported on Linux'
>>> +endif
>>> +sources = files('pstate_cpufreq.c')
>>> +
>>> +deps += ['power']
>>> diff --git a/lib/power/power_pstate_cpufreq.c
>>> b/drivers/power/pstate/pstate_cpufreq.c
>>> similarity index 96%
>>> rename from lib/power/power_pstate_cpufreq.c rename to
>>> drivers/power/pstate/pstate_cpufreq.c
>> pstate_cpufreq.c is actually the intel_pstate cpufreq driver, right?
>> So how about renaming this file to intel_pstate_cpufreq.c?
> Yes, will fix this in next version.
>>> index 2343121621..c32b1adabc 100644
>>> --- a/lib/power/power_pstate_cpufreq.c
>>> +++ b/drivers/power/pstate/pstate_cpufreq.c
>>> @@ -15,7 +15,7 @@
>>> #include <rte_stdatomic.h>
>>>
>>> #include "rte_power_pmd_mgmt.h"
>>> -#include "power_pstate_cpufreq.h"
>>> +#include "pstate_cpufreq.h"
>>> #include "power_common.h"
>>>
>>> /* macros used for rounding frequency to nearest 100000 */
>>> @@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
>>>
>>> return 0;
>>> }
>>> +
>> <...>
>>> diff --git a/lib/power/power_common.c b/lib/power/power_common.c index
>>> 590986d5ef..6c06411e8b 100644
>>> --- a/lib/power/power_common.c
>>> +++ b/lib/power/power_common.c
>>> @@ -12,7 +12,7 @@
>>>
>>> #include "power_common.h"
>>>
>>> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
>>> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>>>
>>> #define POWER_SYSFILE_SCALING_DRIVER \
>>> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
>>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h index
>>> 83f742f42a..767686ee12 100644
>>> --- a/lib/power/power_common.h
>>> +++ b/lib/power/power_common.h
>>> @@ -6,12 +6,13 @@
>>> #define _POWER_COMMON_H_
>>>
>>> #include <rte_common.h>
>>> +#include <rte_compat.h>
>>> #include <rte_log.h>
>>>
>>> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>>>
>>> -extern int power_logtype;
>>> -#define RTE_LOGTYPE_POWER power_logtype
>>> +extern int rte_power_logtype;
>>> +#define RTE_LOGTYPE_POWER rte_power_logtype
>>> #define POWER_LOG(level, ...) \
>>> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>>>
>>> @@ -23,13 +24,24 @@ extern int power_logtype;
>>> #endif
>>>
>>> /* check if scaling driver matches one we want */
>>> +__rte_internal
>>> int cpufreq_check_scaling_driver(const char *driver);
>>> +
>>> +__rte_internal
>>> int power_set_governor(unsigned int lcore_id, const char *new_governor,
>>> char *orig_governor, size_t orig_governor_len);
>> I suggest moving cpufreq interfaces like this to the
>> rte_power_cpufreq_api.* files I proposed above.
> This is an internal API and isn’t intended for direct use by applications.
> By moving it to rte_power_*, we risk exposing it inadvertently.
We don't expose these to applications; applications do not include this
header file.
power_set_governor() and cpufreq_check_scaling_driver() are used only by
the cpufreq drivers, so they should be visible only to the cpufreq lib or
module, right?
But if these interfaces are in power_common.h, the pmd_mgmt and uncore
drivers also include this header file and can see them. This is not good.
As far as I can see, power_common.h should contain only the interfaces
that are used by all power libs or sub-modules, such as cpufreq, uncore,
pmd_mgmt and so on.
>
>> The interfaces in power_common.* can be used by all power modules, like
>> core/uncore/pmd mgmt.
>>> +
>>> +__rte_internal
>>> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
>>> __rte_format_printf(3, 4);
>>> +
>>> +__rte_internal
>>> int read_core_sysfs_u32(FILE *f, uint32_t *val);
>>> +
>>> +__rte_internal
>>> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
>>> +
>>> +__rte_internal
>>> int write_core_sysfs_s(FILE *f, const char *str);
>>>
>>> #endif /* _POWER_COMMON_H_ */
>>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>> The name of the rte_power.c file is inappropriate now. The content of this file is just for
>> cpufreq, right?
>> So I suggest that we rename this file to rte_power_cpufreq_api.c
> Yes, renaming rte_power.c to rte_power_cpufreq.c is definitely a possible incremental change
> and will fix this as a separate patch.
>
>>> index 36c3f3da98..2bf6d40517 100644
>>> --- a/lib/power/rte_power.c
>>> +++ b/lib/power/rte_power.c
>>> @@ -8,153 +8,86 @@
>>> #include <rte_spinlock.h>
>>>
>>> #include "rte_power.h"
>>> -#include "power_acpi_cpufreq.h"
>>> -#include "power_cppc_cpufreq.h"
>>> #include "power_common.h"
>>> -#include "power_kvm_vm.h"
>>> -#include "power_pstate_cpufreq.h"
>>> -#include "power_amd_pstate_cpufreq.h"
>>>
>>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static struct rte_power_core_ops *global_power_core_ops;
>>>
>>> static rte_spinlock_t global_env_cfg_lock =
>>> RTE_SPINLOCK_INITIALIZER;
>>> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
>>> + TAILQ_HEAD_INITIALIZER(core_ops_list);
>>>
>>> -/* function pointers */
>>> -rte_power_freqs_t rte_power_freqs = NULL;
>>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>>> -rte_power_freq_change_t rte_power_turbo_status;
>>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>>> -
>>> -static void
>>> -reset_power_function_ptrs(void)
>>> +
>>> +const char *power_env_str[] = {
>>> + "not set",
>>> + "acpi",
>>> + "kvm-vm",
>>> + "pstate",
>>> + "cppc",
>>> + "amd-pstate"
>>> +};
>>> +
>>> +/* register the ops struct in rte_power_core_ops, return 0 on success. */
>>> +int
>>> +rte_power_register_ops(struct rte_power_core_ops *driver_ops)
>>> {
>>> - rte_power_freqs = NULL;
>>> - rte_power_get_freq = NULL;
>>> - rte_power_set_freq = NULL;
>>> - rte_power_freq_up = NULL;
>>> - rte_power_freq_down = NULL;
>>> - rte_power_freq_max = NULL;
>>> - rte_power_freq_min = NULL;
>>> - rte_power_turbo_status = NULL;
>>> - rte_power_freq_enable_turbo = NULL;
>>> - rte_power_freq_disable_turbo = NULL;
>>> - rte_power_get_capabilities = NULL;
>>> + if (!driver_ops->init || !driver_ops->exit ||
>>> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
>>> + !driver_ops->get_freq || !driver_ops->set_freq ||
>>> + !driver_ops->freq_up || !driver_ops->freq_down ||
>>> + !driver_ops->freq_max || !driver_ops->freq_min ||
>>> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
>>> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
>>> + POWER_LOG(ERR, "Missing callbacks while registering power ops");
>> turbo_status(), enable_turbo() and disable_turbo() are not necessary, right?
> Nope, this is required to get the current status unlike the capability API (get_caps()).
ok
>> These depend on the capabilities from get_caps().
>>> + return -EINVAL;
>>> + }
>>> +
>>> + TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
>>> +
>>> + return 0;
>>> }
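The registration check discussed above (every callback, including the turbo ones, must be present) can be sketched standalone. This is a reduced illustration under stated assumptions: `struct core_ops`, `register_ops()` and `dummy_cb()` are hypothetical, the struct carries only a few of the callbacks from the patch, and the list uses the plain `sys/queue.h` TAILQ macros rather than the DPDK wrappers.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Reduced, hypothetical ops table; the struct in the patch has more callbacks. */
struct core_ops {
	TAILQ_ENTRY(core_ops) next;
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
	int (*turbo_status)(unsigned int lcore_id);
};

static TAILQ_HEAD(, core_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);

/*
 * Reject an ops table that is missing any mandatory callback, otherwise
 * append it to the list of registered drivers.
 */
static int
register_ops(struct core_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL ||
			ops->exit == NULL || ops->turbo_status == NULL)
		return -EINVAL;

	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Stub callback used only to populate example tables. */
static int
dummy_cb(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}
```

Because every entry on the list is known to be complete, the dispatch wrappers can call through the table without checking individual callbacks.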
>>>
>>> int
>>> rte_power_check_env_supported(enum power_management_env env)
>>> {
>>> - switch (env) {
>>> - case PM_ENV_ACPI_CPUFREQ:
>>> - return power_acpi_cpufreq_check_supported();
>>> - case PM_ENV_PSTATE_CPUFREQ:
>>> - return power_pstate_cpufreq_check_supported();
>>> - case PM_ENV_KVM_VM:
>>> - return power_kvm_vm_check_supported();
>>> - case PM_ENV_CPPC_CPUFREQ:
>>> - return power_cppc_cpufreq_check_supported();
>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>> - return power_amd_pstate_cpufreq_check_supported();
>>> - default:
>>> - rte_errno = EINVAL;
>>> - return -1;
>>> - }
>>> + struct rte_power_core_ops *ops;
>>> +
>>> + if (env >= RTE_DIM(power_env_str))
>>> + return 0;
>>> +
>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>> + if (strncmp(ops->name, power_env_str[env],
>>> + RTE_POWER_DRIVER_NAMESZ) == 0)
>>> + return ops->check_env_support();
>>> +
>>> + return 0;
>>> }
>>>
>>> int
>>> rte_power_set_env(enum power_management_env env)
>>> {
>>> + struct rte_power_core_ops *ops;
>>> + int ret = -1;
>>> +
>>> rte_spinlock_lock(&global_env_cfg_lock);
>>>
>>> if (global_default_env != PM_ENV_NOT_SET) {
>>> POWER_LOG(ERR, "Power Management Environment already set.");
>>> - rte_spinlock_unlock(&global_env_cfg_lock);
>>> - return -1;
>>> - }
>>> -
>> <...>
>>> - if (ret == 0)
>>> - global_default_env = env;
>>> - else {
>>> - global_default_env = PM_ENV_NOT_SET;
>>> - reset_power_function_ptrs();
>>> - }
>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>> + if (strncmp(ops->name, power_env_str[env],
>>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
>>> + global_power_core_ops = ops;
>>> + global_default_env = env;
>>> + ret = 0;
>>> + goto out;
>>> + }
>>> + POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
>>> + env);
>>>
>>> +out:
>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>> return ret;
>>> }
>>> @@ -164,94 +97,66 @@ rte_power_unset_env(void)
>>> {
>>> rte_spinlock_lock(&global_env_cfg_lock);
>>> global_default_env = PM_ENV_NOT_SET;
>>> - reset_power_function_ptrs();
>>> + global_power_core_ops = NULL;
>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>> }
>>>
>>> enum power_management_env
>>> -rte_power_get_env(void) {
>>> +rte_power_get_env(void)
>>> +{
>>> return global_default_env;
>>> }
>>>
>>> -int
>>> -rte_power_init(unsigned int lcore_id)
>>> +struct rte_power_core_ops *
>>> +rte_power_get_core_ops(void)
>>> {
>>> - int ret = -1;
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>>
>>> - switch (global_default_env) {
>>> - case PM_ENV_ACPI_CPUFREQ:
>>> - return power_acpi_cpufreq_init(lcore_id);
>>> - case PM_ENV_KVM_VM:
>>> - return power_kvm_vm_init(lcore_id);
>>> - case PM_ENV_PSTATE_CPUFREQ:
>>> - return power_pstate_cpufreq_init(lcore_id);
>>> - case PM_ENV_CPPC_CPUFREQ:
>>> - return power_cppc_cpufreq_init(lcore_id);
>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>> - return power_amd_pstate_cpufreq_init(lcore_id);
>>> - default:
>>> - POWER_LOG(INFO, "Env isn't set yet!");
>>> - }
>>> + return global_power_core_ops;
>>> +}
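The dispatch pattern built on this getter can be sketched as below. It is an illustration only: `active_ops`, `power_freq_up()` and the stub driver are hypothetical names, and the explicit NULL check returning -ENOTSUP is an assumed alternative to the assertion used in the patch, not what the patch does.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced, hypothetical ops table with a single callback. */
struct core_ops {
	int (*freq_up)(unsigned int lcore_id);
};

/* NULL until an environment has been selected. */
static struct core_ops *active_ops;

/*
 * Wrapper in the style of the inline functions in rte_power.h: fetch the
 * active ops table and forward the call, failing cleanly if none is set.
 */
static inline int
power_freq_up(unsigned int lcore_id)
{
	struct core_ops *ops = active_ops;

	if (ops == NULL)
		return -ENOTSUP;
	return ops->freq_up(lcore_id);
}

/* Stub driver: always reports that the frequency changed. */
static int
stub_freq_up(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static struct core_ops stub_ops = { .freq_up = stub_freq_up };
```

The trade-off is one extra branch per call in exchange for a defined error when the wrappers are used before initialization.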
>>>
>>> - /* Auto detect Environment */
>>> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
>> management...");
>>> - ret = power_acpi_cpufreq_init(lcore_id);
>>> - if (ret == 0) {
>>> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
>>> - goto out;
>>> - }
>>> +int
>>> +rte_power_init(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops;
>>> + uint8_t env;
>>>
>>> - POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
>>> - ret = power_pstate_cpufreq_init(lcore_id);
>>> - if (ret == 0) {
>>> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
>>> - goto out;
>>> - }
>>> + if (global_default_env != PM_ENV_NOT_SET)
>>> + return global_power_core_ops->init(lcore_id);
>>>
>>> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
>> management...");
>>> - ret = power_amd_pstate_cpufreq_init(lcore_id);
>>> - if (ret == 0) {
>>> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
>>> - goto out;
>>> - }
>>> + POWER_LOG(INFO, "Env isn't set yet!");
>> remove this log?
>>> - POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
>>> - ret = power_cppc_cpufreq_init(lcore_id);
>>> - if (ret == 0) {
>>> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
>>> - goto out;
>>> - }
>>> + /* Auto detect Environment */
>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>> + if (ops) {
>>> + POWER_LOG(INFO,
>>> + "Attempting to initialise %s cpufreq power management...",
>>> + ops->name);
>>> + if (ops->init(lcore_id) == 0) {
>>> + for (env = 0; env < RTE_DIM(power_env_str); env++)
>>> + if (strncmp(ops->name, power_env_str[env],
>>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
>>> + rte_power_set_env(env);
>>> + return 0;
>>> + }
>>> + }
>>> + }
>> Can we change the logic of rte_power_set_env()? For example:
>> RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
>>     for (env = 0; env < RTE_DIM(power_env_str); env++) {
>>         if (strncmp(ops->name, power_env_str[env],
>>                 RTE_POWER_DRIVER_NAMESZ) == 0 &&
>>                 ops->init(lcore_id) == 0) {
>>             global_power_core_ops = ops;
>>             global_default_env = env;
>>         }
>>     }
>> }
>> That code is easier to follow.
> Yes, will fix in next version.
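One way the auto-detect loop could look is sketched below. It is not the code of either version of the patch: the driver list is a plain array instead of a tail queue, `NAME_SZ`, `env_from_name()` and `auto_detect()` are hypothetical, and stopping at the first driver whose init succeeds is an assumption about the intended behaviour. The environment names are the ones from power_env_str[] in the patch.

```c
#include <stddef.h>
#include <string.h>

#define NAME_SZ 24	/* hypothetical stand-in for RTE_POWER_DRIVER_NAMESZ */

/* Environment names in enum order, as listed in the patch. */
static const char *const env_names[] = {
	"not set", "acpi", "kvm-vm", "pstate", "cppc", "amd-pstate"
};
#define N_ENVS (sizeof(env_names) / sizeof(env_names[0]))

/* Reduced, hypothetical ops table. */
struct core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

static const struct core_ops *active_ops;
static int active_env;	/* 0 means "not set" */

/* Map a driver name to its environment index, or -1 if unknown. */
static int
env_from_name(const char *name)
{
	size_t i;

	for (i = 1; i < N_ENVS; i++)	/* skip index 0, "not set" */
		if (strncmp(name, env_names[i], NAME_SZ) == 0)
			return (int)i;
	return -1;
}

/* Try each driver in turn; select the first known one whose init succeeds. */
static int
auto_detect(const struct core_ops *const *drivers, size_t n,
		unsigned int lcore_id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int env = env_from_name(drivers[i]->name);

		if (env < 0 || drivers[i]->init(lcore_id) != 0)
			continue;
		active_ops = drivers[i];
		active_env = env;
		return 0;
	}
	return -1;
}

/* Stub init callbacks for illustration. */
static int init_fail(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }
```

Resolving the name before calling init avoids initializing a driver that cannot be mapped to an environment, and returning on the first success keeps a later driver from overwriting the selection.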
>
>>> +
>>> + POWER_LOG(ERR,
>>> + "Unable to set Power Management Environment for lcore %u",
>>> + lcore_id);
>>>
>>> - POWER_LOG(INFO, "Attempting to initialise VM power management...");
>>> - ret = power_kvm_vm_init(lcore_id);
>>> - if (ret == 0) {
>>> - rte_power_set_env(PM_ENV_KVM_VM);
>>> - goto out;
>>> - }
>>> - POWER_LOG(ERR, "Unable to set Power Management Environment for lcore
>> "
>>> - "%u", lcore_id);
>>> -out:
>>> - return ret;
>>> + return -1;
>>> }
>>>
>>> int
>>> rte_power_exit(unsigned int lcore_id)
>>> {
>>> - switch (global_default_env) {
>>> - case PM_ENV_ACPI_CPUFREQ:
>>> - return power_acpi_cpufreq_exit(lcore_id);
>>> - case PM_ENV_KVM_VM:
>>> - return power_kvm_vm_exit(lcore_id);
>>> - case PM_ENV_PSTATE_CPUFREQ:
>>> - return power_pstate_cpufreq_exit(lcore_id);
>>> - case PM_ENV_CPPC_CPUFREQ:
>>> - return power_cppc_cpufreq_exit(lcore_id);
>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>> - return power_amd_pstate_cpufreq_exit(lcore_id);
>>> - default:
>>> - POWER_LOG(ERR, "Environment has not been set, unable to exit
>> gracefully");
>>> + if (global_default_env != PM_ENV_NOT_SET)
>>> + return global_power_core_ops->exit(lcore_id);
>>>
>>> - }
>>> - return -1;
>>> + POWER_LOG(ERR,
>>> + "Environment has not been set, unable to exit gracefully");
>>>
>>> + return -1;
>>> }
>>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h index
>>> 4fa4afe399..5e4aacf08b 100644
>>> --- a/lib/power/rte_power.h
>>> +++ b/lib/power/rte_power.h
>>> @@ -1,5 +1,6 @@
>>> /* SPDX-License-Identifier: BSD-3-Clause
>>> * Copyright(c) 2010-2014 Intel Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> */
>>>
>>> #ifndef _RTE_POWER_H
>>> @@ -14,14 +15,21 @@
>>> #include <rte_log.h>
>>> #include <rte_power_guest_channel.h>
>>>
>>> +#include "rte_power_core_ops.h"
>>> +
>>> #ifdef __cplusplus
>>> extern "C" {
>>> #endif
>>>
>>> /* Power Management Environment State */
>>> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
>>> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
>>> - PM_ENV_AMD_PSTATE_CPUFREQ};
>>> +enum power_management_env {
>>> + PM_ENV_NOT_SET = 0,
>>> + PM_ENV_ACPI_CPUFREQ,
>>> + PM_ENV_KVM_VM,
>>> + PM_ENV_PSTATE_CPUFREQ,
>>> + PM_ENV_CPPC_CPUFREQ,
>>> + PM_ENV_AMD_PSTATE_CPUFREQ
>>> +};
>>>
>>> /**
>>> * Check if a specific power management environment type is supported on a
>>> @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
>>> */
>>> enum power_management_env rte_power_get_env(void);
>> I'd prefer that users not need to know which cpufreq driver is in use, which is friendlier to them.
>>
>> So we can reconsider whether this API is necessary.
> For any API changes, could we handle this as a separate RFC for discussion?
> It’s important that these changes are not included within the scope of this patch.
Agreed.
Can you post a separate RFC to discuss this improvement later?
>>> +/**
>>> + * @internal Get the power ops struct from its index.
>>> + *
>>> + * @return
>>> + * The pointer to the ops struct in the table if registered.
>>> + */
>>> +struct rte_power_core_ops *
>>> +rte_power_get_core_ops(void);
>>> +
>>> /**
>>> * Initialize power management for a specific lcore. If rte_power_set_env() has
>>> * not been called then an auto-detect of the environment will start and
>>> @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
>>> * @return
>>> * The number of available frequencies.
>>> */
>>> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
>>> - uint32_t num);
>>> +static inline uint32_t
>>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>
>>> -extern rte_power_freqs_t rte_power_freqs;
>>> + return ops->get_avail_freqs(lcore_id, freqs, n);
>>> +}
>>>
>>> /**
>>> * Return the current index of available frequencies of a specific lcore.
>>> @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
>>> * @return
>>> * The current index of available frequencies.
>>> */
>>> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
>>> +static inline uint32_t
>>> +rte_power_get_freq(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>
>>> -extern rte_power_get_freq_t rte_power_get_freq;
>>> + return ops->get_freq(lcore_id);
>>> +}
>>>
>>> /**
>>> * Set the new frequency for a specific lcore by indicating the index of
>>> @@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
>>> * - 0 on success without frequency changed.
>>> * - Negative on error.
>>> */
>>> -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
>>> -
>>> -extern rte_power_set_freq_t rte_power_set_freq;
>>> +static inline uint32_t
>>> +rte_power_set_freq(unsigned int lcore_id, uint32_t index) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>
>>> -/**
>>> - * Function pointer definition for generic frequency change functions. Review
>>> - * each environments specific documentation for usage.
>>> - *
>>> - * @param lcore_id
>>> - * lcore id.
>>> - *
>>> - * @return
>>> - * - 1 on success with frequency changed.
>>> - * - 0 on success without frequency changed.
>>> - * - Negative on error.
>>> - */
>>> -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
>>> + return ops->set_freq(lcore_id, index);
>>> +}
>>>
>>> /**
>>> * Scale up the frequency of a specific lcore according to the available
>>> * frequencies.
>>> * Review each environments specific documentation for usage.
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_up;
>>> +static inline int
>>> +rte_power_freq_up(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->freq_up(lcore_id);
>>> +}
>>>
>>> /**
>>> * Scale down the frequency of a specific lcore according to the available
>>> * frequencies.
>>> * Review each environments specific documentation for usage.
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_down;
>>> +static inline int
>>> +rte_power_freq_down(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->freq_down(lcore_id); }
>>>
>>> /**
>>> * Scale up the frequency of a specific lcore to the highest according to the
>>> * available frequencies.
>>> * Review each environments specific documentation for usage.
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_max;
>>> +static inline int
>>> +rte_power_freq_max(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->freq_max(lcore_id);
>>> +}
>>>
>>> /**
>>> * Scale down the frequency of a specific lcore to the lowest according to the
>>> * available frequencies.
>>> * Review each environments specific documentation for usage..
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_min;
>>> +static inline int
>>> +rte_power_freq_min(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->freq_min(lcore_id);
>>> +}
>>>
>>> /**
>>> * Query the Turbo Boost status of a specific lcore.
>>> * Review each environments specific documentation for usage..
>>> */
>>> -extern rte_power_freq_change_t rte_power_turbo_status;
>>> +static inline int
>>> +rte_power_turbo_status(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->turbo_status(lcore_id); }
>>>
>>> /**
>>> * Enable Turbo Boost for this lcore.
>>> * Review each environments specific documentation for usage..
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
>>> +static inline int
>>> +rte_power_freq_enable_turbo(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>> +
>>> + return ops->enable_turbo(lcore_id); }
>>>
>>> /**
>>> * Disable Turbo Boost for this lcore.
>>> * Review each environments specific documentation for usage..
>>> */
>>> -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
>>> +static inline int
>>> +rte_power_freq_disable_turbo(unsigned int lcore_id) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>
>>> -/**
>>> - * Power capabilities summary.
>>> - */
>>> -struct rte_power_core_capabilities {
>>> - union {
>>> - uint64_t capabilities;
>>> - struct {
>>> - uint64_t turbo:1; /**< Turbo can be enabled. */
>>> - uint64_t priority:1; /**< SST-BF high freq core */
>>> - };
>>> - };
>>> -};
>>> + return ops->disable_turbo(lcore_id); }
>>>
>>> /**
>>> * Returns power capabilities for a specific lcore.
>>> @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
>>> * - 0 on success.
>>> * - Negative on error.
>>> */
>>> -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
>>> - struct rte_power_core_capabilities *caps);
>>> +static inline int
>>> +rte_power_get_capabilities(unsigned int lcore_id,
>>> + struct rte_power_core_capabilities *caps) {
>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>
>>> -extern rte_power_get_capabilities_t rte_power_get_capabilities;
>>> + return ops->get_caps(lcore_id, caps); }
>>>
>>> #ifdef __cplusplus
>>> }
>>> diff --git a/lib/power/rte_power_core_ops.h
>>> b/lib/power/rte_power_core_ops.h new file mode 100644 index
>>> 0000000000..356a64df79
>>> --- /dev/null
>>> +++ b/lib/power/rte_power_core_ops.h
>>> @@ -0,0 +1,208 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2010-2014 Intel Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> + */
>>> +
>>> +#ifndef _RTE_POWER_CORE_OPS_H
>>> +#define _RTE_POWER_CORE_OPS_H
>>> +
>> I suggest renaming this file to rte_power_cpufreq_api.h.
>> That would make the role of this file clearer.
>>> +__rte_internal
>>> +int rte_power_register_ops(struct rte_power_core_ops *ops);
>>> +
>>> +/**
>>> + * Macro to statically register the ops of a cpufreq driver.
>>> + */
>>> +#define RTE_POWER_REGISTER_OPS(ops) \
>>> + RTE_INIT(power_hdlr_init_##ops) \
>>> + { \
>>> + rte_power_register_ops(&ops); \
>>> + }
>>> +
>>> +/**
>>> + * @internal Get the power ops struct from its index.
>>> + *
>>> + * @return
>>> + * The pointer to the ops struct in the table if registered.
>>> + */
>>> +struct rte_power_core_ops *
>>> +rte_power_get_core_ops(void);
>>> +
>>> +#ifdef __cplusplus
>>> +}
>>> +#endif
>>> +
>>> +#endif
>>> diff --git a/lib/power/version.map b/lib/power/version.map index
>>> c9a226614e..bd64e0828f 100644
>>> --- a/lib/power/version.map
>>> +++ b/lib/power/version.map
>>> @@ -51,4 +51,18 @@ EXPERIMENTAL {
>>> rte_power_set_uncore_env;
>>> rte_power_uncore_freqs;
>>> rte_power_unset_uncore_env;
>>> + # added in 24.07
>> 24.07-->24.11?
>>> + rte_power_logtype;
>>> +};
>>> +
>>> +INTERNAL {
>>> + global:
>>> +
>>> + rte_power_register_ops;
>>> + cpufreq_check_scaling_driver;
>>> + power_set_governor;
>>> + open_core_sysfs_file;
>>> + read_core_sysfs_u32;
>>> + read_core_sysfs_s;
>>> + write_core_sysfs_s;
>>> };
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v2 1/4] power: refactor core power management library
2024-09-13 7:34 ` lihuisong (C)
@ 2024-09-18 8:37 ` Tummala, Sivaprasad
2024-09-19 3:37 ` lihuisong (C)
0 siblings, 1 reply; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-09-18 8:37 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau,
cristian.dumitrescu, jerinj, konstantin.ananyev, Yigit, Ferruh,
gakhil
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Friday, September 13, 2024 1:05 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> radu.nicolau@intel.com; cristian.dumitrescu@intel.com; jerinj@marvell.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>
>
> On 2024/9/12 19:17, Tummala, Sivaprasad wrote:
> > Hi Huisong,
> >
> > Please find my response inline.
> >
> >> -----Original Message-----
> >> From: lihuisong (C) <lihuisong@huawei.com>
> >> Sent: Tuesday, August 27, 2024 1:51 PM
> >> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> >> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> >> radu.nicolau@intel.com; cristian.dumitrescu@intel.com;
> >> jerinj@marvell.com; konstantin.ananyev@huawei.com; Yigit, Ferruh
> >> <Ferruh.Yigit@amd.com>; gakhil@marvell.com
> >> Subject: Re: [PATCH v2 1/4] power: refactor core power management
> >> library
> >>
> >>
> >> Hi Sivaprasad,
> >>
> >> Some comments inline.
> >>
> >> /Huisong
> >>
> >> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
> >>> This patch introduces a comprehensive refactor to the core power
> >>> management library. The primary focus is on improving modularity and
> >>> organization by relocating specific driver implementations from the
> >>> 'lib/power' directory to dedicated directories within
> >>> 'drivers/power/core/*'. The adjustment of meson.build files enables
> >>> the selective activation of individual drivers.
> >>> These changes contribute to a significant enhancement in code
> >>> organization, providing a clearer structure for driver implementations.
> >>> The refactor aims to improve overall code clarity and boost
> >>> maintainability. Additionally, it establishes a foundation for
> >>> future development, allowing for more focused work on individual
> >>> drivers and seamless integration of forthcoming enhancements.
> >>>
> >>> v2:
> >>> - added NULL check for global_core_ops in rte_power_get_core_ops
> >>>
> >>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> >>> ---
> >>> drivers/meson.build | 1 +
> >>> .../power/acpi/acpi_cpufreq.c | 22 +-
> >>> .../power/acpi/acpi_cpufreq.h | 6 +-
> >>> drivers/power/acpi/meson.build | 10 +
> >>> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> >>> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
> >>> drivers/power/amd_pstate/meson.build | 10 +
> >>> .../power/cppc/cppc_cpufreq.c | 22 +-
> >>> .../power/cppc/cppc_cpufreq.h | 8 +-
> >>> drivers/power/cppc/meson.build | 10 +
> >>> .../power/kvm_vm}/guest_channel.c | 0
> >>> .../power/kvm_vm}/guest_channel.h | 0
> >>> .../power/kvm_vm/kvm_vm.c | 22 +-
> >>> .../power/kvm_vm/kvm_vm.h | 6 +-
> >>> drivers/power/kvm_vm/meson.build | 16 +
> >>> drivers/power/meson.build | 12 +
> >>> drivers/power/pstate/meson.build | 10 +
> >>> .../power/pstate/pstate_cpufreq.c | 22 +-
> >>> .../power/pstate/pstate_cpufreq.h | 6 +-
> >>> lib/power/meson.build | 7 +-
> >>> lib/power/power_common.c | 2 +-
> >>> lib/power/power_common.h | 16 +-
> >>> lib/power/rte_power.c | 291 ++++++------------
> >>> lib/power/rte_power.h | 139 ++++++---
> >>> lib/power/rte_power_core_ops.h | 208 +++++++++++++
> >>> lib/power/version.map | 14 +
> >>> 26 files changed, 621 insertions(+), 271 deletions(-)
> >>> rename lib/power/power_acpi_cpufreq.c =>
> >>> drivers/power/acpi/acpi_cpufreq.c
> >> (95%)
> >>> rename lib/power/power_acpi_cpufreq.h =>
> >>> drivers/power/acpi/acpi_cpufreq.h
> >> (98%)
> >>> create mode 100644 drivers/power/acpi/meson.build
> >>> rename lib/power/power_amd_pstate_cpufreq.c =>
> >> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> >>> rename lib/power/power_amd_pstate_cpufreq.h =>
> >> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> >>> create mode 100644 drivers/power/amd_pstate/meson.build
> >>> rename lib/power/power_cppc_cpufreq.c =>
> >>> drivers/power/cppc/cppc_cpufreq.c
> >> (95%)
> >>> rename lib/power/power_cppc_cpufreq.h =>
> >>> drivers/power/cppc/cppc_cpufreq.h
> >> (97%)
> >>> create mode 100644 drivers/power/cppc/meson.build
> >>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> >>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> >>> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c
> (82%)
> >>> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h
> (98%)
> >>> create mode 100644 drivers/power/kvm_vm/meson.build
> >>> create mode 100644 drivers/power/meson.build
> >>> create mode 100644 drivers/power/pstate/meson.build
> >>> rename lib/power/power_pstate_cpufreq.c =>
> >> drivers/power/pstate/pstate_cpufreq.c (96%)
> >>> rename lib/power/power_pstate_cpufreq.h =>
> >> drivers/power/pstate/pstate_cpufreq.h (98%)
> >>> create mode 100644 lib/power/rte_power_core_ops.h
> >> How about using the following directory structure?
> >> *For power libs*
> >> lib/power/power_common.*
> >> lib/power/rte_power_pmd_mgmt.*
> >> lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be
> >> simple for us, but I'm not sure if we can put the init of core, uncore
> >> and pmd mgmt to rte_power_init.c in rte_power.c.)
> >> lib/power/rte_power_uncore_freq_api.*
> > Yes, renaming rte_power.c is definitely a possible incremental change that
> > could be considered later.
> > However, for the time being, our focus will be on refactoring the cpufreq
> > drivers only.
> rte_power.c only handles the initialization of the cpufreq driver. Since you are
> reworking the core and uncore power libraries and rearranging the directories
> under power, I think renaming this file is more appropriate in this series.
> >> *And has directories under drivers/power:*
> >> 1> For core dvfs driver:
> >> drivers/power/cpufreq/acpi_cpufreq.c
> >> drivers/power/cpufreq/cppc_cpufreq.c
> >> drivers/power/cpufreq/amd_pstate_cpufreq.c
> >> drivers/power/cpufreq/intel_pstate_cpufreq.c
> >> drivers/power/cpufreq/kvm_cpufreq.c
> >> Each cpufreq driver has little code and is unlikely to grow much,
> >> so it doesn't need a directory of its own.
> >>
> >> 2> For uncore dvfs driver:
> >> drivers/power/uncorefreq/intel_uncore.*
> >>> diff --git a/drivers/meson.build b/drivers/meson.build index
> >>> 66931d4241..9d77e0deab 100644
> >>> --- a/drivers/meson.build
> >>> +++ b/drivers/meson.build
> >>> @@ -29,6 +29,7 @@ subdirs = [
> >>> 'event', # depends on common, bus, mempool and net.
> >>> 'baseband', # depends on common and bus.
> >>> 'gpu', # depends on common and bus.
> >>> + 'power', # depends on common (in future).
> >>> ]
> >>>
> >>> if meson.is_cross_build()
> >>> diff --git a/lib/power/power_acpi_cpufreq.c
> >>> b/drivers/power/acpi/acpi_cpufreq.c
> >>> similarity index 95%
> >>> rename from lib/power/power_acpi_cpufreq.c rename to
> >>> drivers/power/acpi/acpi_cpufreq.c
> >> I don't suggest creating one directory for each cpufreq driver,
> >> because the pstate drivers also comply with the ACPI spec, right?
> >> In addition, each cpufreq driver has little code.
> >> Having just one file per directory is not good.
> > One of our objectives for the refactoring is to selectively disable
> > non-essential drivers using Meson build options.
> > However, by rearranging the driver structure, we risk disrupting this capability.
> I understand your purpose.
> The cpufreq library already has the feature and interface to detect which driver
> to use, right? So it is not necessary for the cpufreq library to introduce Meson
> build options, which would probably make it more complicated.
In Meson, you can reduce code size by disabling specific drivers or components through build options,
allowing you to exclude unnecessary features. At runtime, the library will automatically detect the available driver,
and if it's not present in the build, initialization will fail.
We're not introducing any new complexities; rather, we aim to ensure that the drivers in drivers/power/*
are consistent with the other drivers.
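To illustrate the point, here is a minimal, self-contained sketch (not the patch code).
All demo_* names are hypothetical stand-ins for rte_power_core_ops, rte_power_register_ops()
and RTE_POWER_REGISTER_OPS, and a GCC/Clang constructor attribute is assumed in place of RTE_INIT.
A driver registers itself only if it is compiled in, so selecting a driver that was left out of
the build simply fails at lookup time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_core_ops {
	const char *name;                   /* driver name, e.g. "acpi" */
	int (*init)(unsigned int lcore_id); /* per-lcore init callback */
	struct demo_core_ops *next;         /* simple singly linked registry */
};

static struct demo_core_ops *demo_ops_list;   /* all registered drivers */
static struct demo_core_ops *demo_active_ops; /* chosen by demo_set_env() */

/* Add a driver to the registry; reject incomplete ops tables. */
static int
demo_register_ops(struct demo_core_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;
	ops->next = demo_ops_list;
	demo_ops_list = ops;
	return 0;
}

/* Runs before main(), but only exists if the driver file is in the build. */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) demo_init_##ops(void) \
	{ \
		demo_register_ops(&ops); \
	}

/* Select a driver by name; fails if it was never registered. */
static int
demo_set_env(const char *name)
{
	struct demo_core_ops *ops;

	for (ops = demo_ops_list; ops != NULL; ops = ops->next) {
		if (strcmp(ops->name, name) == 0) {
			demo_active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Return the selected ops table; a driver must have been selected first. */
static struct demo_core_ops *
demo_get_core_ops(void)
{
	assert(demo_active_ops != NULL);
	return demo_active_ops;
}

/* The only driver "enabled" in this sketch. */
static int
demo_acpi_init(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}

static struct demo_core_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
};
DEMO_REGISTER_OPS(demo_acpi_ops)
```

With only the "acpi" driver built in, demo_set_env("acpi") returns 0, while demo_set_env("cppc")
returns -1 because no such driver was ever registered.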
> >>> index 81996e1c13..8637c69703 100644
> >>> --- a/lib/power/power_acpi_cpufreq.c
> >>> +++ b/drivers/power/acpi/acpi_cpufreq.c
> >>> @@ -10,7 +10,7 @@
> >>> #include <rte_stdatomic.h>
> >>> #include <rte_string_fns.h>
> >>>
> >>> -#include "power_acpi_cpufreq.h"
> >>> +#include "acpi_cpufreq.h"
> >>> #include "power_common.h"
> >>>
> >> <...>
> >>> +if not is_linux
> >>> + build = false
> >>> + reason = 'only supported on Linux'
> >>> +endif
> >>> +sources = files('pstate_cpufreq.c')
> >>> +
> >>> +deps += ['power']
> >>> diff --git a/lib/power/power_pstate_cpufreq.c
> >>> b/drivers/power/pstate/pstate_cpufreq.c
> >>> similarity index 96%
> >>> rename from lib/power/power_pstate_cpufreq.c rename to
> >>> drivers/power/pstate/pstate_cpufreq.c
> >> pstate_cpufreq.c is actually intel_pstate cpufreq driver, right?
> >> So how about renaming this file to intel_pstate_cpufreq.c?
> > Yes, will fix this in next version.
> >>> index 2343121621..c32b1adabc 100644
> >>> --- a/lib/power/power_pstate_cpufreq.c
> >>> +++ b/drivers/power/pstate/pstate_cpufreq.c
> >>> @@ -15,7 +15,7 @@
> >>> #include <rte_stdatomic.h>
> >>>
> >>> #include "rte_power_pmd_mgmt.h"
> >>> -#include "power_pstate_cpufreq.h"
> >>> +#include "pstate_cpufreq.h"
> >>> #include "power_common.h"
> >>>
> >>> /* macros used for rounding frequency to nearest 100000 */ @@
> >>> -888,3
> >>> +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
> >>>
> >>> return 0;
> >>> }
> >>> +
> >> <...>
> >>> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> >>> index 590986d5ef..6c06411e8b 100644
> >>> --- a/lib/power/power_common.c
> >>> +++ b/lib/power/power_common.c
> >>> @@ -12,7 +12,7 @@
> >>>
> >>> #include "power_common.h"
> >>>
> >>> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> >>> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
> >>>
> >>> #define POWER_SYSFILE_SCALING_DRIVER \
> >>> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> >>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> >>> index
> >>> 83f742f42a..767686ee12 100644
> >>> --- a/lib/power/power_common.h
> >>> +++ b/lib/power/power_common.h
> >>> @@ -6,12 +6,13 @@
> >>> #define _POWER_COMMON_H_
> >>>
> >>> #include <rte_common.h>
> >>> +#include <rte_compat.h>
> >>> #include <rte_log.h>
> >>>
> >>> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
> >>>
> >>> -extern int power_logtype;
> >>> -#define RTE_LOGTYPE_POWER power_logtype
> >>> +extern int rte_power_logtype;
> >>> +#define RTE_LOGTYPE_POWER rte_power_logtype
> >>> #define POWER_LOG(level, ...) \
> >>> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
> >>>
> >>> @@ -23,13 +24,24 @@ extern int power_logtype;
> >>> #endif
> >>>
> >>> /* check if scaling driver matches one we want */
> >>> +__rte_internal
> >>> int cpufreq_check_scaling_driver(const char *driver);
> >>> +
> >>> +__rte_internal
> >>> int power_set_governor(unsigned int lcore_id, const char *new_governor,
> >>> char *orig_governor, size_t orig_governor_len);
> >> I suggest moving cpufreq interfaces like this to the
> >> rte_power_cpufreq_api.* I proposed above.
> > This is an internal API and isn’t intended for direct use by applications.
> > By moving it to rte_power_*, we risk exposing it inadvertently.
> We don't expose these to applications; applications do not include this header file.
> power_set_governor() and cpufreq_check_scaling_driver() are only used by the cpufreq
> drivers, so they should only be visible to the cpufreq lib or module, right?
> But if these interfaces are in power_common.h, the pmd_mgmt and uncore drivers also
> include this header file and can see them. This is not good.
> As far as I can see, power_common.h should only contain the interfaces that are used
> by all power libs or sub-modules, like cpufreq, uncore, pmd_mgmt and so on.
OK, will move these internal APIs from power_common.h to a separate header file.
> >
> >> The interfaces in power_common.* can be used by all power modules, like
> >> core/uncore/pmd mgmt.
> >>> +
> >>> +__rte_internal
> >>> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> >>> __rte_format_printf(3, 4);
> >>> +
> >>> +__rte_internal
> >>> int read_core_sysfs_u32(FILE *f, uint32_t *val);
> >>> +
> >>> +__rte_internal
> >>> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> >>> +
> >>> +__rte_internal
> >>> int write_core_sysfs_s(FILE *f, const char *str);
> >>>
> >>> #endif /* _POWER_COMMON_H_ */
> >>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> >> The name of the rte_power.c file is inappropriate now. The content of
> >> this file is just for cpufreq, right?
> >> So I suggest renaming this file to
> >> rte_power_cpufreq_api.c
> > Yes, renaming rte_power.c to rte_power_cpufreq.c is definitely a
> > possible incremental change; I will fix this in a separate patch.
> >
> >>> index 36c3f3da98..2bf6d40517 100644
> >>> --- a/lib/power/rte_power.c
> >>> +++ b/lib/power/rte_power.c
> >>> @@ -8,153 +8,86 @@
> >>> #include <rte_spinlock.h>
> >>>
> >>> #include "rte_power.h"
> >>> -#include "power_acpi_cpufreq.h"
> >>> -#include "power_cppc_cpufreq.h"
> >>> #include "power_common.h"
> >>> -#include "power_kvm_vm.h"
> >>> -#include "power_pstate_cpufreq.h"
> >>> -#include "power_amd_pstate_cpufreq.h"
> >>>
> >>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> >>> +static enum power_management_env global_default_env =
> >> PM_ENV_NOT_SET;
> >>> +static struct rte_power_core_ops *global_power_core_ops;
> >>>
> >>> static rte_spinlock_t global_env_cfg_lock =
> >>> RTE_SPINLOCK_INITIALIZER;
> >>> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> >>> + TAILQ_HEAD_INITIALIZER(core_ops_list);
> >>>
> >>> -/* function pointers */
> >>> -rte_power_freqs_t rte_power_freqs = NULL; -rte_power_get_freq_t
> >>> rte_power_get_freq = NULL; -rte_power_set_freq_t rte_power_set_freq
> >>> = NULL; -rte_power_freq_change_t rte_power_freq_up = NULL;
> >>> -rte_power_freq_change_t rte_power_freq_down = NULL;
> >>> -rte_power_freq_change_t rte_power_freq_max = NULL;
> >>> -rte_power_freq_change_t rte_power_freq_min = NULL;
> >>> -rte_power_freq_change_t rte_power_turbo_status;
> >>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> >>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> >>> -rte_power_get_capabilities_t rte_power_get_capabilities;
> >>> -
> >>> -static void
> >>> -reset_power_function_ptrs(void)
> >>> +
> >>> +const char *power_env_str[] = {
> >>> + "not set",
> >>> + "acpi",
> >>> + "kvm-vm",
> >>> + "pstate",
> >>> + "cppc",
> >>> + "amd-pstate"
> >>> +};
> >>> +
> >>> +/* register the ops struct in rte_power_core_ops, return 0 on
> >>> +success. */ int rte_power_register_ops(struct rte_power_core_ops
> >>> +*driver_ops)
> >>> {
> >>> - rte_power_freqs = NULL;
> >>> - rte_power_get_freq = NULL;
> >>> - rte_power_set_freq = NULL;
> >>> - rte_power_freq_up = NULL;
> >>> - rte_power_freq_down = NULL;
> >>> - rte_power_freq_max = NULL;
> >>> - rte_power_freq_min = NULL;
> >>> - rte_power_turbo_status = NULL;
> >>> - rte_power_freq_enable_turbo = NULL;
> >>> - rte_power_freq_disable_turbo = NULL;
> >>> - rte_power_get_capabilities = NULL;
> >>> + if (!driver_ops->init || !driver_ops->exit ||
> >>> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
> >>> + !driver_ops->get_freq || !driver_ops->set_freq ||
> >>> + !driver_ops->freq_up || !driver_ops->freq_down ||
> >>> + !driver_ops->freq_max || !driver_ops->freq_min ||
> >>> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
> >>> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
> >>> + POWER_LOG(ERR, "Missing callbacks while registering
> >>> + power ops");
> >> turbo_status(), enable_turbo() and disable_turbo() are not necessary, right?
> > Nope, this is required to get the current status, unlike the capability API
> > (get_caps()).
> ok
> >> These depend on the capabilities from get_caps().
> >>> + return -EINVAL;
> >>> + }
> >>> +
> >>> + TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
> >>> +
> >>> + return 0;
> >>> }
> >>>
> >>> int
> >>> rte_power_check_env_supported(enum power_management_env env)
> >>> {
> >>> - switch (env) {
> >>> - case PM_ENV_ACPI_CPUFREQ:
> >>> - return power_acpi_cpufreq_check_supported();
> >>> - case PM_ENV_PSTATE_CPUFREQ:
> >>> - return power_pstate_cpufreq_check_supported();
> >>> - case PM_ENV_KVM_VM:
> >>> - return power_kvm_vm_check_supported();
> >>> - case PM_ENV_CPPC_CPUFREQ:
> >>> - return power_cppc_cpufreq_check_supported();
> >>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> >>> - return power_amd_pstate_cpufreq_check_supported();
> >>> - default:
> >>> - rte_errno = EINVAL;
> >>> - return -1;
> >>> - }
> >>> + struct rte_power_core_ops *ops;
> >>> +
> >>> + if (env >= RTE_DIM(power_env_str))
> >>> + return 0;
> >>> +
> >>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> >>> + if (strncmp(ops->name, power_env_str[env],
> >>> + RTE_POWER_DRIVER_NAMESZ) == 0)
> >>> + return ops->check_env_support();
> >>> +
> >>> + return 0;
> >>> }
> >>>
> >>> int
> >>> rte_power_set_env(enum power_management_env env)
> >>> {
> >>> + struct rte_power_core_ops *ops;
> >>> + int ret = -1;
> >>> +
> >>> rte_spinlock_lock(&global_env_cfg_lock);
> >>>
> >>> if (global_default_env != PM_ENV_NOT_SET) {
> >>> POWER_LOG(ERR, "Power Management Environment already
> set.");
> >>> - rte_spinlock_unlock(&global_env_cfg_lock);
> >>> - return -1;
> >>> - }
> >>> -
> >> <...>
> >>> - if (ret == 0)
> >>> - global_default_env = env;
> >>> - else {
> >>> - global_default_env = PM_ENV_NOT_SET;
> >>> - reset_power_function_ptrs();
> >>> - }
> >>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> >>> + if (strncmp(ops->name, power_env_str[env],
> >>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
> >>> + global_power_core_ops = ops;
> >>> + global_default_env = env;
> >>> + ret = 0;
> >>> + goto out;
> >>> + }
> >>> + POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
> >>> + env);
> >>>
> >>> +out:
> >>> rte_spinlock_unlock(&global_env_cfg_lock);
> >>> return ret;
> >>> }
> >>> @@ -164,94 +97,66 @@ rte_power_unset_env(void)
> >>> {
> >>> rte_spinlock_lock(&global_env_cfg_lock);
> >>> global_default_env = PM_ENV_NOT_SET;
> >>> - reset_power_function_ptrs();
> >>> + global_power_core_ops = NULL;
> >>> rte_spinlock_unlock(&global_env_cfg_lock);
> >>> }
> >>>
> >>> enum power_management_env
> >>> -rte_power_get_env(void) {
> >>> +rte_power_get_env(void)
> >>> +{
> >>> return global_default_env;
> >>> }
> >>>
> >>> -int
> >>> -rte_power_init(unsigned int lcore_id)
> >>> +struct rte_power_core_ops *
> >>> +rte_power_get_core_ops(void)
> >>> {
> >>> - int ret = -1;
> >>> + RTE_ASSERT(global_power_core_ops != NULL);
> >>>
> >>> - switch (global_default_env) {
> >>> - case PM_ENV_ACPI_CPUFREQ:
> >>> - return power_acpi_cpufreq_init(lcore_id);
> >>> - case PM_ENV_KVM_VM:
> >>> - return power_kvm_vm_init(lcore_id);
> >>> - case PM_ENV_PSTATE_CPUFREQ:
> >>> - return power_pstate_cpufreq_init(lcore_id);
> >>> - case PM_ENV_CPPC_CPUFREQ:
> >>> - return power_cppc_cpufreq_init(lcore_id);
> >>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> >>> - return power_amd_pstate_cpufreq_init(lcore_id);
> >>> - default:
> >>> - POWER_LOG(INFO, "Env isn't set yet!");
> >>> - }
> >>> + return global_power_core_ops;
> >>> +}
> >>>
> >>> - /* Auto detect Environment */
> >>> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
> >> management...");
> >>> - ret = power_acpi_cpufreq_init(lcore_id);
> >>> - if (ret == 0) {
> >>> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
> >>> - goto out;
> >>> - }
> >>> +int
> >>> +rte_power_init(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops;
> >>> + uint8_t env;
> >>>
> >>> - POWER_LOG(INFO, "Attempting to initialise PSTAT power
> management...");
> >>> - ret = power_pstate_cpufreq_init(lcore_id);
> >>> - if (ret == 0) {
> >>> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
> >>> - goto out;
> >>> - }
> >>> + if (global_default_env != PM_ENV_NOT_SET)
> >>> + return global_power_core_ops->init(lcore_id);
> >>>
> >>> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
> >> management...");
> >>> - ret = power_amd_pstate_cpufreq_init(lcore_id);
> >>> - if (ret == 0) {
> >>> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
> >>> - goto out;
> >>> - }
> >>> + POWER_LOG(INFO, "Env isn't set yet!");
> >> remove this log?
> >>> - POWER_LOG(INFO, "Attempting to initialise CPPC power
> management...");
> >>> - ret = power_cppc_cpufreq_init(lcore_id);
> >>> - if (ret == 0) {
> >>> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
> >>> - goto out;
> >>> - }
> >>> + /* Auto detect Environment */
> >>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
> >>> + if (ops) {
> >>> + POWER_LOG(INFO,
> >>> + "Attempting to initialise %s cpufreq power management...",
> >>> + ops->name);
> >>> + if (ops->init(lcore_id) == 0) {
> >>> + for (env = 0; env < RTE_DIM(power_env_str); env++)
> >>> + if (strncmp(ops->name, power_env_str[env],
> >>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
> >>> + rte_power_set_env(env);
> >>> + return 0;
> >>> + }
> >>> + }
> >>> + }
> >> Can we change the logic of rte_power_set_env()? like:
> >> RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
> >> for (env = 0; env < RTE_DIM(power_env_str); env++) {
> >> if (strncmp(ops->name, power_env_str[env],
> >> RTE_POWER_DRIVER_NAMESZ) == 0 &&
> >> ops->init(lcore_id) == 0) {
> >> global_power_core_ops = ops;
> >> global_default_env = env;
> >> }
> >> }
> >> }
> >> That code is easier to follow.
> > Yes, will fix in next version.
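For reference, a rough, self-contained sketch of how the simplified auto-detect loop might look.
This is only an illustration under stated assumptions: the demo_* names are hypothetical, a plain
array stands in for the TAILQ of registered ops, and the early return on the first successful
init() is an assumption that the snippet above does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

/* Same ordering as power_env_str[] in the patch; index 0 means "not set". */
static const char *demo_env_str[] = {
	"not set", "acpi", "kvm-vm", "pstate", "cppc", "amd-pstate"
};

static const struct demo_ops *demo_active_ops;
static unsigned int demo_active_env; /* 0 == not set */

/*
 * Try each registered driver in order; the first one whose name maps to a
 * known environment and whose init() succeeds becomes the active driver.
 */
static int
demo_auto_detect(const struct demo_ops *drivers, size_t n,
		unsigned int lcore_id)
{
	size_t i;
	unsigned int env;

	assert(drivers != NULL);

	for (i = 0; i < n; i++) {
		for (env = 1; env < DEMO_DIM(demo_env_str); env++) {
			if (strcmp(drivers[i].name, demo_env_str[env]) == 0 &&
					drivers[i].init(lcore_id) == 0) {
				demo_active_ops = &drivers[i];
				demo_active_env = env;
				return 0; /* assumed: stop at first success */
			}
		}
	}

	return -1; /* no driver could be initialised */
}

/* Dummy init callbacks used to exercise the loop. */
static int
demo_init_fail(unsigned int lcore_id)
{
	(void)lcore_id;
	return -1;
}

static int
demo_init_ok(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}
```

Given drivers "acpi" (init fails), "cppc" (init succeeds) and "amd-pstate" (init succeeds),
in that order, the sketch selects "cppc" and never calls init() on "amd-pstate".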
> >
> >>> +
> >>> + POWER_LOG(ERR,
> >>> + "Unable to set Power Management Environment for lcore %u",
> >>> + lcore_id);
> >>>
> >>> - POWER_LOG(INFO, "Attempting to initialise VM power management...");
> >>> - ret = power_kvm_vm_init(lcore_id);
> >>> - if (ret == 0) {
> >>> - rte_power_set_env(PM_ENV_KVM_VM);
> >>> - goto out;
> >>> - }
> >>> - POWER_LOG(ERR, "Unable to set Power Management Environment for
> lcore
> >> "
> >>> - "%u", lcore_id);
> >>> -out:
> >>> - return ret;
> >>> + return -1;
> >>> }
> >>>
> >>> int
> >>> rte_power_exit(unsigned int lcore_id)
> >>> {
> >>> - switch (global_default_env) {
> >>> - case PM_ENV_ACPI_CPUFREQ:
> >>> - return power_acpi_cpufreq_exit(lcore_id);
> >>> - case PM_ENV_KVM_VM:
> >>> - return power_kvm_vm_exit(lcore_id);
> >>> - case PM_ENV_PSTATE_CPUFREQ:
> >>> - return power_pstate_cpufreq_exit(lcore_id);
> >>> - case PM_ENV_CPPC_CPUFREQ:
> >>> - return power_cppc_cpufreq_exit(lcore_id);
> >>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> >>> - return power_amd_pstate_cpufreq_exit(lcore_id);
> >>> - default:
> >>> - POWER_LOG(ERR, "Environment has not been set, unable to exit
> >> gracefully");
> >>> + if (global_default_env != PM_ENV_NOT_SET)
> >>> + return global_power_core_ops->exit(lcore_id);
> >>>
> >>> - }
> >>> - return -1;
> >>> + POWER_LOG(ERR,
> >>> + "Environment has not been set, unable to exit
> >>> + gracefully");
> >>>
> >>> + return -1;
> >>> }
> >>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h index
> >>> 4fa4afe399..5e4aacf08b 100644
> >>> --- a/lib/power/rte_power.h
> >>> +++ b/lib/power/rte_power.h
> >>> @@ -1,5 +1,6 @@
> >>> /* SPDX-License-Identifier: BSD-3-Clause
> >>> * Copyright(c) 2010-2014 Intel Corporation
> >>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> >>> */
> >>>
> >>> #ifndef _RTE_POWER_H
> >>> @@ -14,14 +15,21 @@
> >>> #include <rte_log.h>
> >>> #include <rte_power_guest_channel.h>
> >>>
> >>> +#include "rte_power_core_ops.h"
> >>> +
> >>> #ifdef __cplusplus
> >>> extern "C" {
> >>> #endif
> >>>
> >>> /* Power Management Environment State */ -enum
> >>> power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ,
> PM_ENV_KVM_VM,
> >>> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> >>> - PM_ENV_AMD_PSTATE_CPUFREQ};
> >>> +enum power_management_env {
> >>> + PM_ENV_NOT_SET = 0,
> >>> + PM_ENV_ACPI_CPUFREQ,
> >>> + PM_ENV_KVM_VM,
> >>> + PM_ENV_PSTATE_CPUFREQ,
> >>> + PM_ENV_CPPC_CPUFREQ,
> >>> + PM_ENV_AMD_PSTATE_CPUFREQ
> >>> +};
> >>>
> >>> /**
> >>> * Check if a specific power management environment type is
> >>> supported on a @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
> >>> */
> >>> enum power_management_env rte_power_get_env(void);
> >> I'd prefer that users not need to know which cpufreq driver is in use, which is friendlier to them.
> >>
> >> So we can rethink whether this API is necessary.
> > For any API changes, could we handle this as a separate RFC for discussion?
> > It’s important that these changes are not included within the scope of this patch.
> Agreed.
> Can you post a separate RFC to discuss this improvement later?
> >>> +/**
> >>> + * @internal Get the power ops struct from its index.
> >>> + *
> >>> + * @return
> >>> + * The pointer to the ops struct in the table if registered.
> >>> + */
> >>> +struct rte_power_core_ops *
> >>> +rte_power_get_core_ops(void);
> >>> +
> >>> /**
> >>> * Initialize power management for a specific lcore. If rte_power_set_env()
> has
> >>> * not been called then an auto-detect of the environment will
> >>> start and @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
> >>> * @return
> >>> * The number of available frequencies.
> >>> */
> >>> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
> >>> - uint32_t num);
> >>> +static inline uint32_t
> >>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>>
> >>> -extern rte_power_freqs_t rte_power_freqs;
> >>> + return ops->get_avail_freqs(lcore_id, freqs, n); }
> >>>
> >>> /**
> >>> * Return the current index of available frequencies of a specific lcore.
> >>> @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
> >>> * @return
> >>> * The current index of available frequencies.
> >>> */
> >>> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
> >>> +static inline uint32_t
> >>> +rte_power_get_freq(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>>
> >>> -extern rte_power_get_freq_t rte_power_get_freq;
> >>> + return ops->get_freq(lcore_id); }
> >>>
> >>> /**
> >>> * Set the new frequency for a specific lcore by indicating the
> >>> index of @@ -144,82 +168,101 @@ extern rte_power_get_freq_t
> >> rte_power_get_freq;
> >>> * - 0 on success without frequency changed.
> >>> * - Negative on error.
> >>> */
> >>> -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t
> >>> index);
> >>> -
> >>> -extern rte_power_set_freq_t rte_power_set_freq;
> >>> +static inline uint32_t
> >>> +rte_power_set_freq(unsigned int lcore_id, uint32_t index) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>>
> >>> -/**
> >>> - * Function pointer definition for generic frequency change
> >>> functions. Review
> >>> - * each environments specific documentation for usage.
> >>> - *
> >>> - * @param lcore_id
> >>> - * lcore id.
> >>> - *
> >>> - * @return
> >>> - * - 1 on success with frequency changed.
> >>> - * - 0 on success without frequency changed.
> >>> - * - Negative on error.
> >>> - */
> >>> -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
> >>> + return ops->set_freq(lcore_id, index); }
> >>>
> >>> /**
> >>> * Scale up the frequency of a specific lcore according to the available
> >>> * frequencies.
> >>> * Review each environments specific documentation for usage.
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_up;
> >>> +static inline int
> >>> +rte_power_freq_up(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->freq_up(lcore_id); }
> >>>
> >>> /**
> >>> * Scale down the frequency of a specific lcore according to the available
> >>> * frequencies.
> >>> * Review each environments specific documentation for usage.
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_down;
> >>> +static inline int
> >>> +rte_power_freq_down(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->freq_down(lcore_id); }
> >>>
> >>> /**
> >>> * Scale up the frequency of a specific lcore to the highest according to the
> >>> * available frequencies.
> >>> * Review each environments specific documentation for usage.
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_max;
> >>> +static inline int
> >>> +rte_power_freq_max(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->freq_max(lcore_id); }
> >>>
> >>> /**
> >>> * Scale down the frequency of a specific lcore to the lowest according to the
> >>> * available frequencies.
> >>> * Review each environments specific documentation for usage..
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_min;
> >>> +static inline int
> >>> +rte_power_freq_min(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->freq_min(lcore_id); }
> >>>
> >>> /**
> >>> * Query the Turbo Boost status of a specific lcore.
> >>> * Review each environments specific documentation for usage..
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_turbo_status;
> >>> +static inline int
> >>> +rte_power_turbo_status(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->turbo_status(lcore_id); }
> >>>
> >>> /**
> >>> * Enable Turbo Boost for this lcore.
> >>> * Review each environments specific documentation for usage..
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
> >>> +static inline int
> >>> +rte_power_freq_enable_turbo(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>> +
> >>> + return ops->enable_turbo(lcore_id); }
> >>>
> >>> /**
> >>> * Disable Turbo Boost for this lcore.
> >>> * Review each environments specific documentation for usage..
> >>> */
> >>> -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
> >>> +static inline int
> >>> +rte_power_freq_disable_turbo(unsigned int lcore_id) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>>
> >>> -/**
> >>> - * Power capabilities summary.
> >>> - */
> >>> -struct rte_power_core_capabilities {
> >>> - union {
> >>> - uint64_t capabilities;
> >>> - struct {
> >>> - uint64_t turbo:1; /**< Turbo can be enabled. */
> >>> - uint64_t priority:1; /**< SST-BF high freq core */
> >>> - };
> >>> - };
> >>> -};
> >>> + return ops->disable_turbo(lcore_id); }
> >>>
> >>> /**
> >>> * Returns power capabilities for a specific lcore.
> >>> @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
> >>> * - 0 on success.
> >>> * - Negative on error.
> >>> */
> >>> -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
> >>> - struct rte_power_core_capabilities *caps);
> >>> +static inline int
> >>> +rte_power_get_capabilities(unsigned int lcore_id,
> >>> + struct rte_power_core_capabilities *caps) {
> >>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
> >>>
> >>> -extern rte_power_get_capabilities_t rte_power_get_capabilities;
> >>> + return ops->get_caps(lcore_id, caps); }
> >>>
> >>> #ifdef __cplusplus
> >>> }
> >>> diff --git a/lib/power/rte_power_core_ops.h
> >>> b/lib/power/rte_power_core_ops.h new file mode 100644 index
> >>> 0000000000..356a64df79
> >>> --- /dev/null
> >>> +++ b/lib/power/rte_power_core_ops.h
> >>> @@ -0,0 +1,208 @@
> >>> +/* SPDX-License-Identifier: BSD-3-Clause
> >>> + * Copyright(c) 2010-2014 Intel Corporation
> >>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> >>> + */
> >>> +
> >>> +#ifndef _RTE_POWER_CORE_OPS_H
> >>> +#define _RTE_POWER_CORE_OPS_H
> >>> +
> >> I suggest renaming the file to rte_power_cpufreq_api.h.
> >> That would make the role of this file clearer.
> >>> +__rte_internal
> >>> +int rte_power_register_ops(struct rte_power_core_ops *ops);
> >>> +
> >>> +/**
> >>> + * Macro to statically register the ops of a cpufreq driver.
> >>> + */
> >>> +#define RTE_POWER_REGISTER_OPS(ops) \
> >>> + RTE_INIT(power_hdlr_init_##ops) \
> >>> + { \
> >>> + rte_power_register_ops(&ops); \
> >>> + }
> >>> +
> >>> +/**
> >>> + * @internal Get the power ops struct from its index.
> >>> + *
> >>> + * @return
> >>> + * The pointer to the ops struct in the table if registered.
> >>> + */
> >>> +struct rte_power_core_ops *
> >>> +rte_power_get_core_ops(void);
> >>> +
> >>> +#ifdef __cplusplus
> >>> +}
> >>> +#endif
> >>> +
> >>> +#endif
> >>> diff --git a/lib/power/version.map b/lib/power/version.map index
> >>> c9a226614e..bd64e0828f 100644
> >>> --- a/lib/power/version.map
> >>> +++ b/lib/power/version.map
> >>> @@ -51,4 +51,18 @@ EXPERIMENTAL {
> >>> rte_power_set_uncore_env;
> >>> rte_power_uncore_freqs;
> >>> rte_power_unset_uncore_env;
> >>> + # added in 24.07
> >> Should this be 24.11 rather than 24.07?
> >>> + rte_power_logtype;
> >>> +};
> >>> +
> >>> +INTERNAL {
> >>> + global:
> >>> +
> >>> + rte_power_register_ops;
> >>> + cpufreq_check_scaling_driver;
> >>> + power_set_governor;
> >>> + open_core_sysfs_file;
> >>> + read_core_sysfs_u32;
> >>> + read_core_sysfs_s;
> >>> + write_core_sysfs_s;
> >>> };
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 1/4] power: refactor core power management library
2024-09-18 8:37 ` Tummala, Sivaprasad
@ 2024-09-19 3:37 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-09-19 3:37 UTC (permalink / raw)
To: Tummala, Sivaprasad
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau,
cristian.dumitrescu, jerinj, konstantin.ananyev, Yigit, Ferruh,
gakhil
On 2024/9/18 16:37, Tummala, Sivaprasad wrote:
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Friday, September 13, 2024 1:05 PM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>> radu.nicolau@intel.com; cristian.dumitrescu@intel.com; jerinj@marvell.com;
>> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
>> gakhil@marvell.com
>> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>>
>> On 2024/9/12 19:17, Tummala, Sivaprasad wrote:
>>>
>>> Hi Huisong,
>>>
>>> Please find my response inline.
>>>
>>>> -----Original Message-----
>>>> From: lihuisong (C) <lihuisong@huawei.com>
>>>> Sent: Tuesday, August 27, 2024 1:51 PM
>>>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>>>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>>>> radu.nicolau@intel.com; cristian.dumitrescu@intel.com;
>>>> jerinj@marvell.com; konstantin.ananyev@huawei.com; Yigit, Ferruh
>>>> <Ferruh.Yigit@amd.com>; gakhil@marvell.com
>>>> Subject: Re: [PATCH v2 1/4] power: refactor core power management
>>>> library
>>>>
>>>> Hi Sivaprasad,
>>>>
>>>> Some comments inline.
>>>>
>>>> /Huisong
>>>>
>>>> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
>>>>> This patch introduces a comprehensive refactor to the core power
>>>>> management library. The primary focus is on improving modularity and
>>>>> organization by relocating specific driver implementations from the
>>>>> 'lib/power' directory to dedicated directories within
>>>>> 'drivers/power/core/*'. The adjustment of meson.build files enables
>>>>> the selective activation of individual drivers.
>>>>> These changes contribute to a significant enhancement in code
>>>>> organization, providing a clearer structure for driver implementations.
>>>>> The refactor aims to improve overall code clarity and boost
>>>>> maintainability. Additionally, it establishes a foundation for
>>>>> future development, allowing for more focused work on individual
>>>>> drivers and seamless integration of forthcoming enhancements.
>>>>>
>>>>> v2:
>>>>> - added NULL check for global_core_ops in rte_power_get_core_ops
>>>>>
>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>>> ---
>>>>> drivers/meson.build | 1 +
>>>>> .../power/acpi/acpi_cpufreq.c | 22 +-
>>>>> .../power/acpi/acpi_cpufreq.h | 6 +-
>>>>> drivers/power/acpi/meson.build | 10 +
>>>>> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
>>>>> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
>>>>> drivers/power/amd_pstate/meson.build | 10 +
>>>>> .../power/cppc/cppc_cpufreq.c | 22 +-
>>>>> .../power/cppc/cppc_cpufreq.h | 8 +-
>>>>> drivers/power/cppc/meson.build | 10 +
>>>>> .../power/kvm_vm}/guest_channel.c | 0
>>>>> .../power/kvm_vm}/guest_channel.h | 0
>>>>> .../power/kvm_vm/kvm_vm.c | 22 +-
>>>>> .../power/kvm_vm/kvm_vm.h | 6 +-
>>>>> drivers/power/kvm_vm/meson.build | 16 +
>>>>> drivers/power/meson.build | 12 +
>>>>> drivers/power/pstate/meson.build | 10 +
>>>>> .../power/pstate/pstate_cpufreq.c | 22 +-
>>>>> .../power/pstate/pstate_cpufreq.h | 6 +-
>>>>> lib/power/meson.build | 7 +-
>>>>> lib/power/power_common.c | 2 +-
>>>>> lib/power/power_common.h | 16 +-
>>>>> lib/power/rte_power.c | 291 ++++++------------
>>>>> lib/power/rte_power.h | 139 ++++++---
>>>>> lib/power/rte_power_core_ops.h | 208 +++++++++++++
>>>>> lib/power/version.map | 14 +
>>>>> 26 files changed, 621 insertions(+), 271 deletions(-)
>>>>> rename lib/power/power_acpi_cpufreq.c =>
>>>>> drivers/power/acpi/acpi_cpufreq.c
>>>> (95%)
>>>>> rename lib/power/power_acpi_cpufreq.h =>
>>>>> drivers/power/acpi/acpi_cpufreq.h
>>>> (98%)
>>>>> create mode 100644 drivers/power/acpi/meson.build
>>>>> rename lib/power/power_amd_pstate_cpufreq.c =>
>>>> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>>>>> rename lib/power/power_amd_pstate_cpufreq.h =>
>>>> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
>>>>> create mode 100644 drivers/power/amd_pstate/meson.build
>>>>> rename lib/power/power_cppc_cpufreq.c =>
>>>>> drivers/power/cppc/cppc_cpufreq.c
>>>> (95%)
>>>>> rename lib/power/power_cppc_cpufreq.h =>
>>>>> drivers/power/cppc/cppc_cpufreq.h
>>>> (97%)
>>>>> create mode 100644 drivers/power/cppc/meson.build
>>>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>>>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>>>>> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c
>> (82%)
>>>>> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h
>> (98%)
>>>>> create mode 100644 drivers/power/kvm_vm/meson.build
>>>>> create mode 100644 drivers/power/meson.build
>>>>> create mode 100644 drivers/power/pstate/meson.build
>>>>> rename lib/power/power_pstate_cpufreq.c =>
>>>> drivers/power/pstate/pstate_cpufreq.c (96%)
>>>>> rename lib/power/power_pstate_cpufreq.h =>
>>>> drivers/power/pstate/pstate_cpufreq.h (98%)
>>>>> create mode 100644 lib/power/rte_power_core_ops.h
>>>> How about using the following directory structure?
>>>> *For power libs*
>>>> lib/power/power_common.*
>>>> lib/power/rte_power_pmd_mgmt.*
>>>> lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be
>>>> simplest for us, but I'm not sure if we can put the init of core, uncore
>>>> and pmd mgmt to rte_power_init.c in rte_power.c.)
>>>> lib/power/rte_power_uncore_freq_api.*
>>> Yes, renaming rte_power.c is definitely a possible incremental change that
>>> could be considered later.
>>> However, for the time being, our focus will be on refactoring the cpufreq
>>> drivers only.
>> rte_power.c only handles the initialization of the cpufreq driver. Since you are
>> reworking the core and uncore power libraries and rearranging the directories
>> under power, I think renaming this file belongs in this series.
>>>> *And has directories under drivers/power:*
>>>> 1> For core dvfs driver:
>>>> drivers/power/cpufreq/acpi_cpufreq.c
>>>> drivers/power/cpufreq/cppc_cpufreq.c
>>>> drivers/power/cpufreq/amd_pstate_cpufreq.c
>>>> drivers/power/cpufreq/intel_pstate_cpufreq.c
>>>> drivers/power/cpufreq/kvm_cpufreq.c
>>>> Each cpufreq driver has little code and is unlikely to grow much,
>>>> so it doesn't need a directory of its own.
>>>>
>>>> 2> For uncore dvfs driver:
>>>> drivers/power/uncorefreq/intel_uncore.*
>>>>> diff --git a/drivers/meson.build b/drivers/meson.build index
>>>>> 66931d4241..9d77e0deab 100644
>>>>> --- a/drivers/meson.build
>>>>> +++ b/drivers/meson.build
>>>>> @@ -29,6 +29,7 @@ subdirs = [
>>>>> 'event', # depends on common, bus, mempool and net.
>>>>> 'baseband', # depends on common and bus.
>>>>> 'gpu', # depends on common and bus.
>>>>> + 'power', # depends on common (in future).
>>>>> ]
>>>>>
>>>>> if meson.is_cross_build()
>>>>> diff --git a/lib/power/power_acpi_cpufreq.c
>>>>> b/drivers/power/acpi/acpi_cpufreq.c
>>>>> similarity index 95%
>>>>> rename from lib/power/power_acpi_cpufreq.c rename to
>>>>> drivers/power/acpi/acpi_cpufreq.c
>>>> I don't suggest creating one directory for each cpufreq driver,
>>>> because the pstate drivers also comply with the ACPI spec, right?
>>>> In addition, each cpufreq driver has little code.
>>>> Having just one file per directory is not good.
>>> One of our objectives for the refactoring is to selectively disable non-essential
>>> drivers using Meson build options.
>>> However, by rearranging the driver structure, we risk disrupting this capability.
>> I understand your purpose.
>> The cpufreq library already has the feature and interface to detect which driver to use, right?
>> So it is not necessary for the cpufreq library to introduce Meson build options, which
>> would probably make it more complicated.
> In Meson, you can reduce code size by disabling specific drivers or components through build options,
> allowing you to exclude unnecessary features. At runtime, the library will automatically detect the available driver,
> and if it's not present in the build, initialization will fail.
I still cannot understand why you want to do this.
Reducing code size is not a good reason. If all libraries or drivers
did it this way, we would need to add many more meson options.
It would only be justified if multiple drivers can be used on one
platform and the automatic detection mechanism is not enough to determine
which driver to use.
Of course, you can also ask for the opinions of other reviewers.
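
For reference, a minimal sketch of the behaviour being discussed: a driver
that is left out of the build is never registered, so selecting it fails at
runtime. This is not code from the patch. struct core_ops, register_ops(),
select_driver() and the BUILD_CPPC_DRIVER macro are hypothetical names used
only for illustration, and the constructor attribute is a GCC/Clang extension
standing in for RTE_INIT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	struct core_ops *next;
};

static struct core_ops *ops_head;   /* drivers present in this build */
static struct core_ops *active_ops; /* driver selected at runtime */

static void register_ops(struct core_ops *ops)
{
	assert(ops != NULL && ops->name != NULL && ops->init != NULL);
	ops->next = ops_head;
	ops_head = ops;
}

/* Same idea as RTE_POWER_REGISTER_OPS: register before main() runs.
 * __attribute__((constructor)) is a GCC/Clang extension. */
#define REGISTER_OPS(ops) \
	static void __attribute__((constructor)) register_##ops(void) \
	{ \
		register_ops(&ops); \
	}

/* Stub driver that is always compiled in and always initialises. */
static int acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static struct core_ops acpi_ops = { .name = "acpi", .init = acpi_init };
REGISTER_OPS(acpi_ops)

/* A driver switched off at build time is simply never compiled in.
 * BUILD_CPPC_DRIVER is a made-up macro standing for a Meson option. */
#ifdef BUILD_CPPC_DRIVER
static int cppc_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static struct core_ops cppc_ops = { .name = "cppc", .init = cppc_init };
REGISTER_OPS(cppc_ops)
#endif

/* Explicit selection: fails when the named driver is not in the build
 * or when its init callback reports an error. */
static int select_driver(const char *name, unsigned int lcore_id)
{
	struct core_ops *ops;

	for (ops = ops_head; ops != NULL; ops = ops->next) {
		if (strcmp(ops->name, name) == 0 && ops->init(lcore_id) == 0) {
			active_ops = ops;
			return 0;
		}
	}
	return -1;
}
```

With BUILD_CPPC_DRIVER undefined, select_driver("cppc", 0) returns -1 while
select_driver("acpi", 0) succeeds, which is the runtime failure mode described
above for a driver that was disabled at build time.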
> We're not introducing any new complexities; rather, we aim to ensure that the drivers in drivers/power/*
> are consistent with the other drivers.
Which other drivers are you referring to?
Anyway, we need to make the directory hierarchy under drivers/power/ and
lib/power/ clear.
>>>>> index 81996e1c13..8637c69703 100644
>>>>> --- a/lib/power/power_acpi_cpufreq.c
>>>>> +++ b/drivers/power/acpi/acpi_cpufreq.c
>>>>> @@ -10,7 +10,7 @@
>>>>> #include <rte_stdatomic.h>
>>>>> #include <rte_string_fns.h>
>>>>>
>>>>> -#include "power_acpi_cpufreq.h"
>>>>> +#include "acpi_cpufreq.h"
>>>>> #include "power_common.h"
>>>>>
>>>> <...>
>>>>> +if not is_linux
>>>>> + build = false
>>>>> + reason = 'only supported on Linux'
>>>>> +endif
>>>>> +sources = files('pstate_cpufreq.c')
>>>>> +
>>>>> +deps += ['power']
>>>>> diff --git a/lib/power/power_pstate_cpufreq.c
>>>>> b/drivers/power/pstate/pstate_cpufreq.c
>>>>> similarity index 96%
>>>>> rename from lib/power/power_pstate_cpufreq.c rename to
>>>>> drivers/power/pstate/pstate_cpufreq.c
>>>> pstate_cpufreq.c is actually the intel_pstate cpufreq driver, right?
>>>> So how about renaming this file to intel_pstate_cpufreq.c?
>>> Yes, will fix this in next version.
>>>>> index 2343121621..c32b1adabc 100644
>>>>> --- a/lib/power/power_pstate_cpufreq.c
>>>>> +++ b/drivers/power/pstate/pstate_cpufreq.c
>>>>> @@ -15,7 +15,7 @@
>>>>> #include <rte_stdatomic.h>
>>>>>
>>>>> #include "rte_power_pmd_mgmt.h"
>>>>> -#include "power_pstate_cpufreq.h"
>>>>> +#include "pstate_cpufreq.h"
>>>>> #include "power_common.h"
>>>>>
>>>>> /* macros used for rounding frequency to nearest 100000 */ @@
>>>>> -888,3
>>>>> +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
>>>>>
>>>>> return 0;
>>>>> }
>>>>> +
>>>> <...>
>>>>> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
>>>>> index 590986d5ef..6c06411e8b 100644
>>>>> --- a/lib/power/power_common.c
>>>>> +++ b/lib/power/power_common.c
>>>>> @@ -12,7 +12,7 @@
>>>>>
>>>>> #include "power_common.h"
>>>>>
>>>>> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
>>>>> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>>>>>
>>>>> #define POWER_SYSFILE_SCALING_DRIVER \
>>>>> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
>>>>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
>>>>> index
>>>>> 83f742f42a..767686ee12 100644
>>>>> --- a/lib/power/power_common.h
>>>>> +++ b/lib/power/power_common.h
>>>>> @@ -6,12 +6,13 @@
>>>>> #define _POWER_COMMON_H_
>>>>>
>>>>> #include <rte_common.h>
>>>>> +#include <rte_compat.h>
>>>>> #include <rte_log.h>
>>>>>
>>>>> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>>>>>
>>>>> -extern int power_logtype;
>>>>> -#define RTE_LOGTYPE_POWER power_logtype
>>>>> +extern int rte_power_logtype;
>>>>> +#define RTE_LOGTYPE_POWER rte_power_logtype
>>>>> #define POWER_LOG(level, ...) \
>>>>> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>>>>>
>>>>> @@ -23,13 +24,24 @@ extern int power_logtype;
>>>>> #endif
>>>>>
>>>>> /* check if scaling driver matches one we want */
>>>>> +__rte_internal
>>>>> int cpufreq_check_scaling_driver(const char *driver);
>>>>> +
>>>>> +__rte_internal
>>>>> int power_set_governor(unsigned int lcore_id, const char *new_governor,
>>>>> char *orig_governor, size_t orig_governor_len);
>>>> I suggest moving cpufreq interfaces like this one to the
>>>> rte_power_cpufreq_api.* files I proposed above.
>>> This is an internal API and isn’t intended for direct use by applications.
>>> By moving it to rte_power_*, we risk exposing it inadvertently.
>> We don't expose these to applications; applications do not include this header file.
>> power_set_governor() and cpufreq_check_scaling_driver() are only used by the cpufreq
>> drivers, so they should only be visible to the cpufreq lib or module, right?
>> But if these interfaces are in power_common.h, the pmd_mgmt and uncore drivers also
>> include this header file and can see them. This is not good.
>> AFAICS, power_common.h should contain only the interfaces that are used
>> by all power libs or sub-modules, like cpufreq, uncore, pmd_mgmt and so on.
> OK, will move these internal APIs from power_common.h to a separate header file.
>>>> The interfaces in power_common.* can be used by all power modules, like
>>>> core/uncore/pmd mgmt.
>>>>> +
>>>>> +__rte_internal
>>>>> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
>>>>> __rte_format_printf(3, 4);
>>>>> +
>>>>> +__rte_internal
>>>>> int read_core_sysfs_u32(FILE *f, uint32_t *val);
>>>>> +
>>>>> +__rte_internal
>>>>> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
>>>>> +
>>>>> +__rte_internal
>>>>> int write_core_sysfs_s(FILE *f, const char *str);
>>>>>
>>>>> #endif /* _POWER_COMMON_H_ */
>>>>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>>>> The name of the rte_power.c file is inappropriate now. The content of
>>>> this file is just for cpufreq, right?
>>>> So I suggest renaming this file to rte_power_cpufreq_api.c.
>>> Yes, renaming rte_power.c to rte_power_cpufreq.c is definitely a
>>> possible incremental change and will fix this as a separate patch.
>>>
>>>>> index 36c3f3da98..2bf6d40517 100644
>>>>> --- a/lib/power/rte_power.c
>>>>> +++ b/lib/power/rte_power.c
>>>>> @@ -8,153 +8,86 @@
>>>>> #include <rte_spinlock.h>
>>>>>
>>>>> #include "rte_power.h"
>>>>> -#include "power_acpi_cpufreq.h"
>>>>> -#include "power_cppc_cpufreq.h"
>>>>> #include "power_common.h"
>>>>> -#include "power_kvm_vm.h"
>>>>> -#include "power_pstate_cpufreq.h"
>>>>> -#include "power_amd_pstate_cpufreq.h"
>>>>>
>>>>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>>>> +static enum power_management_env global_default_env =
>>>> PM_ENV_NOT_SET;
>>>>> +static struct rte_power_core_ops *global_power_core_ops;
>>>>>
>>>>> static rte_spinlock_t global_env_cfg_lock =
>>>>> RTE_SPINLOCK_INITIALIZER;
>>>>> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
>>>>> + TAILQ_HEAD_INITIALIZER(core_ops_list);
>>>>>
>>>>> -/* function pointers */
>>>>> -rte_power_freqs_t rte_power_freqs = NULL;
>>>>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>>>>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>>>>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>>>>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>>>>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>>>>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>>>>> -rte_power_freq_change_t rte_power_turbo_status;
>>>>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>>>>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>>>>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>>>>> -
>>>>> -static void
>>>>> -reset_power_function_ptrs(void)
>>>>> +
>>>>> +const char *power_env_str[] = {
>>>>> + "not set",
>>>>> + "acpi",
>>>>> + "kvm-vm",
>>>>> + "pstate",
>>>>> + "cppc",
>>>>> + "amd-pstate"
>>>>> +};
>>>>> +
>>>>> +/* register the ops struct in rte_power_core_ops, return 0 on success. */
>>>>> +int
>>>>> +rte_power_register_ops(struct rte_power_core_ops *driver_ops)
>>>>> {
>>>>> - rte_power_freqs = NULL;
>>>>> - rte_power_get_freq = NULL;
>>>>> - rte_power_set_freq = NULL;
>>>>> - rte_power_freq_up = NULL;
>>>>> - rte_power_freq_down = NULL;
>>>>> - rte_power_freq_max = NULL;
>>>>> - rte_power_freq_min = NULL;
>>>>> - rte_power_turbo_status = NULL;
>>>>> - rte_power_freq_enable_turbo = NULL;
>>>>> - rte_power_freq_disable_turbo = NULL;
>>>>> - rte_power_get_capabilities = NULL;
>>>>> + if (!driver_ops->init || !driver_ops->exit ||
>>>>> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
>>>>> + !driver_ops->get_freq || !driver_ops->set_freq ||
>>>>> + !driver_ops->freq_up || !driver_ops->freq_down ||
>>>>> + !driver_ops->freq_max || !driver_ops->freq_min ||
>>>>> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
>>>>> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
>>>>> + POWER_LOG(ERR, "Missing callbacks while registering
>>>>> + power ops");
>>>> turbo_status(), enable_turbo() and disable_turbo() are not necessary, right?
>>> No, these are required to get the current status, unlike the capability API
>>> (get_caps()).
>> OK.
>>>> These depend on the capabilities from get_caps().
>>>>> + return -EINVAL;
>>>>> + }
>>>>> +
>>>>> + TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
>>>>> +
>>>>> + return 0;
>>>>> }
>>>>>
>>>>> int
>>>>> rte_power_check_env_supported(enum power_management_env env)
>>>>> {
>>>>> - switch (env) {
>>>>> - case PM_ENV_ACPI_CPUFREQ:
>>>>> - return power_acpi_cpufreq_check_supported();
>>>>> - case PM_ENV_PSTATE_CPUFREQ:
>>>>> - return power_pstate_cpufreq_check_supported();
>>>>> - case PM_ENV_KVM_VM:
>>>>> - return power_kvm_vm_check_supported();
>>>>> - case PM_ENV_CPPC_CPUFREQ:
>>>>> - return power_cppc_cpufreq_check_supported();
>>>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>>>> - return power_amd_pstate_cpufreq_check_supported();
>>>>> - default:
>>>>> - rte_errno = EINVAL;
>>>>> - return -1;
>>>>> - }
>>>>> + struct rte_power_core_ops *ops;
>>>>> +
>>>>> + if (env >= RTE_DIM(power_env_str))
>>>>> + return 0;
>>>>> +
>>>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>>>> + if (strncmp(ops->name, power_env_str[env],
>>>>> + RTE_POWER_DRIVER_NAMESZ) == 0)
>>>>> + return ops->check_env_support();
>>>>> +
>>>>> + return 0;
>>>>> }
>>>>>
>>>>> int
>>>>> rte_power_set_env(enum power_management_env env)
>>>>> {
>>>>> + struct rte_power_core_ops *ops;
>>>>> + int ret = -1;
>>>>> +
>>>>> rte_spinlock_lock(&global_env_cfg_lock);
>>>>>
>>>>> if (global_default_env != PM_ENV_NOT_SET) {
>>>>> POWER_LOG(ERR, "Power Management Environment already
>> set.");
>>>>> - rte_spinlock_unlock(&global_env_cfg_lock);
>>>>> - return -1;
>>>>> - }
>>>>> -
>>>> <...>
>>>>> - if (ret == 0)
>>>>> - global_default_env = env;
>>>>> - else {
>>>>> - global_default_env = PM_ENV_NOT_SET;
>>>>> - reset_power_function_ptrs();
>>>>> - }
>>>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>>>> + if (strncmp(ops->name, power_env_str[env],
>>>>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
>>>>> + global_power_core_ops = ops;
>>>>> + global_default_env = env;
>>>>> + ret = 0;
>>>>> + goto out;
>>>>> + }
>>>>> + POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
>>>>> + env);
>>>>>
>>>>> +out:
>>>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>>>> return ret;
>>>>> }
>>>>> @@ -164,94 +97,66 @@ rte_power_unset_env(void)
>>>>> {
>>>>> rte_spinlock_lock(&global_env_cfg_lock);
>>>>> global_default_env = PM_ENV_NOT_SET;
>>>>> - reset_power_function_ptrs();
>>>>> + global_power_core_ops = NULL;
>>>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>>>> }
>>>>>
>>>>> enum power_management_env
>>>>> -rte_power_get_env(void) {
>>>>> +rte_power_get_env(void)
>>>>> +{
>>>>> return global_default_env;
>>>>> }
>>>>>
>>>>> -int
>>>>> -rte_power_init(unsigned int lcore_id)
>>>>> +struct rte_power_core_ops *
>>>>> +rte_power_get_core_ops(void)
>>>>> {
>>>>> - int ret = -1;
>>>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>>>>
>>>>> - switch (global_default_env) {
>>>>> - case PM_ENV_ACPI_CPUFREQ:
>>>>> - return power_acpi_cpufreq_init(lcore_id);
>>>>> - case PM_ENV_KVM_VM:
>>>>> - return power_kvm_vm_init(lcore_id);
>>>>> - case PM_ENV_PSTATE_CPUFREQ:
>>>>> - return power_pstate_cpufreq_init(lcore_id);
>>>>> - case PM_ENV_CPPC_CPUFREQ:
>>>>> - return power_cppc_cpufreq_init(lcore_id);
>>>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>>>> - return power_amd_pstate_cpufreq_init(lcore_id);
>>>>> - default:
>>>>> - POWER_LOG(INFO, "Env isn't set yet!");
>>>>> - }
>>>>> + return global_power_core_ops;
>>>>> +}
>>>>>
>>>>> - /* Auto detect Environment */
>>>>> - POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power
>>>> management...");
>>>>> - ret = power_acpi_cpufreq_init(lcore_id);
>>>>> - if (ret == 0) {
>>>>> - rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
>>>>> - goto out;
>>>>> - }
>>>>> +int
>>>>> +rte_power_init(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops;
>>>>> + uint8_t env;
>>>>>
>>>>> - POWER_LOG(INFO, "Attempting to initialise PSTAT power
>> management...");
>>>>> - ret = power_pstate_cpufreq_init(lcore_id);
>>>>> - if (ret == 0) {
>>>>> - rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
>>>>> - goto out;
>>>>> - }
>>>>> + if (global_default_env != PM_ENV_NOT_SET)
>>>>> + return global_power_core_ops->init(lcore_id);
>>>>>
>>>>> - POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power
>>>> management...");
>>>>> - ret = power_amd_pstate_cpufreq_init(lcore_id);
>>>>> - if (ret == 0) {
>>>>> - rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
>>>>> - goto out;
>>>>> - }
>>>>> + POWER_LOG(INFO, "Env isn't set yet!");
>>>> remove this log?
>>>>> - POWER_LOG(INFO, "Attempting to initialise CPPC power
>> management...");
>>>>> - ret = power_cppc_cpufreq_init(lcore_id);
>>>>> - if (ret == 0) {
>>>>> - rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
>>>>> - goto out;
>>>>> - }
>>>>> + /* Auto detect Environment */
>>>>> + RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
>>>>> + if (ops) {
>>>>> + POWER_LOG(INFO,
>>>>> + "Attempting to initialise %s cpufreq power management...",
>>>>> + ops->name);
>>>>> + if (ops->init(lcore_id) == 0) {
>>>>> + for (env = 0; env < RTE_DIM(power_env_str); env++)
>>>>> + if (strncmp(ops->name, power_env_str[env],
>>>>> + RTE_POWER_DRIVER_NAMESZ) == 0) {
>>>>> + rte_power_set_env(env);
>>>>> + return 0;
>>>>> + }
>>>>> + }
>>>>> + }
>>>> Can we change the logic of rte_power_set_env()? For example:
>>>> RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
>>>> for (env = 0; env < RTE_DIM(power_env_str); env++) {
>>>> if (strncmp(ops->name, power_env_str[env],
>>>> RTE_POWER_DRIVER_NAMESZ) == 0 &&
>>>> ops->init(lcore_id) == 0) {
>>>> global_power_core_ops = ops;
>>>> global_default_env = env;
>>>> }
>>>> }
>>>> }
>>>> That code is easier to follow.
>>> Yes, will fix in next version.
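
The loop suggested above can be sketched in a self-contained form as follows.
This is only an illustration under stated assumptions: struct core_ops,
env_str, DRIVER_NAMESZ, auto_detect() and the two stub drivers are
hypothetical stand-ins for the patch's rte_power_core_ops, power_env_str and
RTE_POWER_DRIVER_NAMESZ. The early return on the first successful init() is
an addition that the quoted snippet does not contain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DRIVER_NAMESZ 24 /* hypothetical stand-in for RTE_POWER_DRIVER_NAMESZ */
#define DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Index 0 means "not set"; the order mirrors power_env_str in the patch. */
static const char *const env_str[] = {
	"not set", "acpi", "kvm-vm", "pstate", "cppc", "amd-pstate"
};

/* Trimmed-down, hypothetical stand-in for struct rte_power_core_ops. */
struct core_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	struct core_ops *next;
};

/* Stub drivers: "acpi" does not match this platform, "cppc" does. */
static int init_fail(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }
static struct core_ops cppc_ops = { "cppc", init_ok, NULL };
static struct core_ops acpi_ops = { "acpi", init_fail, &cppc_ops };

static struct core_ops *ops_head = &acpi_ops; /* drivers in probe order */
static struct core_ops *active_ops; /* role of global_power_core_ops */
static size_t active_env;           /* role of global_default_env */

/* Probe each registered driver; stop at the first one that initialises. */
static int auto_detect(unsigned int lcore_id)
{
	struct core_ops *ops;
	size_t env;

	for (ops = ops_head; ops != NULL; ops = ops->next) {
		/* Start at 1 so that "not set" is never matched. */
		for (env = 1; env < DIM(env_str); env++) {
			if (strncmp(ops->name, env_str[env], DRIVER_NAMESZ) == 0 &&
			    ops->init(lcore_id) == 0) {
				active_ops = ops;
				active_env = env;
				return 0; /* early exit: not in the quoted snippet */
			}
		}
	}
	return -1;
}
```

Because the name comparison comes first in the condition, init() is only
called for a driver whose name maps to a known environment. Without the early
return, a later driver whose init() also succeeds would overwrite the
selection made for an earlier one.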
>>>
>>>>> +
>>>>> + POWER_LOG(ERR,
>>>>> + "Unable to set Power Management Environment for lcore %u",
>>>>> + lcore_id);
>>>>>
>>>>> - POWER_LOG(INFO, "Attempting to initialise VM power management...");
>>>>> - ret = power_kvm_vm_init(lcore_id);
>>>>> - if (ret == 0) {
>>>>> - rte_power_set_env(PM_ENV_KVM_VM);
>>>>> - goto out;
>>>>> - }
>>>>> - POWER_LOG(ERR, "Unable to set Power Management Environment for
>> lcore
>>>> "
>>>>> - "%u", lcore_id);
>>>>> -out:
>>>>> - return ret;
>>>>> + return -1;
>>>>> }
>>>>>
>>>>> int
>>>>> rte_power_exit(unsigned int lcore_id)
>>>>> {
>>>>> - switch (global_default_env) {
>>>>> - case PM_ENV_ACPI_CPUFREQ:
>>>>> - return power_acpi_cpufreq_exit(lcore_id);
>>>>> - case PM_ENV_KVM_VM:
>>>>> - return power_kvm_vm_exit(lcore_id);
>>>>> - case PM_ENV_PSTATE_CPUFREQ:
>>>>> - return power_pstate_cpufreq_exit(lcore_id);
>>>>> - case PM_ENV_CPPC_CPUFREQ:
>>>>> - return power_cppc_cpufreq_exit(lcore_id);
>>>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>>>> - return power_amd_pstate_cpufreq_exit(lcore_id);
>>>>> - default:
>>>>> - POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
>>>>> + if (global_default_env != PM_ENV_NOT_SET)
>>>>> + return global_power_core_ops->exit(lcore_id);
>>>>>
>>>>> - }
>>>>> - return -1;
>>>>> + POWER_LOG(ERR,
>>>>> + "Environment has not been set, unable to exit gracefully");
>>>>>
>>>>> + return -1;
>>>>> }
>>>>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h index
>>>>> 4fa4afe399..5e4aacf08b 100644
>>>>> --- a/lib/power/rte_power.h
>>>>> +++ b/lib/power/rte_power.h
>>>>> @@ -1,5 +1,6 @@
>>>>> /* SPDX-License-Identifier: BSD-3-Clause
>>>>> * Copyright(c) 2010-2014 Intel Corporation
>>>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>>>> */
>>>>>
>>>>> #ifndef _RTE_POWER_H
>>>>> @@ -14,14 +15,21 @@
>>>>> #include <rte_log.h>
>>>>> #include <rte_power_guest_channel.h>
>>>>>
>>>>> +#include "rte_power_core_ops.h"
>>>>> +
>>>>> #ifdef __cplusplus
>>>>> extern "C" {
>>>>> #endif
>>>>>
>>>>> /* Power Management Environment State */
>>>>> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
>>>>> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
>>>>> - PM_ENV_AMD_PSTATE_CPUFREQ};
>>>>> +enum power_management_env {
>>>>> + PM_ENV_NOT_SET = 0,
>>>>> + PM_ENV_ACPI_CPUFREQ,
>>>>> + PM_ENV_KVM_VM,
>>>>> + PM_ENV_PSTATE_CPUFREQ,
>>>>> + PM_ENV_CPPC_CPUFREQ,
>>>>> + PM_ENV_AMD_PSTATE_CPUFREQ
>>>>> +};
>>>>>
>>>>> /**
>>>>> * Check if a specific power management environment type is
>>>>> supported on a @@ -66,6 +74,15 @@ void rte_power_unset_env(void);
>>>>> */
>>>>> enum power_management_env rte_power_get_env(void);
>>>> I'd prefer that the user not need to know which cpufreq driver is in use; that is friendlier to the user.
>>>>
>>>> So we should reconsider whether this API is necessary.
>>> For any API changes, could we handle this as a separate RFC for discussion?
>>> It’s important that these changes are not included within the scope of this patch.
>> Agreed.
>> Can you post a separate RFC to discuss this improvement later?
>>>>> +/**
>>>>> + * @internal Get the power ops struct from its index.
>>>>> + *
>>>>> + * @return
>>>>> + * The pointer to the ops struct in the table if registered.
>>>>> + */
>>>>> +struct rte_power_core_ops *
>>>>> +rte_power_get_core_ops(void);
>>>>> +
>>>>> /**
>>>>> * Initialize power management for a specific lcore. If rte_power_set_env()
>> has
>>>>> * not been called then an auto-detect of the environment will
>>>>> start and @@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
>>>>> * @return
>>>>> * The number of available frequencies.
>>>>> */
>>>>> -typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
>>>>> - uint32_t num);
>>>>> +static inline uint32_t
>>>>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>>
>>>>> -extern rte_power_freqs_t rte_power_freqs;
>>>>> + return ops->get_avail_freqs(lcore_id, freqs, n); }
>>>>>
>>>>> /**
>>>>> * Return the current index of available frequencies of a specific lcore.
>>>>> @@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
>>>>> * @return
>>>>> * The current index of available frequencies.
>>>>> */
>>>>> -typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
>>>>> +static inline uint32_t
>>>>> +rte_power_get_freq(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>>
>>>>> -extern rte_power_get_freq_t rte_power_get_freq;
>>>>> + return ops->get_freq(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Set the new frequency for a specific lcore by indicating the
>>>>> index of @@ -144,82 +168,101 @@ extern rte_power_get_freq_t
>>>> rte_power_get_freq;
>>>>> * - 0 on success without frequency changed.
>>>>> * - Negative on error.
>>>>> */
>>>>> -typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t
>>>>> index);
>>>>> -
>>>>> -extern rte_power_set_freq_t rte_power_set_freq;
>>>>> +static inline uint32_t
>>>>> +rte_power_set_freq(unsigned int lcore_id, uint32_t index) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>>
>>>>> -/**
>>>>> - * Function pointer definition for generic frequency change
>>>>> functions. Review
>>>>> - * each environments specific documentation for usage.
>>>>> - *
>>>>> - * @param lcore_id
>>>>> - * lcore id.
>>>>> - *
>>>>> - * @return
>>>>> - * - 1 on success with frequency changed.
>>>>> - * - 0 on success without frequency changed.
>>>>> - * - Negative on error.
>>>>> - */
>>>>> -typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
>>>>> + return ops->set_freq(lcore_id, index); }
>>>>>
>>>>> /**
>>>>> * Scale up the frequency of a specific lcore according to the available
>>>>> * frequencies.
>>>>> * Review each environments specific documentation for usage.
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_up;
>>>>> +static inline int
>>>>> +rte_power_freq_up(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->freq_up(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Scale down the frequency of a specific lcore according to the available
>>>>> * frequencies.
>>>>> * Review each environments specific documentation for usage.
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_down;
>>>>> +static inline int
>>>>> +rte_power_freq_down(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->freq_down(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Scale up the frequency of a specific lcore to the highest according to the
>>>>> * available frequencies.
>>>>> * Review each environments specific documentation for usage.
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_max;
>>>>> +static inline int
>>>>> +rte_power_freq_max(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->freq_max(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Scale down the frequency of a specific lcore to the lowest according to the
>>>>> * available frequencies.
>>>>> * Review each environments specific documentation for usage..
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_min;
>>>>> +static inline int
>>>>> +rte_power_freq_min(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->freq_min(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Query the Turbo Boost status of a specific lcore.
>>>>> * Review each environments specific documentation for usage..
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_turbo_status;
>>>>> +static inline int
>>>>> +rte_power_turbo_status(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->turbo_status(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Enable Turbo Boost for this lcore.
>>>>> * Review each environments specific documentation for usage..
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_enable_turbo;
>>>>> +static inline int
>>>>> +rte_power_freq_enable_turbo(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>> +
>>>>> + return ops->enable_turbo(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Disable Turbo Boost for this lcore.
>>>>> * Review each environments specific documentation for usage..
>>>>> */
>>>>> -extern rte_power_freq_change_t rte_power_freq_disable_turbo;
>>>>> +static inline int
>>>>> +rte_power_freq_disable_turbo(unsigned int lcore_id) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>>
>>>>> -/**
>>>>> - * Power capabilities summary.
>>>>> - */
>>>>> -struct rte_power_core_capabilities {
>>>>> - union {
>>>>> - uint64_t capabilities;
>>>>> - struct {
>>>>> - uint64_t turbo:1; /**< Turbo can be enabled. */
>>>>> - uint64_t priority:1; /**< SST-BF high freq core */
>>>>> - };
>>>>> - };
>>>>> -};
>>>>> + return ops->disable_turbo(lcore_id); }
>>>>>
>>>>> /**
>>>>> * Returns power capabilities for a specific lcore.
>>>>> @@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
>>>>> * - 0 on success.
>>>>> * - Negative on error.
>>>>> */
>>>>> -typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
>>>>> - struct rte_power_core_capabilities *caps);
>>>>> +static inline int
>>>>> +rte_power_get_capabilities(unsigned int lcore_id,
>>>>> + struct rte_power_core_capabilities *caps) {
>>>>> + struct rte_power_core_ops *ops = rte_power_get_core_ops();
>>>>>
>>>>> -extern rte_power_get_capabilities_t rte_power_get_capabilities;
>>>>> + return ops->get_caps(lcore_id, caps); }
>>>>>
>>>>> #ifdef __cplusplus
>>>>> }
>>>>> diff --git a/lib/power/rte_power_core_ops.h
>>>>> b/lib/power/rte_power_core_ops.h new file mode 100644 index
>>>>> 0000000000..356a64df79
>>>>> --- /dev/null
>>>>> +++ b/lib/power/rte_power_core_ops.h
>>>>> @@ -0,0 +1,208 @@
>>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>>> + * Copyright(c) 2010-2014 Intel Corporation
>>>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>>>> + */
>>>>> +
>>>>> +#ifndef _RTE_POWER_CORE_OPS_H
>>>>> +#define _RTE_POWER_CORE_OPS_H
>>>>> +
>>>> I suggest renaming the file to rte_power_cpufreq_api.h.
>>>> That would make the role of this file clearer.
>>>>> +__rte_internal
>>>>> +int rte_power_register_ops(struct rte_power_core_ops *ops);
>>>>> +
>>>>> +/**
>>>>> + * Macro to statically register the ops of a cpufreq driver.
>>>>> + */
>>>>> +#define RTE_POWER_REGISTER_OPS(ops) \
>>>>> + RTE_INIT(power_hdlr_init_##ops) \
>>>>> + { \
>>>>> + rte_power_register_ops(&ops); \
>>>>> + }
>>>>> +
>>>>> +/**
>>>>> + * @internal Get the power ops struct from its index.
>>>>> + *
>>>>> + * @return
>>>>> + * The pointer to the ops struct in the table if registered.
>>>>> + */
>>>>> +struct rte_power_core_ops *
>>>>> +rte_power_get_core_ops(void);
>>>>> +
>>>>> +#ifdef __cplusplus
>>>>> +}
>>>>> +#endif
>>>>> +
>>>>> +#endif
>>>>> diff --git a/lib/power/version.map b/lib/power/version.map index
>>>>> c9a226614e..bd64e0828f 100644
>>>>> --- a/lib/power/version.map
>>>>> +++ b/lib/power/version.map
>>>>> @@ -51,4 +51,18 @@ EXPERIMENTAL {
>>>>> rte_power_set_uncore_env;
>>>>> rte_power_uncore_freqs;
>>>>> rte_power_unset_uncore_env;
>>>>> + # added in 24.07
>>>> 24.07-->24.11?
>>>>> + rte_power_logtype;
>>>>> +};
>>>>> +
>>>>> +INTERNAL {
>>>>> + global:
>>>>> +
>>>>> + rte_power_register_ops;
>>>>> + cpufreq_check_scaling_driver;
>>>>> + power_set_governor;
>>>>> + open_core_sysfs_file;
>>>>> + read_core_sysfs_u32;
>>>>> + read_core_sysfs_s;
>>>>> + write_core_sysfs_s;
>>>>> };
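The header changes quoted in this message replace the exported function pointers with inline wrappers that dispatch through one ops table. A minimal sketch of that pattern follows, assuming a NULL guard in the wrapper. All names here are hypothetical and the struct is cut down to a single callback. The sketch returns int so that the documented "Negative on error" result is representable; the quoted rte_power_set_freq() wrapper is declared as returning uint32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table reduced to the one callback used below. */
struct core_ops {
	int (*set_freq)(unsigned int lcore_id, uint32_t index);
};

/* Set once a driver has been selected; NULL until then. */
static struct core_ops *global_core_ops;

static struct core_ops *
get_core_ops(void)
{
	return global_core_ops;
}

/* Inline wrapper: dispatch through the table, fail cleanly if none is set. */
static inline int
power_set_freq(unsigned int lcore_id, uint32_t index)
{
	struct core_ops *ops = get_core_ops();

	if (ops == NULL || ops->set_freq == NULL)
		return -1;
	return ops->set_freq(lcore_id, index);
}

/* Stub driver callback: accepts indices 0..3, rejects anything else. */
static int
stub_set_freq(unsigned int lcore_id, uint32_t index)
{
	(void)lcore_id;
	return index < 4 ? 1 : -1;
}

static struct core_ops stub_ops = { .set_freq = stub_set_freq };
```

Calling power_set_freq() before any driver is selected returns -1 instead of dereferencing a NULL pointer. Whether the guard belongs in each wrapper or in the getter is a design choice the thread leaves open.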
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 0/4] power: refactor power management library
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-08-26 13:06 ` [PATCH v2 0/4] power: refactor power management library Sivaprasad Tummala
@ 2024-10-07 18:01 ` Stephen Hemminger
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
6 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-07 18:01 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, dev
On Mon, 26 Aug 2024 13:06:45 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless integration
> of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
>
> Sivaprasad Tummala (4):
> power: refactor core power management library
> power: refactor uncore power management library
> test/power: removed function pointer validations
> power/amd_uncore: uncore power management support for AMD EPYC
> processors
Build fails and lots of comments. Please redo.
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v2 1/4] power: refactor core power management library
2024-08-26 15:26 ` Stephen Hemminger
@ 2024-10-07 19:25 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-07 19:25 UTC (permalink / raw)
To: Stephen Hemminger
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh, konstantin.ananyev, dev
Hi Stephen,
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, August 26, 2024 8:56 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com;
> dev@dpdk.org
> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>
>
> On Mon, 26 Aug 2024 13:06:46 +0000
> Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
>
> > +static struct rte_power_core_ops acpi_ops = {
> > + .name = "acpi",
> > + .init = power_acpi_cpufreq_init,
> > + .exit = power_acpi_cpufreq_exit,
> > + .check_env_support = power_acpi_cpufreq_check_supported,
> > + .get_avail_freqs = power_acpi_cpufreq_freqs,
> > + .get_freq = power_acpi_cpufreq_get_freq,
> > + .set_freq = power_acpi_cpufreq_set_freq,
> > + .freq_down = power_acpi_cpufreq_freq_down,
> > + .freq_up = power_acpi_cpufreq_freq_up,
> > + .freq_max = power_acpi_cpufreq_freq_max,
> > + .freq_min = power_acpi_cpufreq_freq_min,
> > + .turbo_status = power_acpi_turbo_status,
> > + .enable_turbo = power_acpi_enable_turbo,
> > + .disable_turbo = power_acpi_disable_turbo,
> > + .get_caps = power_acpi_get_capabilities };
> > +
>
> Can this be made const?
> It is good for security and overall safety to have structures with function pointers
> marked const.
Because the struct relies on dynamic list operations, it makes sense to keep it mutable.
This ensures we can manage the ops as needed without running into read-only
restrictions.
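To illustrate the trade-off discussed above: TAILQ_INSERT_TAIL() writes into the list entry embedded in the element, so a struct that embeds its own TAILQ_ENTRY cannot be const. One possible way to get both properties, sketched below, is to keep the callback table const and link a separate mutable node that points at it. Nobody proposed this in the thread; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical callback table: no list linkage, so it can be const. */
struct power_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

/* Mutable list node that merely points at the read-only table. */
struct power_ops_node {
	TAILQ_ENTRY(power_ops_node) next;
	const struct power_ops *ops;
};

TAILQ_HEAD(power_ops_list, power_ops_node);
static struct power_ops_list ops_list = TAILQ_HEAD_INITIALIZER(ops_list);

/* Link a node for the given table; returns 0 on success, -1 on bad input. */
static int
register_ops(struct power_ops_node *node, const struct power_ops *ops)
{
	if (node == NULL || ops == NULL || ops->init == NULL)
		return -1;
	node->ops = ops;
	TAILQ_INSERT_TAIL(&ops_list, node, next); /* writes only to the node */
	return 0;
}

/* Stub driver used only to exercise the registration path. */
static int demo_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static const struct power_ops demo_ops = { .name = "demo", .init = demo_init };
static struct power_ops_node demo_node;
```

The cost is one extra pointer hop per lookup and a second object per driver, which may or may not be worth it for this library.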
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v2 2/4] power: refactor uncore power management library
2024-08-27 13:02 ` lihuisong (C)
@ 2024-10-08 6:19 ` Tummala, Sivaprasad
2024-10-22 2:05 ` lihuisong (C)
0 siblings, 1 reply; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-08 6:19 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
Hi Lihuisong,
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Tuesday, August 27, 2024 6:33 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> radu.nicolau@intel.com; jerinj@marvell.com; cristian.dumitrescu@intel.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [PATCH v2 2/4] power: refactor uncore power management library
>
>
> Hi Sivaprasad,
>
> I suggest splitting this patch into two patches to make it easier to review:
> patch-1: abstract a file for the uncore dvfs core level, namely, the
> rte_power_uncore_ops.c you did.
> patch-2: move and rename, lib/power/power_intel_uncore.c =>
> drivers/power/intel_uncore/intel_uncore.c
>
> patch[1/4] is also too big and hard to review.
>
> In addition, I have some questions and am not sure whether we can adjust the uncore init
> process.
>
> /Huisong
>
>
> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
> > This patch refactors the power management library, addressing uncore
> > power management. The primary changes involve the creation of
> > dedicated directories for each driver within 'drivers/power/uncore/*'.
> > The adjustment of meson.build files enables the selective activation
> > of individual drivers.
> >
> > This refactor significantly improves code organization, enhances
> > clarity and boosts maintainability. It lays the foundation for more
> > focused development on individual drivers and facilitates seamless
> > integration of future enhancements, particularly the AMD uncore driver.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> > .../power/intel_uncore/intel_uncore.c | 18 +-
> > .../power/intel_uncore/intel_uncore.h | 8 +-
> > drivers/power/intel_uncore/meson.build | 6 +
> > drivers/power/meson.build | 3 +-
> > lib/power/meson.build | 2 +-
> > lib/power/rte_power_uncore.c | 205 ++++++---------
> > lib/power/rte_power_uncore.h | 87 ++++---
> > lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
> > lib/power/version.map | 1 +
> > 9 files changed, 405 insertions(+), 164 deletions(-)
> > rename lib/power/power_intel_uncore.c =>
> drivers/power/intel_uncore/intel_uncore.c (95%)
> > rename lib/power/power_intel_uncore.h =>
> drivers/power/intel_uncore/intel_uncore.h (97%)
> > create mode 100644 drivers/power/intel_uncore/meson.build
> > create mode 100644 lib/power/rte_power_uncore_ops.h
> >
> > diff --git a/lib/power/power_intel_uncore.c
> > b/drivers/power/intel_uncore/intel_uncore.c
> > similarity index 95%
> > rename from lib/power/power_intel_uncore.c rename to
> > drivers/power/intel_uncore/intel_uncore.c
> > index 4eb9c5900a..804ad5d755 100644
> > --- a/lib/power/power_intel_uncore.c
> > +++ b/drivers/power/intel_uncore/intel_uncore.c
> > @@ -8,7 +8,7 @@
> >
> > #include <rte_memcpy.h>
> >
> > -#include "power_intel_uncore.h"
> > +#include "intel_uncore.h"
> > #include "power_common.h"
> >
> > #define MAX_NUMA_DIE 8
> > @@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
> >
> > return count;
> > }
> <...>
> >
> > -#endif /* POWER_INTEL_UNCORE_H */
> > +#endif /* INTEL_UNCORE_H */
> > diff --git a/drivers/power/intel_uncore/meson.build
> > b/drivers/power/intel_uncore/meson.build
> > new file mode 100644
> > index 0000000000..876df8ad14
> > --- /dev/null
> > +++ b/drivers/power/intel_uncore/meson.build
> > @@ -0,0 +1,6 @@
> > +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2017 Intel
> > +Corporation # Copyright(c) 2024 Advanced Micro Devices, Inc.
> > +
> > +sources = files('intel_uncore.c')
> > +deps += ['power']
> > diff --git a/drivers/power/meson.build b/drivers/power/meson.build
> > index 8c7215c639..c83047af94 100644
> > --- a/drivers/power/meson.build
> > +++ b/drivers/power/meson.build
> > @@ -6,7 +6,8 @@ drivers = [
> > 'amd_pstate',
> > 'cppc',
> > 'kvm_vm',
> > - 'pstate'
> > + 'pstate',
> > + 'intel_uncore'
> The cppc, amd_pstate and so on belong to the cpufreq scope,
> and intel_uncore belongs to the uncore dvfs scope.
> They are not at the same level, so I propose creating a directory
> called something like cpufreq or core.
> The 'intel_uncore' name doesn't seem appropriate. What do you think of the following
> directory structure:
> drivers/power/uncore/intel_uncore.c
> drivers/power/uncore/amd_uncore.c (according to the patch[4/4]).
At present, Meson does not support detecting an additional level of subdirectories within drivers/*.
All the drivers maintain a consistent subdirectory structure.
> > ]
> > std_deps = ['power']
> > diff --git a/lib/power/meson.build b/lib/power/meson.build index
> > f3e3451cdc..9b13d98810 100644
> > --- a/lib/power/meson.build
> > +++ b/lib/power/meson.build
> > @@ -13,7 +13,6 @@ if not is_linux
> > endif
> > sources = files(
> > 'power_common.c',
> > - 'power_intel_uncore.c',
> > 'rte_power.c',
> > 'rte_power_uncore.c',
> > 'rte_power_pmd_mgmt.c',
> > @@ -24,6 +23,7 @@ headers = files(
> > 'rte_power_guest_channel.h',
> > 'rte_power_pmd_mgmt.h',
> > 'rte_power_uncore.h',
> > + 'rte_power_uncore_ops.h',
> > )
> > if cc.has_argument('-Wno-cast-qual')
> > cflags += '-Wno-cast-qual'
> > diff --git a/lib/power/rte_power_uncore.c
> > b/lib/power/rte_power_uncore.c index 48c75a5da0..9f8771224f 100644
> > --- a/lib/power/rte_power_uncore.c
> > +++ b/lib/power/rte_power_uncore.c
> > @@ -1,6 +1,7 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > * Copyright(c) 2023 AMD Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > */
> >
> > #include <errno.h>
> > @@ -12,98 +13,50 @@
> > #include "rte_power_uncore.h"
> > #include "power_intel_uncore.h"
> >
> > -enum rte_uncore_power_mgmt_env default_uncore_env =
> > RTE_UNCORE_PM_ENV_NOT_SET;
> > +static enum rte_uncore_power_mgmt_env global_uncore_env =
> > +RTE_UNCORE_PM_ENV_NOT_SET; static struct rte_power_uncore_ops
> > +*global_uncore_ops;
> >
> > static rte_spinlock_t global_env_cfg_lock =
> > RTE_SPINLOCK_INITIALIZER;
> > +static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
> > + TAILQ_HEAD_INITIALIZER(uncore_ops_list);
> >
> > -static uint32_t
> > -power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
> > - unsigned int die __rte_unused)
> > -{
> > - return 0;
> > -}
> > -
> > -static int
> > -power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
> > - unsigned int die __rte_unused, uint32_t index __rte_unused)
> > -{
> > - return 0;
> > -}
> > +const char *uncore_env_str[] = {
> > + "not set",
> > + "auto-detect",
> > + "intel-uncore",
> > + "amd-hsmp"
> > +};
> Why expose the "auto-detect" mode to the user?
> Why not set this automatically at framework initialization?
> After all, the uncore driver is fixed for a given platform.
The auto-detection feature has been implemented to enable seamless migration across platforms
without requiring any changes to the application.
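As a rough illustration of the point above, the sketch below resolves a requested environment against a table of drivers. An explicit request only matches its own driver, while auto-detect takes the first driver whose probe succeeds, so the same application call can work on different platforms. The enum values, struct, and probe callbacks are hypothetical and do not come from the patch. The range check uses a strict upper bound because a name table with N entries has valid indices 0 to N-1.

```c
#include <stddef.h>

/* Hypothetical mirror of the four entries in the quoted uncore_env_str table. */
enum uncore_env {
	ENV_NOT_SET,
	ENV_AUTO_DETECT,
	ENV_INTEL_UNCORE,
	ENV_AMD_HSMP,
	ENV_COUNT
};

/* Hypothetical driver descriptor: probe() returns 0 if the platform matches. */
struct uncore_driver {
	enum uncore_env env;
	int (*probe)(void);
};

/* Resolve a request to a concrete driver env, or ENV_NOT_SET on failure. */
static enum uncore_env
resolve_env(enum uncore_env requested, const struct uncore_driver *drv, size_t n)
{
	size_t i;

	/* Reject "not set" and anything at or past the end of the table. */
	if ((int)requested <= (int)ENV_NOT_SET || (int)requested >= (int)ENV_COUNT)
		return ENV_NOT_SET;

	for (i = 0; i < n; i++) {
		if (requested != ENV_AUTO_DETECT && requested != drv[i].env)
			continue; /* explicit request for a different driver */
		if (drv[i].probe() == 0)
			return drv[i].env;
	}
	return ENV_NOT_SET;
}

/* Stub probes standing in for real platform detection. */
static int probe_present(void) { return 0; }
static int probe_absent(void) { return -1; }
```

On a machine where only the AMD probe succeeds, an auto-detect request resolves to the AMD driver, while an explicit Intel request fails.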
> >
> > -static int
> > -power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
> > - unsigned int die __rte_unused)
> > -{
> > - return 0;
> > -}
> > -
> <...>
> > -static int
> > -power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
> > - unsigned int die __rte_unused)
> > +/* register the ops struct in rte_power_uncore_ops, return 0 on
> > +success. */ int rte_power_register_uncore_ops(struct
> > +rte_power_uncore_ops *driver_ops)
> > {
> > - return 0;
> > -}
> > + if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
> > + !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
> > + !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
> > + !driver_ops->set_freq || !driver_ops->freq_max ||
> > + !driver_ops->freq_min) {
> > + POWER_LOG(ERR, "Missing callbacks while registering power ops");
> > + return -1;
> > + }
> > + if (driver_ops->cb)
> > + driver_ops->cb();
> >
> > -static unsigned int
> > -power_dummy_uncore_get_num_pkgs(void)
> > -{
> > - return 0;
> > -}
> > + TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
> >
> > -static unsigned int
> > -power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused) -{
> > return 0;
> > }
> > -
> > -/* function pointers */
> > -rte_power_get_uncore_freq_t rte_power_get_uncore_freq =
> > power_get_dummy_uncore_freq; -rte_power_set_uncore_freq_t
> > rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
> > -rte_power_uncore_freq_change_t rte_power_uncore_freq_max =
> > power_dummy_uncore_freq_max; -rte_power_uncore_freq_change_t
> > rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
> > -rte_power_uncore_freqs_t rte_power_uncore_freqs =
> > power_dummy_uncore_freqs; -rte_power_uncore_get_num_freqs_t
> > rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
> > -rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs =
> > power_dummy_uncore_get_num_pkgs; -rte_power_uncore_get_num_dies_t
> > rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
> > -
> > -static void
> > -reset_power_uncore_function_ptrs(void)
> > -{
> > - rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
> > - rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
> > - rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
> > - rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
> > - rte_power_uncore_freqs = power_dummy_uncore_freqs;
> > - rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
> > - rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
> > - rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
> > -}
> > -
> > int
> > rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
> > {
> > - int ret;
> > + int ret = -1;
> > + struct rte_power_uncore_ops *ops;
> >
> > rte_spinlock_lock(&global_env_cfg_lock);
> >
> > - if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
> > + if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
> > POWER_LOG(ERR, "Uncore Power Management Env already set.");
> > - rte_spinlock_unlock(&global_env_cfg_lock);
> > - return -1;
> > + goto out;
> > }
> >
> <...>
> > + if (env <= RTE_DIM(uncore_env_str)) {
> > + RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
> > + if (strncmp(ops->name, uncore_env_str[env],
> > + RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
> > + global_uncore_env = env;
> > + global_uncore_ops = ops;
> > + ret = 0;
> > + goto out;
> > + }
> > + POWER_LOG(ERR, "Power Management (%s) not supported",
> > + uncore_env_str[env]);
> > + } else
> > + POWER_LOG(ERR, "Invalid Power Management Environment");
> >
> > - default_uncore_env = env;
> > out:
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > return ret;
> > @@ -139,15 +89,22 @@ void
> > rte_power_unset_uncore_env(void)
> > {
> > rte_spinlock_lock(&global_env_cfg_lock);
> > - default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> > - reset_power_uncore_function_ptrs();
> > + global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> > rte_spinlock_unlock(&global_env_cfg_lock);
> > }
> >
>
> How about abstracting an ABI interface that initializes or sets the uncore driver for the
> platform automatically,
>
> and later doing power_intel_uncore_init_on_die() for each die on each package?
>
> > enum rte_uncore_power_mgmt_env
> > rte_power_get_uncore_env(void)
> > {
> > - return default_uncore_env;
> > + return global_uncore_env;
> > +}
> > +
> > +struct rte_power_uncore_ops *
> > +rte_power_get_uncore_ops(void)
> > +{
> > + RTE_ASSERT(global_uncore_ops != NULL);
> > +
> > + return global_uncore_ops;
> > }
> >
> > int
> > @@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned
> > int die)
> This pkg means the socket id on the platform, right?
> If so, I am not sure that the
> uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE] used in the uncore lib is
> universal for all uncore drivers.
> For example, an uncore driver may only support uncore dvfs at socket granularity.
> What should we do for this? We may need to think twice.
Yes, pkg represents a socket ID. On platforms with a single uncore controller per socket,
the die ID should be set to '0' for the corresponding socket ID (pkg).
> > {
> > int ret = -1;
> >
> <...>
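The pkg/die convention described in the reply above can be sketched as follows, assuming a two-dimensional state table like the uncore_info array mentioned in the review. The sizes, struct, and helper name are hypothetical. A platform with one uncore controller per socket is modelled as having one die per package, so only die 0 is valid.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PKGS 8 /* hypothetical stand-in for RTE_MAX_NUMA_NODES */
#define MAX_DIES 8 /* same value as MAX_NUMA_DIE in the quoted patch */

/* Hypothetical per-controller state. */
struct uncore_slot {
	int initialised;
	uint32_t cur_freq_idx;
};

static struct uncore_slot uncore_info[MAX_PKGS][MAX_DIES];

/*
 * Return the state slot for (pkg, die), or NULL if out of range.
 * dies_per_pkg is 1 on platforms with one uncore controller per socket,
 * which makes die 0 the only accepted die index.
 */
static struct uncore_slot *
uncore_slot_get(unsigned int pkg, unsigned int die, unsigned int dies_per_pkg)
{
	if (pkg >= MAX_PKGS || dies_per_pkg == 0 || dies_per_pkg > MAX_DIES)
		return NULL;
	if (die >= dies_per_pkg)
		return NULL;
	return &uncore_info[pkg][die];
}
```

Under this scheme a socket-granular driver needs no special casing in the library: it reports one die per package and every caller passes die 0.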
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v3 0/5] power: refactor power management library
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-07 18:01 ` Stephen Hemminger
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
` (7 more replies)
6 siblings, 8 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1602 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* [PATCH v3 1/5] power: refactor core power management library
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
` (6 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the core power management library. It improves
modularity by moving the individual driver implementations out of
the 'lib/power' directory and into dedicated directories under
'drivers/power/*'. The meson.build files are adjusted so that
individual drivers can be selectively enabled.
These changes give each driver implementation its own directory and
build file, which makes the code easier to navigate and maintain.
They also lay the groundwork for working on individual drivers in
isolation and for integrating future drivers.
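As a rough illustration of the registration model described above (a
sketch only, not the code in this patch): each driver fills in an ops
table and registers it from a constructor, and the library later looks
the driver up by name. All demo_* names are hypothetical stand-ins; the
real structure and macro are rte_power_core_ops and
RTE_POWER_REGISTER_OPS, which are defined elsewhere in this series. The
sketch assumes a GCC/Clang-style constructor attribute and the BSD
<sys/queue.h> macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_core_ops {
	TAILQ_ENTRY(demo_core_ops) next;
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

/* List of every driver that has registered itself. */
static TAILQ_HEAD(, demo_core_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops tables, otherwise append to the list. */
static int
demo_register_ops(struct demo_core_ops *ops)
{
	assert(ops != NULL);
	if (ops->name == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Register an ops table before main() runs (no trailing semicolon). */
#define DEMO_REGISTER_OPS(ops) \
	__attribute__((constructor)) static void demo_reg_##ops(void) \
	{ demo_register_ops(&ops); }

/* A dummy "acpi" driver; the callbacks do nothing and report success. */
static int demo_acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_acpi_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_core_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
	.exit = demo_acpi_exit,
};

DEMO_REGISTER_OPS(demo_acpi_ops)

/* Look a driver up by name; NULL if no such driver was linked in. */
static struct demo_core_ops *
demo_find_ops(const char *name)
{
	struct demo_core_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}
```

With this shape, demo_find_ops("acpi") returns the table above, while a
driver that was not built simply never appears in the list, so the
library needs no compile-time knowledge of which drivers exist.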
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- reworked auto-detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
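Note for reviewers: the auto-detection flow can be pictured with the
standalone sketch below. It is an approximation under stated
assumptions, not the code in rte_power.c. All demo_* names are
hypothetical, the registered drivers are held in a plain array instead
of a tailq, and the init callbacks are stubs. When no environment has
been set, each registered driver is tried in turn and the first one
whose init succeeds becomes the active environment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of enum power_management_env (shortened). */
enum demo_env {
	DEMO_ENV_NOT_SET = 0,
	DEMO_ENV_ACPI,
	DEMO_ENV_KVM_VM,
	DEMO_ENV_PSTATE
};

/* Driver names, indexed by enum demo_env. */
static const char *const demo_env_str[] = {
	"not set", "acpi", "kvm-vm", "pstate"
};

#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

struct demo_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
};

/* Stub callbacks: "acpi" fails to initialise, "pstate" succeeds. */
static int demo_fail_init(unsigned int lcore_id) { (void)lcore_id; return -1; }
static int demo_ok_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_ops demo_registered[] = {
	{ "acpi", demo_fail_init },
	{ "pstate", demo_ok_init },
};

static enum demo_env demo_current_env = DEMO_ENV_NOT_SET;
static struct demo_ops *demo_current_ops;

/* Map a driver name to its env value; DEMO_ENV_NOT_SET if unknown. */
static enum demo_env
demo_env_from_name(const char *name)
{
	size_t env;

	for (env = 1; env < DEMO_DIM(demo_env_str); env++)
		if (strcmp(name, demo_env_str[env]) == 0)
			return (enum demo_env)env;
	return DEMO_ENV_NOT_SET;
}

/* Use the configured env if any, else probe drivers in list order. */
static int
demo_power_init(unsigned int lcore_id)
{
	size_t i;

	if (demo_current_env != DEMO_ENV_NOT_SET)
		return demo_current_ops->init(lcore_id);

	for (i = 0; i < DEMO_DIM(demo_registered); i++) {
		struct demo_ops *ops = &demo_registered[i];
		enum demo_env env = demo_env_from_name(ops->name);

		if (env == DEMO_ENV_NOT_SET)
			continue; /* name has no matching env value */
		if (ops->init(lcore_id) == 0) {
			demo_current_env = env;
			demo_current_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Accessor used by the inline wrappers; the env must already be set. */
static struct demo_ops *
demo_get_ops(void)
{
	assert(demo_current_ops != NULL);
	return demo_current_ops;
}
```

In this sketch the probe order is simply the order of the array; in the
patch it would be the order in which drivers were added to the list.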
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 618 insertions(+), 269 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 090a0d96cb..f571f4184a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -700,3 +700,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..17bd8e2eaf 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index 32aaacb948..775b8f4434 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -685,3 +685,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 2343121621..c32b1adabc 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..d6b86ea19c 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..fbee9033f2 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,153 +8,86 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +97,65 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+
+ return global_power_core_ops;
+}
+
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
-
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..d77d285c18 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the power ops struct of the currently selected environment.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..526372e0d4
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,208 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the power ops struct of the currently selected environment.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..a46dd8adbf 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
* [PATCH v3 2/5] power: refactor uncore power management library
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (5 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 206 ++++++---------
lib/power/rte_power_uncore.h | 87 ++++---
lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 405 insertions(+), 165 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
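Note: the registration and dispatch pattern this patch introduces can be
sketched in isolation as follows. This is a minimal illustration under stated
assumptions, not the code in the patch: every sketch_* and demo_* name is
hypothetical, the ops struct is cut down to two callbacks, plain <sys/queue.h>
TAILQ macros stand in for the RTE_TAILQ_* wrappers, a GCC/Clang constructor
attribute stands in for RTE_INIT, and assert() stands in for RTE_ASSERT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define SKETCH_NAMESZ 24

/* Hypothetical, reduced ops table; the real struct carries more callbacks. */
struct sketch_uncore_ops {
	TAILQ_ENTRY(sketch_uncore_ops) next;	/* link in the driver list */
	char name[SKETCH_NAMESZ];		/* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	unsigned int (*get_num_pkgs)(void);
};

TAILQ_HEAD(sketch_ops_list, sketch_uncore_ops);
static struct sketch_ops_list ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct sketch_uncore_ops *selected_ops;

/* Reject drivers with missing callbacks, then append to the list. */
static int
sketch_register_ops(struct sketch_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->get_num_pkgs == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Select the registered driver whose name matches; -1 if none does. */
static int
sketch_set_env(const char *name)
{
	struct sketch_uncore_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (strncmp(ops->name, name, SKETCH_NAMESZ) == 0) {
			selected_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Accessor used by the inline wrappers; a driver must be selected first. */
static struct sketch_uncore_ops *
sketch_get_ops(void)
{
	assert(selected_ops != NULL);
	return selected_ops;
}

/* Public-style wrapper: forwards to the selected driver's callback. */
static inline unsigned int
sketch_get_num_pkgs(void)
{
	return sketch_get_ops()->get_num_pkgs();
}

/* Dummy driver used only for this illustration. */
static int
demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static unsigned int
demo_get_num_pkgs(void)
{
	return 2;
}

static struct sketch_uncore_ops demo_ops = {
	.name = "demo-uncore",
	.init = demo_init,
	.get_num_pkgs = demo_get_num_pkgs,
};

/* Runs before main(), so the driver is listed before any lookup. */
__attribute__((constructor)) static void
demo_register(void)
{
	sketch_register_ops(&demo_ops);
}
```

With this in place, a caller would run sketch_set_env("demo-uncore") once and
then use sketch_get_num_pkgs() without knowing which driver is behind it. The
library side needs no per-driver switch statement, which is what allows each
driver to live in its own directory and be built selectively.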
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..0f0b212a90 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
@@ -10,100 +11,51 @@
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +65,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,15 +88,22 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
@@ -155,27 +111,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +141,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..c9fba02568 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +12,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +116,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +145,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +160,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +175,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +203,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +228,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +246,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +268,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..d0bbffcbf9
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore the uncore min and max values to the values they had
+ * before the API was initialized.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations. */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+/**
+ * @internal Get the registered power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index a46dd8adbf..7c9aece813 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v3 3/5] test/power: removed function pointer validations
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
` (4 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
After the power library refactor, power management operations are
consistently supported regardless of the operating environment. The
function pointer checks are therefore unnecessary and have been removed
from the tests and the l3fwd-power application.
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
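Note: a minimal sketch of why the NULL checks become redundant, assuming
each public call is now a real function that forwards to a registered ops
table. All demo_ names are hypothetical and are not DPDK API. The fallback
table in particular is an assumption of this sketch; this patch does not
show how the library behaves before a driver is registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; illustrative only, not the DPDK definition. */
struct demo_power_ops {
	int (*freq_up)(unsigned int lcore_id);
	int (*freq_down)(unsigned int lcore_id);
};

/* Fallback used while no driver is registered (assumption of this sketch). */
static int demo_nop(unsigned int lcore_id)
{
	(void)lcore_id;
	return -1; /* reports "not supported" but is always safe to call */
}

static struct demo_power_ops demo_default_ops = {
	.freq_up = demo_nop,
	.freq_down = demo_nop,
};

/* Example driver callback: pretends the frequency was changed. */
static int demo_driver_step(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static struct demo_power_ops demo_driver_ops = {
	.freq_up = demo_driver_step,
	.freq_down = demo_driver_step,
};

static struct demo_power_ops *demo_active_ops = &demo_default_ops;

/* Install a driver's ops table; every callback must be populated. */
static void demo_register_ops(struct demo_power_ops *ops)
{
	assert(ops != NULL && ops->freq_up != NULL && ops->freq_down != NULL);
	demo_active_ops = ops;
}

/* Public entry point: a real function, so "if (fn) fn(id)" is not needed. */
static inline int demo_power_freq_down(unsigned int lcore_id)
{
	return demo_active_ops->freq_down(lcore_id);
}
```

With this shape, a caller such as the l3fwd-power timer callback can call
demo_power_freq_down(lcore_id) directly and act on the return value, since
the symbol can never be NULL.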
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-08 17:27 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (3 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
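Note: the driver exposes a three-entry frequency table chosen by HSMP
protocol version, ordered highest first so that index 0 is the maximum
and the last index is the minimum. The sketch below shows one way to
express that selection with each case terminated by a break, so the two
tables cannot overwrite each other. The demo_ names and enum values are
hypothetical; the frequency values are copied from this patch and are
assumed here to be in kHz.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_FREQS 3

/* Hypothetical stand-ins for the E-SMI protocol version constants. */
enum demo_proto_ver { DEMO_PROTO_VER2 = 2, DEMO_PROTO_VER5 = 5 };

/*
 * Fill freqs[] with the table for the given protocol version, highest
 * frequency first. Returns the number of entries written.
 */
static size_t demo_fill_freqs(int proto_ver, uint32_t freqs[DEMO_NB_FREQS])
{
	static const uint32_t ver5[DEMO_NB_FREQS] = { 1800000, 1440000, 1200000 };
	static const uint32_t legacy[DEMO_NB_FREQS] = { 1600000, 1333000, 1200000 };
	const uint32_t *src;
	size_t i;

	switch (proto_ver) {
	case DEMO_PROTO_VER5:
		src = ver5;
		break; /* without this, the legacy table would win */
	case DEMO_PROTO_VER2:
	default:
		src = legacy;
		break;
	}

	for (i = 0; i < DEMO_NB_FREQS; i++)
		freqs[i] = src[i];

	return DEMO_NB_FREQS;
}
```

Keeping the tables as const arrays also makes it easy to check that the
number of entries stays below the library's array bound at build time.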
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 575 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..e667a783cd
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,328 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* Record the DF P-state index that was just applied. */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u", ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return RTE_POWER_INVALID_FREQ_INDEX;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore the uncore min and max values to the values they had
+ * before the API was initialized.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system,
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * The AMD driver currently reports a single die per package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
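The uncore query functions documented in the amd_uncore.h hunk above come as a pair: power_amd_uncore_get_num_freqs() reports how many entries exist, and power_amd_uncore_freqs() fills a caller-supplied buffer. A caller would presumably size the buffer with the first call before making the second. The following is only a sketch of that calling pattern. The demo_* functions and the frequency table are hypothetical stand-ins, not the driver code, and the "buffer too small is an error" behaviour is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical frequency table standing in for what a driver would discover. */
#define DEMO_NUM_FREQS 4
static const unsigned int demo_table[DEMO_NUM_FREQS] = {2000, 1800, 1600, 1400};

/* Stand-in for a get_num_freqs-style call: entry count, or negative on error. */
int demo_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
{
	if (pkg != 0 || die != 0)
		return -1; /* this demo only models package 0, die 0 */
	return DEMO_NUM_FREQS;
}

/* Stand-in for a freqs-style call: fills freqs[], returns entries written. */
int demo_uncore_freqs(unsigned int pkg, unsigned int die,
		unsigned int *freqs, unsigned int num)
{
	unsigned int i;

	/* Assumption: a NULL or undersized buffer is reported as an error. */
	if (pkg != 0 || die != 0 || freqs == NULL || num < DEMO_NUM_FREQS)
		return -1;
	for (i = 0; i < DEMO_NUM_FREQS; i++)
		freqs[i] = demo_table[i];
	return DEMO_NUM_FREQS;
}

/*
 * Caller side: ask for the count, allocate, then fetch.
 * Returns a malloc'd array (caller frees) or NULL on any error.
 */
unsigned int *demo_fetch_freqs(unsigned int pkg, unsigned int die, int *count)
{
	unsigned int *buf;
	int n;

	assert(count != NULL);
	n = demo_uncore_get_num_freqs(pkg, die);
	if (n <= 0)
		return NULL;
	buf = malloc((size_t)n * sizeof(*buf));
	if (buf == NULL)
		return NULL;
	n = demo_uncore_freqs(pkg, die, buf, (unsigned int)n);
	if (n < 0) {
		free(buf);
		return NULL;
	}
	*count = n;
	return buf;
}
```

Since both calls are documented as not suitable for the fast path, an application would likely run this once at setup time and cache the result.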
* [PATCH v3 5/5] maintainers: update for drivers/power
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-08 17:27 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 812463fe9f..7d2868fe30 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1737,6 +1737,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v3 0/5] power: refactor power management library
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-08 17:27 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-08 17:27 ` Sivaprasad Tummala
2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:27 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/*' (for example 'drivers/power/acpi' and
'drivers/power/intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 328 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1602 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v3 0/5] power: refactor power management library
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-08 17:27 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
` (6 more replies)
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
7 siblings, 7 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/*' (for example 'drivers/power/acpi' and
'drivers/power/intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1603 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v3 1/5] power: refactor core power management library
2024-10-08 17:43 ` Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
` (5 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- reworked auto-detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
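Each relocated driver in this patch fills a struct rte_power_core_ops and hands it to RTE_POWER_REGISTER_OPS, while lib/power keeps a list of registered ops and one selected entry. The sketch below illustrates that general pattern only. Every demo_* name is hypothetical, the struct is trimmed to three fields, and the constructor-based registration and the selection loop are assumptions about how such a scheme could work, not the code in rte_power.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_core_ops {
	TAILQ_ENTRY(demo_core_ops) next;  /* list linkage */
	const char *name;                 /* driver name, e.g. "acpi" */
	int (*check_env_support)(void);   /* non-zero if usable on this host */
	int (*init)(unsigned int lcore_id);
};

static TAILQ_HEAD(, demo_core_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);
static struct demo_core_ops *demo_active_ops;

/* Add a driver to the list; incomplete ops tables are silently skipped. */
void demo_register_ops(struct demo_core_ops *ops)
{
	assert(ops != NULL);
	if (ops->name == NULL || ops->check_env_support == NULL ||
			ops->init == NULL)
		return;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
}

/* Assumed mechanism: register from a constructor that runs before main(). */
#define DEMO_REGISTER_OPS(ops) \
	__attribute__((constructor, used)) static void demo_ctor_##ops(void) \
	{ \
		demo_register_ops(&ops); \
	}

/* Pick a driver by name, or auto-detect the first supported one if NULL. */
struct demo_core_ops *demo_select_ops(const char *name)
{
	struct demo_core_ops *ops;

	demo_active_ops = NULL;
	TAILQ_FOREACH(ops, &demo_ops_list, next) {
		if (name != NULL && strcmp(ops->name, name) != 0)
			continue;
		if (ops->check_env_support()) {
			demo_active_ops = ops;
			break;
		}
	}
	return demo_active_ops;
}

/* Public entry point: guard against "no driver selected" before dispatch. */
int demo_power_init(unsigned int lcore_id)
{
	if (demo_active_ops == NULL)
		return -1;
	return demo_active_ops->init(lcore_id);
}

/* Two toy drivers: "acpi" reports unsupported, "pstate" reports supported. */
static int demo_acpi_supported(void) { return 0; }
static int demo_pstate_supported(void) { return 1; }
static int demo_init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_core_ops demo_acpi_ops = {
	.name = "acpi",
	.check_env_support = demo_acpi_supported,
	.init = demo_init_ok,
};
DEMO_REGISTER_OPS(demo_acpi_ops)

static struct demo_core_ops demo_pstate_ops = {
	.name = "pstate",
	.check_env_support = demo_pstate_supported,
	.init = demo_init_ok,
};
DEMO_REGISTER_OPS(demo_pstate_ops)
```

The NULL guard in demo_power_init() mirrors the intent of the v2 note above: the dispatch layer has to cope with the case where no driver was selected, because the library no longer links the drivers in directly.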
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 618 insertions(+), 269 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 090a0d96cb..f571f4184a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -700,3 +700,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..17bd8e2eaf 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index 32aaacb948..775b8f4434 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -685,3 +685,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 2343121621..c32b1adabc 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -888,3 +888,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..d6b86ea19c 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..fbee9033f2 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -8,153 +8,86 @@
#include <rte_spinlock.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +97,65 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+
+ return global_power_core_ops;
+}
+
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
-
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..d77d285c18 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..526372e0d4
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,208 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - -EINVAL: a mandatory callback is missing from the ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..a46dd8adbf 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
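
For readers following the series, the registration scheme introduced in
rte_power.c above can be sketched outside DPDK roughly as follows. This is
only an illustration, not the library code: struct core_ops, register_ops(),
select_ops(), REGISTER_OPS() and the "dummy" driver are hypothetical
stand-ins for struct rte_power_core_ops, rte_power_register_ops(),
rte_power_set_env() and RTE_POWER_REGISTER_OPS(), and a GCC/Clang
constructor attribute is assumed in place of RTE_INIT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DRIVER_NAMESZ 24

/* Hypothetical, cut-down stand-in for struct rte_power_core_ops. */
struct core_ops {
	TAILQ_ENTRY(core_ops) next;          /* next driver in the list */
	char name[DRIVER_NAMESZ];            /* driver name used for lookup */
	int (*init)(unsigned int lcore_id);
	int (*freq_up)(unsigned int lcore_id);
};

static TAILQ_HEAD(, core_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct core_ops *active_ops;

/* Reject drivers with missing callbacks, otherwise append to the list. */
static int
register_ops(struct core_ops *ops)
{
	if (ops->init == NULL || ops->freq_up == NULL)
		return -EINVAL;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Register a driver from a constructor, i.e. before main() runs. */
#define REGISTER_OPS(ops) \
static void __attribute__((constructor)) ops_ctor_##ops(void) \
{ \
	register_ops(&ops); \
}

/* Select the active driver by name; 0 on success, -1 if not registered. */
static int
select_ops(const char *name)
{
	struct core_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (strncmp(ops->name, name, DRIVER_NAMESZ) == 0) {
			active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Public wrapper: dispatch through the selected driver's ops table. */
static inline int
power_freq_up(unsigned int lcore_id)
{
	assert(active_ops != NULL);
	return active_ops->freq_up(lcore_id);
}

/* A dummy driver, registered the same way a real one would be. */
static int dummy_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int dummy_freq_up(unsigned int lcore_id) { (void)lcore_id; return 1; }

static struct core_ops dummy_ops = {
	.name = "dummy",
	.init = dummy_init,
	.freq_up = dummy_freq_up,
};

REGISTER_OPS(dummy_ops)
```

With this layout the library core never names a driver directly: a caller
selects one by name (select_ops("dummy") here) and then goes through the
inline wrapper, which is the role the static inline rte_power_*() functions
play in rte_power.h.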
* [PATCH v3 2/5] power: refactor uncore power management library
2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (4 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the uncore part of the power management library.
Each uncore driver moves into its own directory under 'drivers/power/'
(for example 'drivers/power/intel_uncore/'), and the meson.build changes
allow individual drivers to be enabled selectively.
Separating driver code from the library makes each driver easier to
maintain and simplifies the integration of future drivers, in
particular the AMD uncore driver.
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 206 ++++++---------
lib/power/rte_power_uncore.h | 87 ++++---
lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 405 insertions(+), 165 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..0f0b212a90 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
@@ -10,100 +11,51 @@
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +65,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,15 +88,22 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
@@ -155,27 +111,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +141,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..c9fba02568 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +12,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +116,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +145,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +160,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +175,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +203,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +228,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +246,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +268,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..d0bbffcbf9
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * Callers must serialize access externally; this function is not thread safe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * Callers must serialize access externally; this function is not thread safe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - Negative on error (for example, a mandatory callback is missing).
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+/**
+ * @internal Get the currently selected power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index a46dd8adbf..7c9aece813 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v3 3/5] test/power: removed function pointer validations
2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
` (3 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are
consistently available regardless of the operating environment, so NULL
checks on the former function pointers are no longer needed. Remove them
from the unit tests and from the l3fwd-power application.
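For illustration only, a minimal standalone sketch of why such checks become
redundant, assuming the core wrappers follow the same pattern as the inline
uncore wrappers in rte_power_uncore.h. All demo_* names are hypothetical and
are not part of the DPDK API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ops table; NOT the DPDK definition. */
struct demo_power_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* Counts calls so the dispatch can be observed. */
static int demo_freq_down_calls;

/* Hypothetical driver callback: pretend the frequency was lowered. */
static int
demo_driver_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	demo_freq_down_calls++;
	return 1; /* 1 = success with frequency changed */
}

static struct demo_power_ops demo_driver_ops = {
	.freq_down = demo_driver_freq_down,
};

/* Selected ops; in this sketch it is set once at start-up. */
static struct demo_power_ops *demo_global_ops = &demo_driver_ops;

static inline struct demo_power_ops *
demo_get_ops(void)
{
	assert(demo_global_ops != NULL);
	return demo_global_ops;
}

/*
 * The public entry point is a real function, so a caller-side test such as
 * "if (demo_power_freq_down)" would always be true and carries no information.
 */
static inline int
demo_power_freq_down(unsigned int lcore_id)
{
	return demo_get_ops()->freq_down(lcore_id);
}
```

With this shape an application calls demo_power_freq_down(lcore_id) directly;
whether an environment has been selected is enforced inside the accessor
rather than at each call site.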
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors
2024-10-08 17:43 ` Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-08 17:43 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (2 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
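As a rough sketch of how a driver of this kind could plug into the uncore
framework, the standalone example below registers an ops structure from a
constructor and looks it up by name. It uses plain <sys/queue.h> and a
GCC/Clang constructor attribute as stand-ins for RTE_TAILQ_* and RTE_INIT.
The demo_* names are hypothetical, and using "amd-hsmp" as this driver's
registered name is an assumption based on the uncore_env_str table.

```c
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, reduced ops structure; the real one has more callbacks. */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;
	char name[DEMO_NAMESZ];
	int (*init)(unsigned int pkg, unsigned int die);
};

/* Statically initialised list, so it is valid before constructors run. */
static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops, otherwise append to the list. */
static int
demo_register_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Find a registered driver by name, or return NULL. */
static struct demo_uncore_ops *
demo_find_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;
	return NULL;
}

/* Hypothetical driver callback; a real driver would talk to the hardware. */
static int
demo_amd_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_amd_ops = {
	.name = "amd-hsmp",
	.init = demo_amd_init,
};

/* Runs before main(), playing the role of a registration macro. */
__attribute__((constructor))
static void
demo_amd_register(void)
{
	demo_register_ops(&demo_amd_ops);
}
```

Looking drivers up by name keeps the library free of per-vendor symbols; the
cost is that the name table in the library and the name in each driver have
to stay in sync.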
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 576 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the DF P-state index that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return RTE_POWER_INVALID_FREQ_INDEX;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to the previous values
+ * in place before initialization of the API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * This function is not thread-safe; callers must serialize access to it.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available entries in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available entries in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * This driver currently reports one die per package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
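The index semantics used by the AMD uncore driver in patch 4/5 can be modelled outside DPDK. What follows is a minimal standalone sketch, not the driver itself: every `sketch_*` name and `SKETCH_MAX_FREQS` are hypothetical, the `newer_proto` flag stands in for the HSMP protocol version check, and the firmware request is reduced to a comment. Only the table values are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RTE_MAX_UNCORE_FREQS. */
#define SKETCH_MAX_FREQS 8

struct sketch_uncore {
	uint32_t freqs[SKETCH_MAX_FREQS]; /* highest first, as in the patch */
	uint32_t nb_freqs;                /* number of valid entries */
	uint32_t curr_idx;                /* index last applied */
};

/* Fill the table; a non-zero newer_proto selects the newer value set. */
static void
sketch_fill_freqs(struct sketch_uncore *u, int newer_proto)
{
	static const uint32_t newer[] = { 1800000, 1440000, 1200000 };
	static const uint32_t legacy[] = { 1600000, 1333000, 1200000 };
	const uint32_t *src = newer_proto ? newer : legacy;

	u->nb_freqs = 3;
	assert(u->nb_freqs < SKETCH_MAX_FREQS);
	memcpy(u->freqs, src, u->nb_freqs * sizeof(u->freqs[0]));
	u->curr_idx = 0;
}

/* Select a table entry by index: 0 on success, -1 on a bad index. */
static int
sketch_set_index(struct sketch_uncore *u, uint32_t idx)
{
	if (idx >= u->nb_freqs)
		return -1;
	/* The real driver would ask the firmware for this P-state here. */
	u->curr_idx = idx;
	return 0;
}

/* Copy the table out: entry count on success, 0 if the buffer is unusable. */
static uint32_t
sketch_get_freqs(const struct sketch_uncore *u, uint32_t *out, uint32_t num)
{
	if (out == NULL || num < u->nb_freqs)
		return 0;
	memcpy(out, u->freqs, u->nb_freqs * sizeof(out[0]));
	return u->nb_freqs;
}
```

In this model index 0 selects the highest entry and `nb_freqs - 1` the lowest, which matches the patch passing 0 from `freq_max` and `nb_freqs - 1` from `freq_min`.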
* [PATCH v3 5/5] maintainers: update for drivers/power
2024-10-08 17:43 ` Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-08 17:43 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-12 17:44 ` Stephen Hemminger
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 812463fe9f..7d2868fe30 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1737,6 +1737,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
* [PATCH v3 0/5] power: refactor power management library
2024-10-08 17:43 ` Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-08 17:43 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-12 17:44 ` Stephen Hemminger
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-08 17:43 UTC
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of a dedicated directory for each driver under
'drivers/power/*' (for example 'acpi' and 'intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 286 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 206 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1603 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* Re: [PATCH v3 0/5] power: refactor power management library
2024-10-08 17:43 ` Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-08 17:43 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-12 17:44 ` Stephen Hemminger
6 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-12 17:44 UTC
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, dev
On Tue, 8 Oct 2024 17:43:03 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
>
> Sivaprasad Tummala (5):
> power: refactor core power management library
> power: refactor uncore power management library
> test/power: removed function pointer validations
> power/amd_uncore: uncore support for AMD EPYC processors
> maintainers: update for drivers/power
Looks good but several build failures.
It looks like the new internal function to get power ops is not
in version.map.
Also while looking, if I use IWYU it shows that errno.h and rte_errno.h
are included but never used.
* [PATCH v4 0/5] power: refactor power management library
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-08 17:43 ` Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 1/5] power: refactor core " Sivaprasad Tummala
` (7 more replies)
7 siblings, 8 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of a dedicated directory for each driver under
'drivers/power/*' (for example 'acpi' and 'intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 207 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1605 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* [PATCH v4 1/5] power: refactor core power management library
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 2/5] power: refactor uncore " Sivaprasad Tummala
` (6 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
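The decoupling described above relies on each driver registering its ops table with the library rather than the library referencing driver symbols directly. The definition of the registration macro is not part of this excerpt, so the following is only a sketch of the general pattern under stated assumptions: all `sketch_*` names and `SKETCH_REGISTER_OPS` are hypothetical, the ops table is trimmed to two fields, and registration is assumed to use a GCC/Clang constructor function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down ops table; a real one carries more callbacks. */
struct sketch_power_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	struct sketch_power_ops *next;
};

/* Library side: head of the list of registered drivers. */
static struct sketch_power_ops *sketch_ops_head;

/* Library side: called by each driver before main() runs. */
static void
sketch_register_ops(struct sketch_power_ops *ops)
{
	assert(ops != NULL && ops->name != NULL);
	ops->next = sketch_ops_head;
	sketch_ops_head = ops;
}

/* Library side: find a driver by name without linking to its symbols. */
static struct sketch_power_ops *
sketch_find_ops(const char *name)
{
	struct sketch_power_ops *ops;

	for (ops = sketch_ops_head; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Hypothetical registration macro built on a constructor function. */
#define SKETCH_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) sketch_reg_##ops(void) \
	{ sketch_register_ops(&ops); }

/* Driver side: a stub driver that only reports success. */
static int
sketch_acpi_init(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}

static struct sketch_power_ops sketch_acpi_ops = {
	.name = "acpi",
	.init = sketch_acpi_init,
};

SKETCH_REGISTER_OPS(sketch_acpi_ops)
```

With this shape, leaving a driver out of the build simply means its constructor never runs, so a lookup for its name returns NULL and the library can fall back to another registered driver.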
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- re-worked on auto detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++------------
lib/power/rte_power.h | 139 ++++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 619 insertions(+), 269 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index abad53bef1..c3fd10f287 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -583,3 +583,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
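
Note: RTE_POWER_REGISTER_OPS() is not defined in the part of the patch shown here. The following is only a minimal sketch of how constructor-based registration of this kind could work. It assumes a GCC/Clang constructor attribute and a sys/queue.h TAILQ. All demo_* and DEMO_* names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, cut-down stand-in for struct rte_power_core_ops. */
struct demo_ops {
	TAILQ_ENTRY(demo_ops) next;          /* list linkage */
	const char *name;                    /* matched against the env name */
	int (*init)(unsigned int lcore_id);
};

/* Statically initialised, so it is valid before any constructor runs. */
static TAILQ_HEAD(, demo_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops, otherwise append to the list. */
static int
demo_register_ops(struct demo_ops *ops)
{
	if (ops->name == NULL || ops->init == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Assumed shape of the macro: run the registration before main(). */
#define DEMO_REGISTER_OPS(var) \
	__attribute__((constructor)) static void demo_ctor_##var(void) \
	{ \
		(void)demo_register_ops(&var); \
	}

/* A driver file then only needs its ops table and one macro call. */
static int
demo_acpi_init(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}

static struct demo_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
};
DEMO_REGISTER_OPS(demo_acpi_ops);

/* Look a driver up by name, similar to the loop in rte_power_set_env(). */
static struct demo_ops *
demo_find_ops(const char *name)
{
	struct demo_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Usage: resolve the driver by name and call through its table. */
static int
demo_use(void)
{
	struct demo_ops *ops = demo_find_ops("acpi");

	assert(ops != NULL);
	return ops->init(0);
}
```

With this shape the library never references a driver symbol directly. A driver that is not built or not linked is simply absent from the list, which appears to be what allows the per-driver switch statements in rte_power.c to be dropped.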
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 4809d45a22..5eb3828f7a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -706,3 +706,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..17bd8e2eaf 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index e73f4520d0..5ad22e3ddd 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -691,3 +691,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 1c2a91a178..362a4de91c 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -894,3 +894,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..d6b86ea19c 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..372a9ff4f8 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,89 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* Register a driver's ops struct in core_ops_list; return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +98,65 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+
+ return global_power_core_ops;
+}
+
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
-
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..d77d285c18 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * Pointer to the selected ops struct. An environment must have been set first.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -108,10 +125,13 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..526372e0d4
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,208 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the currently selected power ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..a46dd8adbf 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
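
The inline wrappers above dereference whatever rte_power_get_core_ops()
returns. As a discussion aid only, here is a minimal standalone sketch of a
wrapper that fails soft when no driver has been selected. It is not the code
in this patch: every demo_* name is hypothetical, the ops struct is trimmed to
two callbacks, and returning -ENOTSUP is an assumed, illustrative choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_core_ops {
	int (*freq_max)(unsigned int lcore_id);
	int (*freq_min)(unsigned int lcore_id);
};

/* Selected driver; NULL until an environment has been chosen. */
static struct demo_core_ops *demo_selected_ops;

/* Select a driver (NULL deselects); a selected driver must provide freq_max. */
static void
demo_set_core_ops(struct demo_core_ops *ops)
{
	assert(ops == NULL || ops->freq_max != NULL);
	demo_selected_ops = ops;
}

static inline struct demo_core_ops *
demo_get_core_ops(void)
{
	return demo_selected_ops;
}

/* Wrapper in the style of rte_power_freq_max(), plus a NULL guard. */
static inline int
demo_freq_max(unsigned int lcore_id)
{
	struct demo_core_ops *ops = demo_get_core_ops();

	if (ops == NULL)
		return -ENOTSUP; /* assumed error code, for illustration */
	return ops->freq_max(lcore_id);
}

/* Toy driver callback: pretend the frequency was changed (returns 1). */
static int
demo_driver_freq_max(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static struct demo_core_ops demo_driver = {
	.freq_max = demo_driver_freq_max,
	.freq_min = NULL,
};
```

With no driver selected, demo_freq_max() returns a negative value instead of
crashing; after demo_set_core_ops(&demo_driver) it forwards to the driver
callback. Whether the real wrappers should pay for such a check is a design
question for the list, since they may be called often.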
* [PATCH v4 2/5] power: refactor uncore power management library
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (5 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
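
For readers new to the ops-registration pattern this patch moves to, a
minimal standalone sketch follows. It is not the DPDK code: it assumes plain
<sys/queue.h> in place of the RTE_TAILQ_* macros and a GCC/Clang constructor
attribute in place of RTE_INIT(), the ops struct is trimmed to two callbacks,
and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_power_uncore_ops (two callbacks only). */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;
	char name[24];
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete drivers, then append; 0 on success, -1 on error. */
static int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Stand-in for RTE_INIT(): a constructor that runs before main(). */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
static void __attribute__((constructor)) demo_init_##ops(void) \
{ \
	demo_register_uncore_ops(&ops); \
}

static int
demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int
demo_exit(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_driver_ops = {
	.name = "demo-uncore",
	.init = demo_init,
	.exit = demo_exit,
};
DEMO_REGISTER_UNCORE_OPS(demo_driver_ops)

/* Find a registered driver by name, or return NULL if none matches. */
static struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	assert(name != NULL);
	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, name, sizeof(ops->name)) == 0)
			return ops;
	return NULL;
}
```

The point of the design is that the library no longer names any driver: a
driver that is compiled in registers itself at load time, and the library
only walks the list and compares names.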
v4:
- fixed build error with RTE_ASSERT
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 207 ++++++---------
lib/power/rte_power_uncore.h | 87 ++++---
lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 406 insertions(+), 165 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..c049573d64 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,109 +1,62 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +66,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,15 +89,22 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
@@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +142,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..c9fba02568 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +12,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +116,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +145,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +160,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +175,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +203,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +228,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +246,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +268,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..d0bbffcbf9
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to their previous values
+ * before initialization of the API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * The caller must provide any locking needed for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations. */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+/**
+ * @internal Get the power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index a46dd8adbf..7c9aece813 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v4 3/5] test/power: removed function pointer validations
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 1/5] power: refactor core " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
` (4 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are
consistently available regardless of the operating environment. The
function pointer checks are therefore unnecessary and have been removed
from the tests and the l3fwd-power application.
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
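Note: a minimal sketch of why the NULL checks become redundant, not the
library's actual code. It assumes the public entry point is an ordinary
function that forwards to a backend ops table; every demo_* name below is
hypothetical, and the real library may handle the "no backend selected"
case differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ops table; the real one has many more members. */
struct demo_power_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* Backend selected for the current environment; NULL while none is set. */
static const struct demo_power_ops *demo_active_ops;

/* Select a backend, or pass NULL to unset it. A backend must be complete. */
void
demo_power_set_ops(const struct demo_power_ops *ops)
{
	assert(ops == NULL || ops->freq_down != NULL);
	demo_active_ops = ops;
}

/*
 * The public entry point is an ordinary function, so its address is never
 * NULL. With no backend selected it fails with -1 (an assumption of this
 * sketch; the library may handle that case differently).
 */
int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_active_ops == NULL)
		return -1;
	return demo_active_ops->freq_down(lcore_id);
}

/* Trivial backend used only to exercise the wrapper. */
static int
demo_backend_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1; /* pretend the frequency was lowered */
}

const struct demo_power_ops demo_backend_ops = {
	.freq_down = demo_backend_freq_down,
};
```

With this shape a caller can invoke demo_power_freq_down(lcore_id)
unconditionally and inspect the return value, which is the pattern the
l3fwd-power hunks in this patch move to.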
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v4 4/5] power/amd_uncore: uncore support for AMD EPYC processors
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-15 2:49 ` [PATCH v4 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (3 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
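Note: a minimal sketch of the frequency index convention this driver
follows, where index 0 is the highest frequency and index nb_freqs - 1 is
the lowest. The demo_* names are hypothetical, the E-SMI call is left out,
and the kHz unit is an assumption inferred from the magnitude of the
values in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_FREQS 8

/* Hypothetical, reduced version of the per-die bookkeeping. */
struct demo_uncore_info {
	uint32_t freqs[DEMO_MAX_FREQS]; /* assumed kHz, sorted highest first */
	uint32_t nb_freqs;              /* number of valid entries in freqs */
	uint32_t curr_idx;              /* index of the frequency last applied */
};

/* Validate the index and record it; a real driver would program hardware. */
int
demo_uncore_set_idx(struct demo_uncore_info *ui, uint32_t idx)
{
	assert(ui != NULL);
	if (idx >= DEMO_MAX_FREQS || idx >= ui->nb_freqs)
		return -1;
	ui->curr_idx = idx;
	return 0;
}

/* Highest frequency is always the first entry. */
int
demo_uncore_freq_max(struct demo_uncore_info *ui)
{
	return demo_uncore_set_idx(ui, 0);
}

/*
 * Lowest frequency is the last valid entry. If nb_freqs is 0 the unsigned
 * subtraction wraps to UINT32_MAX, which the bounds check rejects.
 */
int
demo_uncore_freq_min(struct demo_uncore_info *ui)
{
	return demo_uncore_set_idx(ui, ui->nb_freqs - 1);
}
```

Because callers only pass indexes, the table contents can differ per HSMP
protocol version without changing the set/max/min logic.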
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 576 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* Record the index of the DF P-state that was just applied. */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return RTE_POWER_INVALID_FREQ_INDEX;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * Callers must provide their own synchronization for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * Callers must provide their own synchronization for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * Callers must provide their own synchronization for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * Callers must provide their own synchronization for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
* [PATCH v4 5/5] maintainers: update for drivers/power
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-15 2:49 ` [PATCH v4 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 0/5] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 6814991735..9f14e8f8d6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1744,6 +1744,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
* [PATCH v4 0/5] power: refactor power management library
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-15 2:49 ` [PATCH v4 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-15 2:49 ` Sivaprasad Tummala
2024-10-15 3:15 ` Stephen Hemminger
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-15 2:49 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
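
For readers new to the layout, a driver living in its own directory
attaches itself to the library through a registration step that runs
before main(). The following is a minimal sketch of that pattern only,
assuming a GCC/Clang constructor attribute and a simple linked list; all
demo_* names are hypothetical and the series' own macros and list types
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced ops struct: a name plus one operation. */
struct demo_uncore_ops {
	struct demo_uncore_ops *next;
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
};

/* Head of the list of registered drivers. */
static struct demo_uncore_ops *demo_ops_head;

/* Add a driver to the list; reject incomplete ops structs. */
int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;
	ops->next = demo_ops_head;
	demo_ops_head = ops;
	return 0;
}

/* Look up a registered driver by name; NULL if it was never registered. */
struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	assert(name != NULL);
	for (ops = demo_ops_head; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Register an ops struct from a constructor, before main() runs. */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
static void __attribute__((constructor)) demo_init_##ops(void) \
{ \
	demo_register_uncore_ops(&ops); \
}

/* A driver file would contain only its ops struct and this one macro call. */
static int
demo_amd_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_amd_ops = {
	.name = "demo-amd",
	.init = demo_amd_init,
};

DEMO_REGISTER_UNCORE_OPS(demo_amd_ops)
```

The library side then needs no compile-time knowledge of individual
drivers; it only walks the list of whatever was linked in.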
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
power/amd_uncore: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 +++++----------
lib/power/rte_power.h | 139 +++++---
lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
lib/power/rte_power_uncore.c | 207 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
lib/power/version.map | 15 +
40 files changed, 1605 insertions(+), 624 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* Re: [PATCH v4 0/5] power: refactor power management library
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-15 2:49 ` [PATCH v4 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-15 3:15 ` Stephen Hemminger
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-15 3:15 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, dev
On Tue, 15 Oct 2024 02:49:53 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
>
> Sivaprasad Tummala (5):
> power: refactor core power management library
> power: refactor uncore power management library
> test/power: removed function pointer validations
> power/amd_uncore: uncore support for AMD EPYC processors
> maintainers: update for drivers/power
>
> MAINTAINERS | 1 +
> app/test/test_power.c | 95 -----
> app/test/test_power_cpufreq.c | 52 ---
> app/test/test_power_kvm_vm.c | 36 --
> drivers/meson.build | 1 +
> .../power/acpi/acpi_cpufreq.c | 22 +-
> .../power/acpi/acpi_cpufreq.h | 6 +-
> drivers/power/acpi/meson.build | 10 +
> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> .../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
> drivers/power/amd_pstate/meson.build | 10 +
> drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
> drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
> drivers/power/amd_uncore/meson.build | 20 ++
> .../power/cppc/cppc_cpufreq.c | 22 +-
> .../power/cppc/cppc_cpufreq.h | 8 +-
> drivers/power/cppc/meson.build | 10 +
> .../power/intel_uncore/intel_uncore.c | 18 +-
> .../power/intel_uncore/intel_uncore.h | 8 +-
> drivers/power/intel_uncore/meson.build | 6 +
> .../power/kvm_vm}/guest_channel.c | 0
> .../power/kvm_vm}/guest_channel.h | 0
> .../power/kvm_vm/kvm_vm.c | 22 +-
> .../power/kvm_vm/kvm_vm.h | 6 +-
> drivers/power/kvm_vm/meson.build | 16 +
> drivers/power/meson.build | 14 +
> drivers/power/pstate/meson.build | 10 +
> .../power/pstate/pstate_cpufreq.c | 22 +-
> .../power/pstate/pstate_cpufreq.h | 6 +-
> examples/l3fwd-power/main.c | 12 +-
> lib/power/meson.build | 9 +-
> lib/power/power_common.c | 2 +-
> lib/power/power_common.h | 16 +-
> lib/power/rte_power.c | 287 +++++----------
> lib/power/rte_power.h | 139 +++++---
> lib/power/rte_power_cpufreq_api.h | 208 +++++++++++
> lib/power/rte_power_uncore.c | 207 +++++------
> lib/power/rte_power_uncore.h | 87 +++--
> lib/power/rte_power_uncore_ops.h | 239 +++++++++++++
> lib/power/version.map | 15 +
> 40 files changed, 1605 insertions(+), 624 deletions(-)
> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> create mode 100644 drivers/power/acpi/meson.build
> rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> create mode 100644 drivers/power/amd_pstate/meson.build
> create mode 100644 drivers/power/amd_uncore/amd_uncore.c
> create mode 100644 drivers/power/amd_uncore/amd_uncore.h
> create mode 100644 drivers/power/amd_uncore/meson.build
> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> create mode 100644 drivers/power/cppc/meson.build
> rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
> rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
> create mode 100644 drivers/power/intel_uncore/meson.build
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> create mode 100644 drivers/power/kvm_vm/meson.build
> create mode 100644 drivers/power/meson.build
> create mode 100644 drivers/power/pstate/meson.build
> rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> create mode 100644 lib/power/rte_power_cpufreq_api.h
> create mode 100644 lib/power/rte_power_uncore_ops.h
>
This is showing some build problems.
*Build Failed #1:
OS: RHEL94-64
Target: x86_64-native-linuxapp-gcc+shared
FAILED: app/dpdk-test
gcc -o app/dpdk-test app/dpdk-test.p/test_commands.c.o app/dpdk-test.p/test_test.c.o app/dpdk-test.p/test_packet_burst_generator.c.o app/dpdk-test.p/test_sample_packet_forward.c.o app/dpdk-test.p/test_virtual_pmd.c.o app/dpdk-test.p/test_test_acl.c.o app/dpdk-test.p/test_test_alarm.c.o app/dpdk-test.p/test_test_argparse.c.o app/dpdk-test.p/test_test_atomic.c.o app/dpdk-test.p/test_test_barrier.c.o app/dpdk-test.p/test_test_bitcount.c.o app/dpdk-test.p/test_test_bitmap.c.o app/dpdk-test.p/test_test_bitops.c.o app/dpdk-test.p/test_test_bitset.c.o app/dpdk-test.p/test_test_bitratestats.c.o app/dpdk-test.p/test_test_bpf.c.o app/dpdk-test.p/test_test_byteorder.c.o app/dpdk-test.p/test_test_cksum.c.o app/dpdk-test.p/test_test_cksum_perf.c.o app/dpdk-test.p/test_test_cmdline.c.o app/dpdk-test.p/test_test_cmdline_cirbuf.c.o app/dpdk-test.p/test_test_cmdline_etheraddr.c.o app/dpdk-test.p/test_test_cmdline_ipaddr.c.o app/dpdk-test.p/test_test_cmdline_lib.c.o app/dpdk-test.p/test_test_cmdline_
num.c.o app/dpdk-test.p/test_test_cmdline_portlist.c.o app/dpdk-test.p/test_test_cmdline_string.c.o app/dpdk-test.p/test_test_common.c.o app/dpdk-test.p/test_test_compressdev.c.o app/dpdk-test.p/test_test_cpuflags.c.o app/dpdk-test.p/test_test_crc.c.o app/dpdk-test.p/test_test_cryptodev.c.o app/dpdk-test.p/test_test_cryptodev_asym.c.o app/dpdk-test.p/test_test_cryptodev_blockcipher.c.o app/dpdk-test.p/test_test_cryptodev_crosscheck.c.o app/dpdk-test.p/test_test_cryptodev_security_ipsec.c.o app/dpdk-test.p/test_test_cryptodev_security_pdcp.c.o app/dpdk-test.p/test_test_cryptodev_security_tls_record.c.o app/dpdk-test.p/test_test_cycles.c.o app/dpdk-test.p/test_test_debug.c.o app/dpdk-test.p/test_test_devargs.c.o app/dpdk-test.p/test_test_dispatcher.c.o app/dpdk-test.p/test_test_distributor.c.o app/dpdk-test.p/test_test_distributor_perf.c.o app/dpdk-test.p/test_test_dmadev.c.o app/dpdk-test.p/test_test_dmadev_api.c.o app/dpdk-test.p/test_test_eal_flags.c.o app/dpdk-test.p/test_test_eal
_fs.c.o app/dpdk-test.p/test_test_efd.c.o app/dpdk-test.p/test_test_efd_perf.c.o app/dpdk-test.p/test_test_errno.c.o app/dpdk-test.p/test_test_ethdev_api.c.o app/dpdk-test.p/test_test_ethdev_link.c.o app/dpdk-test.p/test_test_event_crypto_adapter.c.o app/dpdk-test.p/test_test_event_dma_adapter.c.o app/dpdk-test.p/test_test_event_eth_rx_adapter.c.o app/dpdk-test.p/test_test_event_eth_tx_adapter.c.o app/dpdk-test.p/test_test_event_ring.c.o app/dpdk-test.p/test_test_event_timer_adapter.c.o app/dpdk-test.p/test_test_eventdev.c.o app/dpdk-test.p/test_test_external_mem.c.o app/dpdk-test.p/test_test_fbarray.c.o app/dpdk-test.p/test_test_fib.c.o app/dpdk-test.p/test_test_fib6.c.o app/dpdk-test.p/test_test_fib6_perf.c.o app/dpdk-test.p/test_test_fib_perf.c.o app/dpdk-test.p/test_test_func_reentrancy.c.o app/dpdk-test.p/test_test_graph.c.o app/dpdk-test.p/test_test_graph_perf.c.o app/dpdk-test.p/test_test_hash.c.o app/dpdk-test.p/test_test_hash_functions.c.o app/dpdk-test.p/test_test_hash_mul
tiwriter.c.o app/dpdk-test.p/test_test_hash_perf.c.o app/dpdk-test.p/test_test_hash_readwrite.c.o app/dpdk-test.p/test_test_hash_readwrite_lf_perf.c.o app/dpdk-test.p/test_test_interrupts.c.o app/dpdk-test.p/test_test_ipfrag.c.o app/dpdk-test.p/test_test_ipsec.c.o app/dpdk-test.p/test_test_ipsec_perf.c.o app/dpdk-test.p/test_test_ipsec_sad.c.o app/dpdk-test.p/test_test_kvargs.c.o app/dpdk-test.p/test_test_latencystats.c.o app/dpdk-test.p/test_test_lcores.c.o app/dpdk-test.p/test_test_link_bonding.c.o app/dpdk-test.p/test_test_link_bonding_mode4.c.o app/dpdk-test.p/test_test_link_bonding_rssconf.c.o app/dpdk-test.p/test_test_logs.c.o app/dpdk-test.p/test_test_lpm.c.o app/dpdk-test.p/test_test_lpm6.c.o app/dpdk-test.p/test_test_lpm6_perf.c.o app/dpdk-test.p/test_test_lpm_perf.c.o app/dpdk-test.p/test_test_malloc.c.o app/dpdk-test.p/test_test_malloc_perf.c.o app/dpdk-test.p/test_test_mbuf.c.o app/dpdk-test.p/test_test_mcslock.c.o app/dpdk-test.p/test_test_member.c.o app/dpdk-test.p/tes
t_test_member_perf.c.o app/dpdk-test.p/test_test_memcpy.c.o app/dpdk-test.p/test_test_memcpy_perf.c.o app/dpdk-test.p/test_test_memory.c.o app/dpdk-test.p/test_test_mempool.c.o app/dpdk-test.p/test_test_mempool_perf.c.o app/dpdk-test.p/test_test_memzone.c.o app/dpdk-test.p/test_test_meter.c.o app/dpdk-test.p/test_test_metrics.c.o app/dpdk-test.p/test_test_mp_secondary.c.o app/dpdk-test.p/test_test_net_ether.c.o app/dpdk-test.p/test_test_pcapng.c.o app/dpdk-test.p/test_test_pdcp.c.o app/dpdk-test.p/test_test_pdump.c.o app/dpdk-test.p/test_test_per_lcore.c.o app/dpdk-test.p/test_test_pflock.c.o app/dpdk-test.p/test_test_pie.c.o app/dpdk-test.p/test_test_pmd_perf.c.o app/dpdk-test.p/test_test_pmd_ring.c.o app/dpdk-test.p/test_test_pmd_ring_perf.c.o app/dpdk-test.p/test_test_power.c.o app/dpdk-test.p/test_test_power_cpufreq.c.o app/dpdk-test.p/test_test_power_intel_uncore.c.o app/dpdk-test.p/test_test_power_kvm_vm.c.o app/dpdk-test.p/test_test_prefetch.c.o app/dpdk-test.p/test_test_ptr_
compress.c.o app/dpdk-test.p/test_test_rand_perf.c.o app/dpdk-test.p/test_test_rawdev.c.o app/dpdk-test.p/test_test_rcu_qsbr.c.o app/dpdk-test.p/test_test_rcu_qsbr_perf.c.o app/dpdk-test.p/test_test_reassembly_perf.c.o app/dpdk-test.p/test_test_reciprocal_division.c.o app/dpdk-test.p/test_test_reciprocal_division_perf.c.o app/dpdk-test.p/test_test_red.c.o app/dpdk-test.p/test_test_reorder.c.o app/dpdk-test.p/test_test_rib.c.o app/dpdk-test.p/test_test_rib6.c.o app/dpdk-test.p/test_test_ring.c.o app/dpdk-test.p/test_test_ring_hts_stress.c.o app/dpdk-test.p/test_test_ring_mpmc_stress.c.o app/dpdk-test.p/test_test_ring_mt_peek_stress.c.o app/dpdk-test.p/test_test_ring_mt_peek_stress_zc.c.o app/dpdk-test.p/test_test_ring_perf.c.o app/dpdk-test.p/test_test_ring_rts_stress.c.o app/dpdk-test.p/test_test_ring_st_peek_stress.c.o app/dpdk-test.p/test_test_ring_st_peek_stress_zc.c.o app/dpdk-test.p/test_test_ring_stress.c.o app/dpdk-test.p/test_test_rwlock.c.o app/dpdk-test.p/test_test_sched.c
.o app/dpdk-test.p/test_test_security.c.o app/dpdk-test.p/test_test_security_inline_macsec.c.o app/dpdk-test.p/test_test_security_inline_proto.c.o app/dpdk-test.p/test_test_security_proto.c.o app/dpdk-test.p/test_test_seqlock.c.o app/dpdk-test.p/test_test_service_cores.c.o app/dpdk-test.p/test_test_spinlock.c.o app/dpdk-test.p/test_test_stack.c.o app/dpdk-test.p/test_test_stack_perf.c.o app/dpdk-test.p/test_test_string_fns.c.o app/dpdk-test.p/test_test_table.c.o app/dpdk-test.p/test_test_table_acl.c.o app/dpdk-test.p/test_test_table_combined.c.o app/dpdk-test.p/test_test_table_pipeline.c.o app/dpdk-test.p/test_test_table_ports.c.o app/dpdk-test.p/test_test_table_tables.c.o app/dpdk-test.p/test_test_tailq.c.o app/dpdk-test.p/test_test_telemetry_data.c.o app/dpdk-test.p/test_test_telemetry_json.c.o app/dpdk-test.p/test_test_thash.c.o app/dpdk-test.p/test_test_thash_perf.c.o app/dpdk-test.p/test_test_threads.c.o app/dpdk-test.p/test_test_ticketlock.c.o app/dpdk-test.p/test_test_timer.c
.o app/dpdk-test.p/test_test_timer_perf.c.o app/dpdk-test.p/test_test_timer_racecond.c.o app/dpdk-test.p/test_test_timer_secondary.c.o app/dpdk-test.p/test_test_trace.c.o app/dpdk-test.p/test_test_trace_perf.c.o app/dpdk-test.p/test_test_trace_register.c.o app/dpdk-test.p/test_test_vdev.c.o app/dpdk-test.p/test_test_version.c.o -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -Wl,--no-as-needed -pthread -Wl,--start-group -lm -ldl -lnuma -lfdt '-Wl,-rpath,$ORIGIN/../lib:$ORIGIN/../drivers:XXXXXXXXXXXXXXX' -Wl,-rpath-link,/root/RHEL94-64_K5.14.0_GCC11.4.1/x86_64-native-linuxapp-gcc+shared/33487/dpdk/x86_64-native-linuxapp-gcc+shared/lib -Wl,-rpath-link,/root/RHEL94-64_K5.14.0_GCC11.4.1/x86_64-native-linuxapp-gcc+shared/33487/dpdk/x86_64-native-linuxapp-gcc+shared/drivers lib/librte_cmdline.so.25.0 lib/librte_eal.so.25.0 lib/librte_kvargs.so.25.0 lib/librte_log.so.25.0 lib/librte_telemetry.so.25.0 lib/librte_net.so.25.0 lib/librte_mbuf.so.25.0 lib/librte_mempool.so.25.0 lib/librte_ring.so.25
.0 drivers/librte_net_ring.so.25.0 lib/librte_ethdev.so.25.0 lib/librte_meter.so.25.0 drivers/librte_bus_pci.so.25.0 lib/librte_pci.so.25.0 drivers/librte_bus_vdev.so.25.0 lib/librte_acl.so.25.0 lib/librte_argparse
.so.25.0 lib/librte_hash.so.25.0 lib/librte_rcu.so.25.0 lib/librte_metrics.so.25.0 lib/librte_bitratestats.so.25.0 lib/librte_bpf.so.25.0 lib/librte_compressdev.so.25.0 lib/librte_cryptodev.so.25.0 lib/librte_security.so.25.0 lib/librte_dispatcher.so.25.0 lib/librte_eventdev.so.25.0 lib/librte_timer.so.25.0 lib/librte_dmadev.so.25.0 lib/librte_distributor.so.25.0 lib/librte_efd.so.25.0 lib/librte_fib.so.25.0 lib/librte_rib.so.25.0 lib/librte_table.so.25.0 lib/librte_port.so.25.0 lib/librte_sched.so.25.0 lib/librte_ip_frag.so.25.0 lib/librte_lpm.so.25.0 lib/librte_graph.so.25.0 lib/librte_pcapng.so.25.0 lib/librte_ipsec.so.25.0 lib/librte_latencystats.so.25.0 drivers/librte_net_bond.so.25.0 lib/librte_member.so.25.0 lib/librte_pdcp.so.25.0 lib/librte_reorder.so.25.0 lib/librte_pdump.so.25.0 lib/librte_power.so.25.0 lib/librte_rawdev.so.25.0 lib/librte_stack.so.25.0 lib/librte_pipeline.so.25.0 drivers/librte_crypto_scheduler.so.25.0 /usr/lib64/libz.so /usr/lib64/libpcap.so /usr/lib64/l
ibelf.so -Wl,--end-group
/usr/bin/ld: app/dpdk-test.p/test_test_power_cpufreq.c.o: in function `test_power_caps':
test_power_cpufreq.c:(.text+0x15): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: app/dpdk-test.p/test_test_power_cpufreq.c.o: in function `test_power_cpufreq':
test_power_cpufreq.c:(.text+0x48d): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_cpufreq.c:(.text+0x4ac): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_cpufreq.c:(.text+0x4c8): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_cpufreq.c:(.text+0x4e4): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: app/dpdk-test.p/test_test_power_cpufreq.c.o:test_power_cpufreq.c:(.text+0x516): more undefined references to `rte_power_get_core_ops' follow
/usr/bin/ld: app/dpdk-test.p/test_test_power_intel_uncore.c.o: in function `test_power_intel_uncore':
test_power_intel_uncore.c:(.text+0x19): undefined reference to `rte_power_get_uncore_ops'
/usr/bin/ld: test_power_intel_uncore.c:(.text+0x3a): undefined reference to `rte_power_get_uncore_ops'
/usr/bin/ld: test_power_intel_uncore.c:(.text+0x46): undefined reference to `rte_power_get_uncore_ops'
/usr/bin/ld: test_power_intel_uncore.c:(.text+0x61): undefined reference to `rte_power_get_uncore_ops'
/usr/bin/ld: test_power_intel_uncore.c:(.text+0x75): undefined reference to `rte_power_get_uncore_ops'
/usr/bin/ld: app/dpdk-test.p/test_test_power_intel_uncore.c.o:test_power_intel_uncore.c:(.text+0x81): more undefined references to `rte_power_get_uncore_ops' follow
/usr/bin/ld: app/dpdk-test.p/test_test_power_kvm_vm.c.o: in function `test_power_kvm_vm':
test_power_kvm_vm.c:(.text+0x54): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_kvm_vm.c:(.text+0x6a): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_kvm_vm.c:(.text+0x80): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_kvm_vm.c:(.text+0x96): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: test_power_kvm_vm.c:(.text+0xac): undefined reference to `rte_power_get_core_ops'
/usr/bin/ld: app/dpdk-test.p/test_test_power_kvm_vm.c.o:test_power_kvm_vm.c:(.text+0xc2): more undefined references to `rte_power_get_core_ops' follow
collect2: error: ld returned 1 exit status
ninja: build stopped
* [PATCH v5 0/5] power: refactor power management library
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-15 3:15 ` Stephen Hemminger
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 1/5] power: refactor core " Sivaprasad Tummala
` (7 more replies)
7 siblings, 8 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary change moves each driver
into its own directory under 'drivers/power/*' (for example
'drivers/power/acpi' and 'drivers/power/intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 +++++----------
lib/power/rte_power.h | 141 +++++---
lib/power/rte_power_cpufreq_api.h | 209 +++++++++++
lib/power/rte_power_uncore.c | 207 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 241 +++++++++++++
lib/power/version.map | 17 +
40 files changed, 1611 insertions(+), 625 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* [PATCH v5 1/5] power: refactor core power management library
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 2/5] power: refactor uncore " Sivaprasad Tummala
` (6 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. Each driver gets its own meson.build file, which
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
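For readers unfamiliar with the pattern, the driver split relies on each
driver filling in an ops table and registering it with the library, as the
RTE_POWER_REGISTER_OPS() calls in the diff below do. The following is only
a minimal sketch of that idea, not the code in this patch: every demo_*
name is hypothetical, the ops table is trimmed to three fields, and the
constructor-based registration and "first supported driver wins"
auto-detection are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down ops table. The struct rte_power_core_ops in
 * this patch carries many more callbacks (get_freq, set_freq, ...). */
struct demo_power_ops {
	const char *name;
	int (*check_env_support)(void);	/* non-zero if usable on this system */
	int (*init)(unsigned int lcore_id);
	struct demo_power_ops *next;	/* registry linkage */
};

static struct demo_power_ops *demo_ops_head;	/* all registered drivers */
static struct demo_power_ops *demo_active_ops;	/* selected driver, or NULL */

/* Add a driver's ops table to the front of the registry. */
void demo_register_ops(struct demo_power_ops *ops)
{
	assert(ops != NULL && ops->name != NULL);
	ops->next = demo_ops_head;
	demo_ops_head = ops;
}

/* Register at load time via a constructor (GCC/Clang extension). */
#define DEMO_REGISTER_OPS(sym) \
	static void __attribute__((constructor)) demo_ctor_##sym(void) \
	{ \
		demo_register_ops(&sym); \
	}

/* Select the first registered driver that reports support. */
struct demo_power_ops *demo_autodetect(void)
{
	struct demo_power_ops *ops;

	for (ops = demo_ops_head; ops != NULL; ops = ops->next) {
		if (ops->check_env_support != NULL && ops->check_env_support()) {
			demo_active_ops = ops;
			return ops;
		}
	}
	return NULL;
}

/* Callers must handle NULL: no driver may have been selected yet. */
struct demo_power_ops *demo_get_ops(void)
{
	return demo_active_ops;
}

/* Two stand-in drivers: one unsupported, one supported. */
static int demo_env_no(void) { return 0; }
static int demo_env_yes(void) { return 1; }
static int demo_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_power_ops demo_acpi_ops = {
	.name = "acpi",
	.check_env_support = demo_env_no,
	.init = demo_init,
};

static struct demo_power_ops demo_pstate_ops = {
	.name = "pstate",
	.check_env_support = demo_env_yes,
	.init = demo_init,
};

DEMO_REGISTER_OPS(demo_acpi_ops)
DEMO_REGISTER_OPS(demo_pstate_ops)
```

With this layout the library never names a driver directly: a call to
demo_autodetect() walks whatever was linked in, and a driver left out of
the build simply never registers.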
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- re-worked on auto detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 ++++++------------
lib/power/rte_power.h | 141 ++++++---
lib/power/rte_power_cpufreq_api.h | 209 +++++++++++++
lib/power/version.map | 14 +
26 files changed, 621 insertions(+), 270 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 2733306698..7ef4f581a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index abad53bef1..c3fd10f287 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -583,3 +583,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 4809d45a22..5eb3828f7a 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -706,3 +706,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 97%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..17bd8e2eaf 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index e73f4520d0..5ad22e3ddd 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -691,3 +691,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..405524ce7c
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+#
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 1c2a91a178..362a4de91c 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -894,3 +894,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..d6b86ea19c 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index 590986d5ef..6c06411e8b 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -12,7 +12,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 83f742f42a..767686ee12 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,13 +24,24 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..372a9ff4f8 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,89 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
+ goto out;
}
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
- }
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,94 +98,65 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum power_management_env
-rte_power_get_env(void) {
+rte_power_get_env(void)
+{
return global_default_env;
}
+struct rte_power_core_ops *
+rte_power_get_core_ops(void)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+
+ return global_power_core_ops;
+}
+
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
-
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
- }
- return -1;
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
+ return -1;
}
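
The registration scheme that replaces the per-environment switch statements
in rte_power.c can be illustrated outside DPDK. The following is only a
minimal sketch: every demo_* name is hypothetical, the ops struct is trimmed
to two callbacks, and plain <sys/queue.h> stands in for the RTE_TAILQ
wrappers. It shows the two steps the patch relies on, namely rejecting an
ops table with missing callbacks and then looking a driver up by name.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24 /* mirrors RTE_POWER_DRIVER_NAMESZ in the patch */

/* Hypothetical, trimmed-down stand-in for struct rte_power_core_ops. */
struct demo_power_ops {
	TAILQ_ENTRY(demo_power_ops) next;
	char name[DEMO_NAMESZ];
	int (*init)(unsigned int lcore_id);
	int (*check_env_support)(void);
};

TAILQ_HEAD(demo_ops_list, demo_power_ops);
static struct demo_ops_list demo_ops = TAILQ_HEAD_INITIALIZER(demo_ops);

/* Reject incomplete ops tables, otherwise append to the list. */
static int
demo_register_ops(struct demo_power_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->check_env_support == NULL)
		return -EINVAL;

	TAILQ_INSERT_TAIL(&demo_ops, ops, next);
	return 0;
}

/* Look up a registered driver by name; returns NULL if it is absent. */
static struct demo_power_ops *
demo_find_ops(const char *name)
{
	struct demo_power_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;

	return NULL;
}

/* Dummy callbacks so that a complete ops table can be built. */
static int demo_init_ok(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_supported(void) { return 1; }

static struct demo_power_ops demo_acpi = {
	.name = "acpi",
	.init = demo_init_ok,
	.check_env_support = demo_supported,
};

/* Deliberately incomplete: no callbacks are set. */
static struct demo_power_ops demo_broken = { .name = "broken" };
```

With these definitions, registering demo_acpi succeeds, registering
demo_broken returns -EINVAL, and a lookup for a name that was never
registered yields NULL. In the patch, the last case is what makes
rte_power_check_env_supported() return 0 when the matching driver was not
built in.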
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..a2f4bebfc7 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -66,6 +74,15 @@ void rte_power_unset_env(void);
*/
enum power_management_env rte_power_get_env(void);
+/**
+ * @internal Get the ops struct of the currently selected power environment.
+ *
+ * @return
+ * The pointer to the ops struct selected by rte_power_set_env().
+ */
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
/**
* Initialize power management for a specific lcore. If rte_power_set_env() has
* not been called then an auto-detect of the environment will start and
@@ -102,16 +119,19 @@ int rte_power_exit(unsigned int lcore_id);
* lcore id.
* @param freqs
* The buffer array to save the frequencies.
- * @param num
+ * @param n
* The number of frequencies to get.
*
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
+static inline uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_freqs_t rte_power_freqs;
+ return ops->get_avail_freqs(lcore_id, freqs, n);
+}
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +144,13 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+static inline uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_freq_t rte_power_get_freq;
+ return ops->get_freq(lcore_id);
+}
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,82 +168,101 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+static inline int
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
- *
- * @param lcore_id
- * lcore id.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+ return ops->set_freq(lcore_id, index);
+}
/**
* Scale up the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_up;
+static inline int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_up(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+static inline int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_down(lcore_id);
+}
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+static inline int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_max(lcore_id);
+}
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+static inline int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->freq_min(lcore_id);
+}
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+static inline int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->turbo_status(lcore_id);
+}
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+static inline int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
+
+ return ops->enable_turbo(lcore_id);
+}
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+static inline int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+ return ops->disable_turbo(lcore_id);
+}
/**
* Returns power capabilities for a specific lcore.
@@ -235,10 +278,14 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
- struct rte_power_core_capabilities *caps);
+static inline int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ struct rte_power_core_ops *ops = rte_power_get_core_ops();
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
+ return ops->get_caps(lcore_id, caps);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..becaff8002
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,209 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
+ uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the currently selected power ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+__rte_internal
+struct rte_power_core_ops *
+rte_power_get_core_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..a46dd8adbf 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,18 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
* [PATCH v5 2/5] power: refactor uncore power management library
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (5 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
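The registration scheme described above can be sketched outside DPDK as
follows. This is only an illustration under stated assumptions, not the code
in this patch: every demo_* and fake_* name is hypothetical, <sys/queue.h>
stands in for RTE_TAILQ, a GCC/Clang constructor attribute stands in for
RTE_INIT, and the ops struct is cut down to two callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, cut-down stand-in for struct rte_power_uncore_ops. */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;	/* next driver in the list */
	char name[DEMO_NAMESZ];			/* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);
static struct demo_uncore_ops *demo_selected_ops;

/* Reject drivers with missing callbacks, then append to the list. */
int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Select a registered driver by name; -1 if no driver matches. */
int
demo_select_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next) {
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0) {
			demo_selected_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Return the selected driver; a driver must have been selected first. */
struct demo_uncore_ops *
demo_get_uncore_ops(void)
{
	assert(demo_selected_ops != NULL);
	return demo_selected_ops;
}

/* A fake driver, defined in its own file in a real tree. */
static int
fake_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int
fake_exit(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops fake_uncore_ops = {
	.name = "fake-uncore",
	.init = fake_init,
	.exit = fake_exit,
};

/* Runs before main(), so the library never names the driver directly. */
__attribute__((constructor))
static void
fake_uncore_register(void)
{
	demo_register_uncore_ops(&fake_uncore_ops);
}
```

With this layout, a caller would run demo_select_uncore_ops("fake-uncore")
and then dispatch through demo_get_uncore_ops()->init(pkg, die). Adding a
second driver only requires another ops struct and constructor; the lookup
code stays unchanged.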
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 207 ++++++---------
lib/power/rte_power_uncore.h | 87 ++++---
lib/power/rte_power_uncore_ops.h | 241 ++++++++++++++++++
lib/power/version.map | 3 +
9 files changed, 410 insertions(+), 165 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..c049573d64 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -1,109 +1,62 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <errno.h>
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +66,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,15 +89,22 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
+}
+
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+
+ return global_uncore_ops;
}
int
@@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +142,12 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..c9fba02568 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
* Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +12,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +116,13 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+static inline uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+ return ops->get_freq(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,26 +145,13 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+static inline int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-/**
- * Function pointer definition for generic frequency change functions.
- *
- * @param pkg
- * Package number.
- * Each physical CPU in a system is referred to as a package.
- * @param die
- * Die number.
- * Each package can have several dies connected together via the uncore mesh.
- *
- * @return
- * - 1 on success with frequency changed.
- * - 0 on success without frequency changed.
- * - Negative on error.
- */
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+ return ops->set_freq(pkg, die, index);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -169,7 +160,13 @@ typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+static inline int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_max(pkg, die);
+}
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -178,7 +175,13 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
*
* This function should NOT be called in the fast path.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+static inline int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
+
+ return ops->freq_min(pkg, die);
+}
/**
* Return the list of available frequencies in the index array.
@@ -200,10 +203,14 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
- uint32_t *freqs, uint32_t num);
+static inline int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
+ return ops->get_avail_freqs(pkg, die, freqs, num);
+}
/**
* Return the list length of available frequencies in the index array.
@@ -221,9 +228,13 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+static inline int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+ return ops->get_num_freqs(pkg, die);
+}
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +246,13 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+static inline unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+ return ops->get_num_pkgs();
+}
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +268,13 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+static inline unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ struct rte_power_uncore_ops *ops = rte_power_get_uncore_ops();
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+ return ops->get_num_dies(pkg);
+}
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..17dc0fc9fb
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,241 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the package (CPU) specified
+ * from parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations. */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0 on success.
+ * - -1 if a mandatory callback is missing from the ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+/**
+ * @internal Get the currently selected power uncore ops struct.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+__rte_internal
+struct rte_power_uncore_ops *
+rte_power_get_uncore_ops(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index a46dd8adbf..920364f3eb 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,9 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
+ rte_power_get_core_ops;
+ rte_power_get_uncore_ops;
cpufreq_check_scaling_driver;
power_set_governor;
open_core_sysfs_file;
--
2.34.1
* [PATCH v5 3/5] test/power: removed function pointer validations
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 1/5] power: refactor core " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (4 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are
consistently supported regardless of the operating environment, so the
function pointer checks are unnecessary and are removed from applications.
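To illustrate why such checks can be dropped, here is a minimal sketch of one way a library can expose always-callable wrappers that dispatch through an ops table. It is not the implementation from this series: every `demo_*` name is hypothetical, and returning -ENOTSUP when no environment is set is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; the names are illustrative, not the DPDK API. */
struct demo_power_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* NULL until an environment has been selected. */
static const struct demo_power_ops *demo_active_ops;

/* Example driver callback: pretend the frequency moved down one step. */
static int
demo_acpi_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static const struct demo_power_ops demo_acpi_ops = {
	.freq_down = demo_acpi_freq_down,
};

static void
demo_set_env(const struct demo_power_ops *ops)
{
	demo_active_ops = ops;
}

/* Always callable: reports an error code instead of being a NULL pointer. */
static int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_active_ops == NULL || demo_active_ops->freq_down == NULL)
		return -ENOTSUP;
	return demo_active_ops->freq_down(lcore_id);
}

/* Usage: the caller checks the return value, never the symbol itself. */
static void
demo_env_example(void)
{
	assert(demo_power_freq_down(0) == -ENOTSUP); /* no environment yet */
	demo_set_env(&demo_acpi_ops);
	assert(demo_power_freq_down(0) == 1);
	demo_set_env(NULL);
}
```

With this shape, an application such as l3fwd-power can call the wrapper unconditionally and, if it cares, inspect the return code.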
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index 619b2811c6..8cb67e662c 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -519,58 +519,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v5 4/5] drivers/power: uncore support for AMD EPYC processors
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-17 10:26 ` [PATCH v5 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (3 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
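The driver in this patch keeps a small per-die table of frequencies ordered from highest to lowest, so index 0 selects the maximum and the last index selects the minimum. The sketch below models only that indexing rule; the `demo_*` names are hypothetical, the E-SMI call is left out, and the sample values are the ones the patch uses for HSMP protocol version 5.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_FREQS 8

/* Hypothetical per-die state, loosely modelled on the driver's table. */
struct demo_uncore_info {
	uint32_t freqs[DEMO_MAX_FREQS]; /* sorted from highest to lowest */
	uint32_t nb_freqs;              /* number of valid entries */
	uint32_t curr_idx;              /* index of the selected entry */
};

/* Select a frequency by index; returns 0 on success, -1 if out of range. */
static int
demo_set_freq(struct demo_uncore_info *ui, uint32_t idx)
{
	if (idx >= DEMO_MAX_FREQS || idx >= ui->nb_freqs)
		return -1;
	ui->curr_idx = idx;
	return 0;
}

/* Index 0 always holds the highest frequency. */
static int
demo_freq_max(struct demo_uncore_info *ui)
{
	return demo_set_freq(ui, 0);
}

/* The last valid index holds the lowest frequency. */
static int
demo_freq_min(struct demo_uncore_info *ui)
{
	/* Guard against an empty table, where nb_freqs - 1 would wrap around. */
	if (ui->nb_freqs == 0)
		return -1;
	return demo_set_freq(ui, ui->nb_freqs - 1);
}

/* Usage with the three-entry table from the patch. */
static void
demo_uncore_example(void)
{
	struct demo_uncore_info ui = {
		.freqs = { 1800000, 1440000, 1200000 },
		.nb_freqs = 3,
	};

	assert(demo_freq_min(&ui) == 0 && ui.freqs[ui.curr_idx] == 1200000);
	assert(demo_freq_max(&ui) == 0 && ui.freqs[ui.curr_idx] == 1800000);
	assert(demo_set_freq(&ui, 3) == -1); /* one past the end is rejected */
}
```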
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 576 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the DF P-state index that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return RTE_POWER_INVALID_FREQ_INDEX;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for a specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore the uncore min and max values to the values
+ * they had before initialization of the API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ unsigned int *freqs, unsigned int num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * The driver currently reports one die per package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v5 5/5] maintainers: update for drivers/power
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-17 10:26 ` [PATCH v5 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 0/5] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 6814991735..9f14e8f8d6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1744,6 +1744,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v5 0/5] power: refactor power management library
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-17 10:26 ` [PATCH v5 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-17 10:26 ` Sivaprasad Tummala
2024-10-17 16:17 ` Stephen Hemminger
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-17 10:26 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver under
'drivers/power/*' (for example 'drivers/power/acpi' and
'drivers/power/intel_uncore').
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
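As a rough illustration of the driver model this series moves to, the sketch below registers an ops structure from a constructor, in the spirit of the RTE_POWER_REGISTER_UNCORE_OPS macro built on RTE_INIT. It is a simplified stand-in, not the library code: all `demo_*` names, the fixed-size table, and the lookup by name are assumptions, and `__attribute__((constructor))` requires GCC or Clang.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_OPS 4

/* Hypothetical, trimmed-down ops structure. */
struct demo_uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
};

static struct demo_uncore_ops *demo_ops_table[DEMO_MAX_OPS];
static int demo_nb_ops;

/* Returns the table index on success, -EINVAL on bad input or a full table. */
static int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL ||
			demo_nb_ops >= DEMO_MAX_OPS)
		return -EINVAL;
	demo_ops_table[demo_nb_ops] = ops;
	return demo_nb_ops++;
}

/* Run the registration before main(), as a constructor. */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
static void __attribute__((constructor)) demo_init_##ops(void) \
{ \
	demo_register_uncore_ops(&ops); \
}

/* A driver only has to define its callbacks and register them. */
static int
demo_amd_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_amd_ops = {
	.name = "demo-amd",
	.init = demo_amd_init,
};

DEMO_REGISTER_UNCORE_OPS(demo_amd_ops)

/* The library side looks a driver up by name; NULL if none matches. */
static struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	for (int i = 0; i < demo_nb_ops; i++)
		if (strcmp(demo_ops_table[i]->name, name) == 0)
			return demo_ops_table[i];
	return NULL;
}

/* Usage: by the time this runs, the constructor has already registered. */
static void
demo_lookup_example(void)
{
	struct demo_uncore_ops *ops = demo_find_uncore_ops("demo-amd");

	assert(ops != NULL && ops->init(0, 0) == 0);
}
```

The appeal of this layout is that adding a driver touches only the driver's own directory; the library never needs a compile-time list of back ends.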
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 287 +++++----------
lib/power/rte_power.h | 141 +++++---
lib/power/rte_power_cpufreq_api.h | 209 +++++++++++
lib/power/rte_power_uncore.c | 207 +++++------
lib/power/rte_power_uncore.h | 87 +++--
lib/power/rte_power_uncore_ops.h | 241 +++++++++++++
lib/power/version.map | 17 +
40 files changed, 1611 insertions(+), 625 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v5 0/5] power: refactor power management library
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-17 10:26 ` [PATCH v5 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-17 16:17 ` Stephen Hemminger
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-17 16:17 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, dev
On Thu, 17 Oct 2024 10:26:44 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
Does not build.
*Build Failed #2:
OS: RHEL94-64
Target: x86_64-native-linuxapp-gcc
FAILED: examples/dpdk-distributor.p/distributor_main.c.o
gcc -Iexamples/dpdk-distributor.p -Iexamples -I../examples -Iexamples/distributor -I../examples/distributor -I../examples/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/ethdev -I../lib/ethdev -Ilib/meter -I../lib/meter -Ilib/cmdline -I../lib/cmdline -Ilib/distributor -I../lib/distributor -Ilib/power -I../lib/power -Ilib/timer -I../lib/timer -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O3 -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-zero-length-bounds -D_GNU_SOURCE -march=native -mrtm -Wno-format-truncation -DALLOW_EXPERIMENTAL_API -MD -MQ examples/dpdk-distributor.p/distributor_main.c.o -MF examples/dpdk-distributor.p/distributor_main.c.o.d -o examples/dpdk-distributor.p/distributor_main.c.o -c ../examples/distributor/main.c
In file included from ../examples/distributor/main.c:20:
In function ‘rte_power_get_capabilities’,
inlined from ‘main’ at ../examples/distributor/main.c:888:4:
../lib/power/rte_power.h:285:42: error: call to ‘rte_power_get_core_ops’ declared with attribute error: Symbol is not public ABI
285 | struct rte_power_core_ops *ops = rte_power_get_core_ops();
| ^~~~~~~~~~~~~~~~~~~~~~~~
[2962/3118] Compiling C object examples/dpdk-fips_validation.p/fips_validation_fips_validation_hmac.c.o
[2963/3118] Compiling C object examples/dpdk-bbdev_app.p/bbdev_app_main.c.o
[2964/3118] Compiling C object examples/dpdk-fips_validation.p/fips_validation_fips_validation_xts.c.o
[2965/3118] Compiling C object examples/dpdk-fips_validation.p/fips_validation_fips_validation_sha.c.o
[2966/3118] Linking target examples/dpdk-bond
[2967/3118] Compiling C object examples/dpdk-fips_validation.p/fips_validation_main.c.o
[2968/3118] Compiling C object app/dpdk-test.p/test_test_ring_perf.c.o
[2969/3118] Compiling C object app/dpdk-test.p/test_test_trace_perf.c.o
[2970/3118] Compiling C object app/dpdk-test.p/test_test_ring.c.o
ninja: build stopped
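For context on the error above: the message has the form GCC prints when a call to a function declared with __attribute__((error(...))) survives into generated code. Here the inline rte_power_get_capabilities() in the public header calls rte_power_get_core_ops(), so the call is compiled into the application. The sketch below is a minimal stand-alone illustration of that mechanism, assuming the internal-API marker expands to such an attribute unless an opt-in macro is defined. All DEMO_* and demo_* names are hypothetical stand-ins, not DPDK symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical opt-in switch. With this line removed, GCC is expected to
 * reject the call inside demo_get_caps() with the attribute's message.
 */
#define DEMO_ALLOW_INTERNAL_API

#if defined(__GNUC__) && !defined(__clang__) && !defined(DEMO_ALLOW_INTERNAL_API)
#define DEMO_INTERNAL __attribute__((error("Symbol is not public ABI")))
#else
#define DEMO_INTERNAL
#endif

/* Reduced ops table with a single callback, for illustration only. */
struct demo_pub_ops {
	int (*get_caps)(unsigned int lcore_id);
};

static int demo_caps_cb(unsigned int lcore_id)
{
	return (int)lcore_id + 1;
}

static struct demo_pub_ops demo_ops = { .get_caps = demo_caps_cb };

/* Accessor marked internal: applications are not meant to call it. */
DEMO_INTERNAL struct demo_pub_ops *demo_get_ops(void);

struct demo_pub_ops *demo_get_ops(void)
{
	return &demo_ops;
}

/*
 * Public wrapper written as a static inline, as it would be in a header.
 * Because it is inline, the call to the internal accessor ends up in the
 * caller's object file, which is where the attribute check applies.
 */
static inline int demo_get_caps(unsigned int lcore_id)
{
	struct demo_pub_ops *ops = demo_get_ops();

	assert(ops != NULL && ops->get_caps != NULL);
	return ops->get_caps(lcore_id);
}
```

Under these assumptions there are two ways out: export the accessor as a regular (or experimental) symbol, or move the wrapper out of the header so the internal call stays inside the library.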
* [PATCH v6 0/5] power: refactor power management library
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-17 16:17 ` Stephen Hemminger
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 1/5] power: refactor core " Sivaprasad Tummala
` (6 more replies)
7 siblings, 7 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver under
'drivers/power/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 +++++++++++
drivers/power/amd_uncore/meson.build | 20 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/rte_power_uncore.c | 253 +++++++------
lib/power/rte_power_uncore.h | 61 ++-
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++
lib/power/version.map | 16 +
40 files changed, 1664 insertions(+), 622 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* [PATCH v6 1/5] power: refactor core power management library
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 2/5] power: refactor uncore " Sivaprasad Tummala
` (5 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories under
'drivers/power/*'. The adjusted meson.build files allow individual
drivers to be enabled or disabled selectively.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
v6:
- fixed compilation error with symbol export in API
- exported power_get_lcore_mapped_cpu_id as internal API to be
used in drivers/power/*
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- reworked the auto-detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/version.map | 15 +
26 files changed, 665 insertions(+), 269 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
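Reader's note on the registration pattern used in this patch: each driver fills in a struct rte_power_core_ops and hands it to RTE_POWER_REGISTER_OPS(), and the library later picks a driver by comparing names while walking a tail queue. The stand-alone sketch below illustrates that pattern without any DPDK headers. It assumes the registration macro expands to a constructor function that runs before main(). That definition is not shown in this part of the thread, so treat it as a guess. All demo_* and DEMO_* names are hypothetical, and the ops table is cut down to two callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24 /* hypothetical bound on driver name length */

/* Reduced ops table; the struct in the patch carries many more callbacks. */
struct demo_core_ops {
	TAILQ_ENTRY(demo_core_ops) next;
	char name[DEMO_NAMESZ];
	int (*check_env_support)(void);
	int (*init)(unsigned int lcore_id);
};

static TAILQ_HEAD(, demo_core_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete tables, then append the driver to the global list. */
static int demo_register_ops(struct demo_core_ops *ops)
{
	if (ops == NULL || ops->check_env_support == NULL || ops->init == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Assumed shape of a registration macro: a constructor run at load time. */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) demo_ctor_##ops(void) \
	{ \
		int rc = demo_register_ops(&ops); \
		assert(rc == 0); \
		(void)rc; \
	}

/* Look a driver up by name, as the environment check in the patch does. */
static struct demo_core_ops *demo_find_ops(const char *name)
{
	struct demo_core_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;
	return NULL;
}

static int demo_acpi_supported(void)
{
	return 1;
}

static int demo_acpi_init(unsigned int lcore_id)
{
	(void)lcore_id;
	return 0;
}

static struct demo_core_ops demo_acpi_ops = {
	.name = "acpi",
	.check_env_support = demo_acpi_supported,
	.init = demo_acpi_init,
};

DEMO_REGISTER_OPS(demo_acpi_ops)
```

With this layout the library no longer needs a switch statement per environment. A driver that is not built simply never registers, and a lookup for its name returns NULL. The constructor attribute is a GCC/Clang extension, so the sketch is limited to those compilers.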
diff --git a/drivers/meson.build b/drivers/meson.build
index 2733306698..7ef4f581a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index ae809fbb60..974fbb7ba8 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -587,3 +587,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 2b728eca18..8b93226281 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -710,3 +710,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 96%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..d7188fcdac 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
@@ -216,4 +216,4 @@ int power_amd_pstate_disable_turbo(unsigned int lcore_id);
int power_amd_pstate_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_AMD_PSTATET_CPUFREQ_H */
+#endif /* _AMD_PSTATE_CPUFREQ_H */
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index cc9305bdfe..8ca84c4b49 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -695,3 +695,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..fe11179ab3
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 4755909466..09a11f7c37 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -898,3 +898,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9a4a592caf 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index b47c63a5f1..e482f71c64 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -13,7 +13,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 82fb94d0c0..c294f561bb 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,14 +24,27 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
+
+__rte_internal
int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..416f0148a3 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,88 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
-
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
- }
-
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ goto out;
}
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,7 +97,7 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -176,82 +109,122 @@ rte_power_get_env(void) {
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
-
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
+
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
- }
return -1;
+}
+
+uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
+}
+
+uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_freq(lcore_id);
+}
+
+uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->set_freq(lcore_id, index);
+}
+
+int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_up(lcore_id);
+}
+
+int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_down(lcore_id);
+}
+
+int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_max(lcore_id);
+}
+
+int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_min(lcore_id);
+}
+int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->turbo_status(lcore_id);
+}
+
+int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->enable_turbo(lcore_id);
+}
+
+int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->disable_turbo(lcore_id);
+}
+
+int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_caps(lcore_id, caps);
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..e9a72b92ad 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -108,10 +116,7 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
-
-extern rte_power_freqs_t rte_power_freqs;
+uint32_t rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num);
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +129,7 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
-
-extern rte_power_get_freq_t rte_power_get_freq;
+uint32_t rte_power_get_freq(unsigned int lcore_id);
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,13 +147,12 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+uint32_t rte_power_set_freq(unsigned int lcore_id, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * Review each environments specific documentation for usage.
*
* @param lcore_id
* lcore id.
@@ -160,66 +162,92 @@ extern rte_power_set_freq_t rte_power_set_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
-
-/**
- * Scale up the frequency of a specific lcore according to the available
- * frequencies.
- * Review each environments specific documentation for usage.
- */
-extern rte_power_freq_change_t rte_power_freq_up;
+int rte_power_freq_up(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+int rte_power_freq_down(unsigned int lcore_id);
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+int rte_power_freq_max(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+int rte_power_freq_min(unsigned int lcore_id);
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 turbo boost enabled.
+ * - 0 turbo boost disabled.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+int rte_power_turbo_status(unsigned int lcore_id);
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+int rte_power_freq_enable_turbo(unsigned int lcore_id);
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
-
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+int rte_power_freq_disable_turbo(unsigned int lcore_id);
/**
* Returns power capabilities for a specific lcore.
@@ -235,11 +263,9 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+int rte_power_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
-
#ifdef __cplusplus
}
#endif
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..31fea941bf
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,206 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the power ops struct from its index.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..016e599e90 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,19 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_get_lcore_mapped_cpu_id;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v6 2/5] power: refactor uncore power management library
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 23:25 ` Stephen Hemminger
2024-10-20 23:28 ` Stephen Hemminger
2024-10-20 9:22 ` [PATCH v6 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (4 subsequent siblings)
6 siblings, 2 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
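For readers new to the series, the registration scheme can be sketched
outside DPDK roughly as below. This is only an illustration under stated
assumptions: every demo_* name is hypothetical, the ops table is trimmed
to two callbacks, plain <sys/queue.h> macros stand in for the RTE_TAILQ
wrappers, and a GCC/Clang constructor attribute stands in for RTE_INIT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, trimmed-down stand-in for struct rte_power_uncore_ops. */
struct demo_ops {
	TAILQ_ENTRY(demo_ops) next;	/* list linkage */
	char name[DEMO_NAMESZ];		/* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

static TAILQ_HEAD(, demo_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops tables, then append the table to the list. */
static int
demo_register_ops(struct demo_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -EINVAL;

	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Return the registered ops whose name matches, or NULL if none does. */
static struct demo_ops *
demo_find_ops(const char *name)
{
	struct demo_ops *ops;

	assert(name != NULL);
	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;

	return NULL;
}

/* A hypothetical driver: its callbacks do nothing and report success. */
static int
demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int
demo_exit(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_ops demo_uncore_ops = {
	.name = "demo-uncore",
	.init = demo_init,
	.exit = demo_exit,
};

/* Runs before main(), so the driver is listed before any lookup. */
__attribute__((constructor))
static void
demo_uncore_register(void)
{
	demo_register_ops(&demo_uncore_ops);
}
```

With this shape the library never names a driver directly: the
environment is selected by matching its string name against whatever
drivers were linked in, so a disabled driver simply never shows up in
the list.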
v6:
- fixed compilation error with symbol export in API
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 253 +++++++++---------
lib/power/rte_power_uncore.h | 61 ++---
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 419 insertions(+), 163 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..ffee28f9b3 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef _INTEL_UNCORE_H
+#define _INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* _INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 9a4a592caf..d435197cef 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..f11238cc34 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -10,100 +10,53 @@
#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +66,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,43 +89,43 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +133,69 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
+
+uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_freq(pkg, die);
+}
+
+int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->set_freq(pkg, die, index);
+}
+
+int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_max(pkg, die);
+}
+
+int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_min(pkg, die);
+}
+
+int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_avail_freqs(pkg, die, freqs, num);
+}
+
+int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_freqs(pkg, die);
+}
+
+unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_pkgs();
+}
+
+unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_dies(pkg);
+}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..ae22be5c52 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +11,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +115,7 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+uint32_t rte_power_get_uncore_freq(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,12 +138,14 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+int rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions.
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
*
* @param pkg
* Package number.
@@ -160,16 +159,7 @@ extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
-
-/**
- * Set minimum and maximum uncore frequency for specified die on a package
- * to maximum value according to the available frequencies.
- * It should be protected outside of this function for threadsafe.
- *
- * This function should NOT be called in the fast path.
- */
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+int rte_power_uncore_freq_max(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -177,8 +167,20 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
* It should be protected outside of this function for threadsafe.
*
* This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+int rte_power_uncore_freq_min(unsigned int pkg, unsigned int die);
/**
* Return the list of available frequencies in the index array.
@@ -200,11 +202,10 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+__rte_experimental
+int rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
uint32_t *freqs, uint32_t num);
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
-
/**
* Return the list length of available frequencies in the index array.
*
@@ -221,9 +222,7 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+int rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +234,7 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+unsigned int rte_power_uncore_get_num_pkgs(void);
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +250,7 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
-
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+unsigned int rte_power_uncore_get_num_dies(unsigned int pkg);
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..f91994d3c1
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,230 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of package on system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for packages (CPUs) specified
+ * from parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations. */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index 016e599e90..d9dd4145b7 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_get_lcore_mapped_cpu_id;
power_set_governor;
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v6 3/5] test/power: removed function pointer validations
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 1/5] power: refactor core " Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (3 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment, so
function pointer checks are unnecessary and have been removed from the
tests and applications.
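
For readers following the series, the reason the NULL checks can go is
that the public entry points become ordinary functions that dispatch
through a registered ops table. The following is a minimal standalone
sketch of that pattern, not the library code: every demo_* and fake_*
name is hypothetical, it uses the BSD <sys/queue.h> macros rather than
the RTE_TAILQ wrappers, and it returns -1 when no driver is selected,
whereas the uncore wrappers in this series use RTE_ASSERT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for a power ops structure. */
struct demo_power_ops {
	TAILQ_ENTRY(demo_power_ops) next;	/* linkage in the driver list */
	const char *name;			/* driver name used for lookup */
	int (*freq_down)(unsigned int lcore_id);
};

TAILQ_HEAD(demo_ops_list, demo_power_ops);
static struct demo_ops_list demo_ops = TAILQ_HEAD_INITIALIZER(demo_ops);
static struct demo_power_ops *demo_active;	/* NULL until an env is set */

/* Drivers add themselves to the list (from a constructor in the series). */
static void
demo_register_ops(struct demo_power_ops *ops)
{
	assert(ops != NULL && ops->name != NULL);
	TAILQ_INSERT_TAIL(&demo_ops, ops, next);
}

/* Select a registered driver by name: 0 on success, -1 if unknown. */
static int
demo_set_env(const char *name)
{
	struct demo_power_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops, next) {
		if (strcmp(ops->name, name) == 0) {
			demo_active = ops;
			return 0;
		}
	}
	return -1;
}

/* Public entry point: a real function, so callers never test it for NULL. */
static int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_active == NULL)
		return -1;
	return demo_active->freq_down(lcore_id);
}

/* Fake driver used only to exercise the sketch. */
static int
fake_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;	/* "frequency changed" */
}

static struct demo_power_ops fake_ops = {
	.name = "fake_cpufreq",
	.freq_down = fake_freq_down,
};
```

With this shape, a caller such as l3fwd-power can call the wrapper
unconditionally and act on the return value; the only state that can be
missing is the selected driver, and that is handled inside the wrapper.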
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index edbd34424e..f4522747d5 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -534,58 +534,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v6 4/5] drivers/power: uncore support for AMD EPYC processors
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-20 9:22 ` [PATCH v6 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (2 subsequent siblings)
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 576 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* record the newly selected frequency index */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indices in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ unsigned int *freqs, unsigned int num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indices in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the ESMI library (esmi_number_of_sockets_get()).
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * The current implementation reports one die per valid package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
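The index-based API in the AMD uncore patch above treats index 0 as the highest frequency and nb_freqs - 1 as the lowest. The following is only a minimal standalone sketch of that convention, not code from the patch: the struct, the `sketch_*` helpers and `MAX_UNCORE_FREQS` are hypothetical names, the table values are the ones the patch lists for HSMP protocol version 5, and the 1/0/negative return convention follows the header documentation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_UNCORE_FREQS 8 /* hypothetical stand-in for RTE_MAX_UNCORE_FREQS */

/* Hypothetical, trimmed-down version of the driver's per-die state. */
struct uncore_info_sketch {
	uint32_t freqs[MAX_UNCORE_FREQS]; /* sorted from highest to lowest */
	uint32_t nb_freqs;
	uint32_t curr_idx;
};

/* Fill the table with the three values the patch lists for protocol version 5. */
static void
sketch_init(struct uncore_info_sketch *ui)
{
	static const uint32_t table[] = { 1800000, 1440000, 1200000 };

	memcpy(ui->freqs, table, sizeof(table));
	ui->nb_freqs = sizeof(table) / sizeof(table[0]);
	ui->curr_idx = 0;
}

/* Select by index: -1 if out of range, 0 if unchanged, 1 if changed. */
static int
sketch_set_index(struct uncore_info_sketch *ui, uint32_t idx)
{
	if (idx >= ui->nb_freqs)
		return -1;
	if (idx == ui->curr_idx)
		return 0;
	ui->curr_idx = idx;
	return 1;
}

/* Copy the table out; returns the entry count, or 0 if the buffer is unusable. */
static uint32_t
sketch_get_freqs(const struct uncore_info_sketch *ui, uint32_t *out, uint32_t num)
{
	if (out == NULL || num < ui->nb_freqs)
		return 0;
	memcpy(out, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
	return ui->nb_freqs;
}
```

With this layout, "set to max" is `sketch_set_index(&ui, 0)` and "set to min" is `sketch_set_index(&ui, ui.nb_freqs - 1)`, which mirrors how the freq_max and freq_min callbacks in the patch pick their index.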
* [PATCH v6 5/5] maintainers: update for drivers/power
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-20 9:22 ` [PATCH v6 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 6ea7850093..7e29931be9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1744,6 +1744,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
* [PATCH v6 0/5] power: refactor power management library
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-20 9:22 ` [PATCH v6 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-20 9:22 ` Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-20 9:22 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/*', such as 'drivers/power/acpi' and
'drivers/power/intel_uncore'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 +++++++++++
drivers/power/amd_uncore/meson.build | 20 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/rte_power_uncore.c | 253 +++++++------
lib/power/rte_power_uncore.h | 61 ++-
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++
lib/power/version.map | 16 +
40 files changed, 1664 insertions(+), 622 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* Re: [PATCH v6 2/5] power: refactor uncore power management library
2024-10-20 9:22 ` [PATCH v6 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-20 23:25 ` Stephen Hemminger
2024-10-20 23:28 ` Stephen Hemminger
1 sibling, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-20 23:25 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Sun, 20 Oct 2024 09:22:29 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
> index 48c75a5da0..f11238cc34 100644
> --- a/lib/power/rte_power_uncore.c
> +++ b/lib/power/rte_power_uncore.c
> @@ -10,100 +10,53 @@
>
> #include "power_common.h"
> #include "rte_power_uncore.h"
> -#include "power_intel_uncore.h"
>
> -enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> +static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
> +static struct rte_power_uncore_ops *global_uncore_ops;
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> +static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
> + TAILQ_HEAD_INITIALIZER(uncore_ops_list);
>
Need to include rte_debug.h now?
_github build: failed_
Build URL: https://github.com/ovsrobot/dpdk/actions/runs/11425299483
Build Logs:
-----------------------Summary of failed steps-----------------------
"ubuntu-22.04-gcc-ppc64le" failed at step Build and test
"ubuntu-22.04-gcc-riscv64" failed at step Build and test
----------------------End summary of failed steps--------------------
-------------------------------BEGIN LOGS----------------------------
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-ppc64le" at step Build and test
####################################################################################
[325/3576] Generating symbol file lib/librte_latencystats.so.25.0.p/librte_latencystats.so.25.0.symbols
[326/3576] Generating symbol file lib/librte_jobstats.so.25.0.p/librte_jobstats.so.25.0.symbols
[327/3576] Compiling C object lib/librte_member.a.p/member_rte_member_ht.c.o
[328/3576] Generating symbol file lib/librte_ip_frag.so.25.0.p/librte_ip_frag.so.25.0.symbols
[329/3576] Compiling C object drivers/libtmp_rte_bus_platform.a.p/bus_platform_platform_params.c.o
[330/3576] Compiling C object lib/librte_member.a.p/member_rte_member_vbf.c.o
[331/3576] Compiling C object lib/librte_power.a.p/power_power_common.c.o
[332/3576] Compiling C object lib/librte_power.a.p/power_rte_power.c.o
[333/3576] Compiling C object lib/librte_power.a.p/power_rte_power_uncore.c.o
FAILED: lib/librte_power.a.p/power_rte_power_uncore.c.o
ccache powerpc64le-linux-gnu-gcc -Ilib/librte_power.a.p -Ilib -I../lib -Ilib/power -I../lib/power -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/ppc/include -I../lib/eal/ppc/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/timer -I../lib/timer -Ilib/ethdev -I../lib/ethdev -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O2 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-psabi -D_GNU_SOURCE -fPIC -mcpu=power8 -mtune=power8 -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation -DRTE_LOG_DEFAULT_LOGTYPE=lib.power -MD -MQ lib/librte_power.a.p/power_rte_power_uncore.c.o -MF lib/librte_power.a.p/power_rte_power_uncore.c.o.d -o lib/librte_power.a.p/power_rte_power_uncore.c.o -c ../lib/power/rte_power_uncore.c
../lib/power/rte_power_uncore.c: In function ‘rte_power_get_uncore_freq’:
../lib/power/rte_power_uncore.c:149:9: error: implicit declaration of function ‘RTE_ASSERT’; did you mean ‘RTE_STR’? [-Werror=implicit-function-declaration]
149 | RTE_ASSERT(global_uncore_ops != NULL);
| ^~~~~~~~~~
| RTE_STR
../lib/power/rte_power_uncore.c:149:9: error: nested extern declaration of ‘RTE_ASSERT’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
[334/3576] Compiling C object lib/librte_member.a.p/member_rte_member_sketch.c.o
[335/3576] Generating lpm.sym_chk with a custom command (wrapped by meson to capture output)
[336/3576] Compiling C object lib/librte_pcapng.a.p/pcapng_rte_pcapng.c.o
[337/3576] Compiling C object lib/librte_power.a.p/power_rte_power_pmd_mgmt.c.o
[338/3576] Generating eventdev.sym_chk with a custom command (wrapped by meson to capture output)
ninja: build stopped: subcommand failed.
##[error]Process completed with exit code 1.
####################################################################################
#### [End job log] "ubuntu-22.04-gcc-ppc64le" at step Build and test
####################################################################################
* Re: [PATCH v6 2/5] power: refactor uncore power management library
2024-10-20 9:22 ` [PATCH v6 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-20 23:25 ` Stephen Hemminger
@ 2024-10-20 23:28 ` Stephen Hemminger
1 sibling, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-20 23:28 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Sun, 20 Oct 2024 09:22:29 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> +uint32_t
> +rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
> +{
> + RTE_ASSERT(global_uncore_ops != NULL);
All these RTE_ASSERT calls seem like a good idea, but they really don't help.
If RTE_ASSERT fails, it prints a message and calls abort.
If you skip the RTE_ASSERT, the next line will dereference a NULL
pointer and crash.
So in either case it crashes, and the RTE_ASSERT() doesn't add much help.
Also, RTE_ENABLE_ASSERT is usually not defined, so RTE_ASSERT() is compiled out.
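
One way to get a defined result whether or not assertions are compiled in is an explicit NULL check that returns an error value. The following is only a minimal standalone sketch of that idea, not code from the patch and not something proposed in this thread: the ops struct, the `sketch_*` names, the dummy callback and the sentinel are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_FREQ_INDEX (~0u) /* hypothetical error sentinel */

/* Hypothetical ops table, reduced to a single callback. */
struct uncore_ops_sketch {
	uint32_t (*get_freq)(unsigned int pkg, unsigned int die);
};

/* Stays NULL until a driver has been selected. */
static const struct uncore_ops_sketch *sketch_ops;

/* Explicit check: the caller gets an error value instead of a crash. */
static uint32_t
sketch_get_uncore_freq(unsigned int pkg, unsigned int die)
{
	if (sketch_ops == NULL || sketch_ops->get_freq == NULL)
		return SKETCH_INVALID_FREQ_INDEX;
	return sketch_ops->get_freq(pkg, die);
}

/* Dummy driver callback, used only for illustration. */
static uint32_t
dummy_get_freq(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 2;
}

static const struct uncore_ops_sketch dummy_ops = { .get_freq = dummy_get_freq };
```

The trade-off is one extra branch per call in exchange for behaviour that does not depend on the build configuration.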
* [PATCH v7 0/5] power: refactor power management library
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-20 9:22 ` [PATCH v6 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
` (7 more replies)
6 siblings, 8 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/*', such as 'drivers/power/acpi' and
'drivers/power/intel_uncore'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 +++++++++++
drivers/power/amd_uncore/meson.build | 20 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/rte_power_uncore.c | 256 +++++++------
lib/power/rte_power_uncore.h | 61 ++-
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++
lib/power/version.map | 16 +
40 files changed, 1666 insertions(+), 623 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
* [PATCH v7 1/5] power: refactor core power management library
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-22 1:20 ` Stephen Hemminger
2024-10-22 3:03 ` lihuisong (C)
2024-10-21 4:07 ` [PATCH v7 2/5] power: refactor uncore " Sivaprasad Tummala
` (6 subsequent siblings)
7 siblings, 2 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
v6:
- fixed compilation error with symbol export in API
- exported power_get_lcore_mapped_cpu_id as internal API to be
used in drivers/power/*
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- re-worked on auto detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/version.map | 15 +
26 files changed, 665 insertions(+), 269 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 2733306698..7ef4f581a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index ae809fbb60..974fbb7ba8 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -587,3 +587,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..c685008fb5 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 2b728eca18..8b93226281 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -710,3 +710,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 96%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..d7188fcdac 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if amd p-state power management is supported.
@@ -216,4 +216,4 @@ int power_amd_pstate_disable_turbo(unsigned int lcore_id);
int power_amd_pstate_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_AMD_PSTATET_CPUFREQ_H */
+#endif /* _AMD_PSTATE_CPUFREQ_H */
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index cc9305bdfe..8ca84c4b49 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -695,3 +695,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..64a766145a 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 100%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..a1342dcd8b 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_core_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..8b92054076 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..fe11179ab3
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 4755909466..09a11f7c37 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -898,3 +898,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_core_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..5fddb40280 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "rte_power_cpufreq_api.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..9a4a592caf 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,20 +12,15 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'rte_power.h',
+ 'rte_power_cpufreq_api.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index b47c63a5f1..e482f71c64 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -13,7 +13,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 82fb94d0c0..c294f561bb 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,14 +24,27 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
+
+__rte_internal
int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..416f0148a3 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,88 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_core_ops *global_power_core_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
-
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
+ TAILQ_HEAD_INITIALIZER(core_ops_list);
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_core_ops, return 0 on success. */
+int
+rte_power_register_ops(struct rte_power_core_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&core_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_core_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_core_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
- }
-
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ goto out;
}
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_power_core_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,7 +97,7 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_power_core_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -176,82 +109,122 @@ rte_power_get_env(void) {
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
-
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ struct rte_power_core_ops *ops;
+ uint8_t env;
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &core_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_power_core_ops->exit(lcore_id);
+
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
- }
return -1;
+}
+
+uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
+}
+
+uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_freq(lcore_id);
+}
+
+uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->set_freq(lcore_id, index);
+}
+
+int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_up(lcore_id);
+}
+
+int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_down(lcore_id);
+}
+
+int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_max(lcore_id);
+}
+
+int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->freq_min(lcore_id);
+}
+int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->turbo_status(lcore_id);
+}
+
+int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->enable_turbo(lcore_id);
+}
+
+int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->disable_turbo(lcore_id);
+}
+
+int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ RTE_ASSERT(global_power_core_ops != NULL);
+ return global_power_core_ops->get_caps(lcore_id, caps);
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..e9a72b92ad 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "rte_power_cpufreq_api.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -108,10 +116,7 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
-
-extern rte_power_freqs_t rte_power_freqs;
+uint32_t rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num);
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +129,7 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
-
-extern rte_power_get_freq_t rte_power_get_freq;
+uint32_t rte_power_get_freq(unsigned int lcore_id);
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,13 +147,12 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+uint32_t rte_power_set_freq(unsigned int lcore_id, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * Review each environments specific documentation for usage.
*
* @param lcore_id
* lcore id.
@@ -160,66 +162,92 @@ extern rte_power_set_freq_t rte_power_set_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
-
-/**
- * Scale up the frequency of a specific lcore according to the available
- * frequencies.
- * Review each environments specific documentation for usage.
- */
-extern rte_power_freq_change_t rte_power_freq_up;
+int rte_power_freq_up(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+int rte_power_freq_down(unsigned int lcore_id);
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+int rte_power_freq_max(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+int rte_power_freq_min(unsigned int lcore_id);
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 turbo boost enabled.
+ * - 0 turbo boost disabled.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+int rte_power_turbo_status(unsigned int lcore_id);
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+int rte_power_freq_enable_turbo(unsigned int lcore_id);
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
-
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+int rte_power_freq_disable_turbo(unsigned int lcore_id);
/**
* Returns power capabilities for a specific lcore.
@@ -235,11 +263,9 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+int rte_power_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
-
#ifdef __cplusplus
}
#endif
diff --git a/lib/power/rte_power_cpufreq_api.h b/lib/power/rte_power_cpufreq_api.h
new file mode 100644
index 0000000000..31fea941bf
--- /dev/null
+++ b/lib/power/rte_power_cpufreq_api.h
@@ -0,0 +1,206 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _RTE_POWER_CPUFREQ_API_H
+#define _RTE_POWER_CPUFREQ_API_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environments
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations. */
+struct rte_power_core_ops {
+ RTE_TAILQ_ENTRY(rte_power_core_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_ops(struct rte_power_core_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_ops(&ops); \
+}
+
+/**
+ * @internal Get the power ops struct from its index.
+ *
+ * @return
+ * The pointer to the ops struct in the table if registered.
+ */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..016e599e90 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -51,4 +51,19 @@ EXPERIMENTAL {
rte_power_set_uncore_env;
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
+ # added in 24.11
+ rte_power_logtype;
+};
+
+INTERNAL {
+ global:
+
+ rte_power_register_ops;
+ cpufreq_check_scaling_driver;
+ power_get_lcore_mapped_cpu_id;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
};
--
2.34.1
* [PATCH v7 2/5] power: refactor uncore power management library
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-22 1:18 ` Stephen Hemminger
2024-10-22 3:17 ` lihuisong (C)
2024-10-21 4:07 ` [PATCH v7 3/5] test/power: removed function pointer validations Sivaprasad Tummala
` (5 subsequent siblings)
7 siblings, 2 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
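For readers skimming the thread, the registration and dispatch pattern
this patch moves to can be sketched roughly as below. This is a minimal
standalone illustration, not the DPDK code: it uses plain <sys/queue.h>
instead of the RTE_TAILQ wrappers, and the names uncore_ops,
register_ops, select_ops, get_uncore_freq and the "demo-uncore" driver
are all hypothetical stand-ins with a trimmed-down callback set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

#define DRIVER_NAMESZ 24

/* Hypothetical, trimmed-down ops table; the struct in the patch has more callbacks. */
struct uncore_ops {
	TAILQ_ENTRY(uncore_ops) next;   /* link in the registered-driver list */
	char name[DRIVER_NAMESZ];       /* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	uint32_t (*get_freq)(unsigned int pkg, unsigned int die);
};

static TAILQ_HEAD(, uncore_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct uncore_ops *active_ops;

/* Reject incomplete ops tables, otherwise append to the driver list. */
int
register_ops(struct uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->get_freq == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Pick the active driver by name; returns -1 if no such driver registered. */
int
select_ops(const char *name)
{
	struct uncore_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (strncmp(ops->name, name, DRIVER_NAMESZ) == 0) {
			active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Public wrapper: a real function that forwards to the selected driver. */
uint32_t
get_uncore_freq(unsigned int pkg, unsigned int die)
{
	assert(active_ops != NULL);
	return active_ops->get_freq(pkg, die);
}

/* Illustrative driver, standing in for something like intel_uncore.c. */
static int
demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static uint32_t
demo_get_freq(unsigned int pkg, unsigned int die)
{
	return pkg * 10 + die;
}

static struct uncore_ops demo_ops = {
	.name = "demo-uncore",
	.init = demo_init,
	.get_freq = demo_get_freq,
};
```

In the patch itself the driver registers its ops from a constructor
(RTE_POWER_REGISTER_UNCORE_OPS), so the library no longer needs to know
about any specific driver at build time; in this sketch the caller
would invoke register_ops(&demo_ops) explicitly instead.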
v7:
- fixed build error with aarch32 gcc cross compilation
v6:
- fixed compilation error with symbol export in API
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
v3:
- fixed typo in header file inclusion
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 256 +++++++++---------
lib/power/rte_power_uncore.h | 61 ++---
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++++++
lib/power/version.map | 1 +
9 files changed, 421 insertions(+), 164 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..ffee28f9b3 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef _INTEL_UNCORE_H
+#define _INTEL_UNCORE_H
/**
* @file
@@ -11,7 +11,7 @@
*/
#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* _INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 9a4a592caf..d435197cef 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_uncore_ops.h',
)
deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..e59458d7a7 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -7,103 +7,57 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
-#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
+#include "power_common.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +67,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,43 +90,43 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +134,69 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
+
+uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_freq(pkg, die);
+}
+
+int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->set_freq(pkg, die, index);
+}
+
+int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_max(pkg, die);
+}
+
+int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_min(pkg, die);
+}
+
+int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_avail_freqs(pkg, die, freqs, num);
+}
+
+int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_freqs(pkg, die);
+}
+
+unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_pkgs();
+}
+
+unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_dies(pkg);
+}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..ae22be5c52 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +11,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "rte_power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +115,7 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+uint32_t rte_power_get_uncore_freq(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,12 +138,14 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+int rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions.
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
*
* @param pkg
* Package number.
@@ -160,16 +159,7 @@ extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
-
-/**
- * Set minimum and maximum uncore frequency for specified die on a package
- * to maximum value according to the available frequencies.
- * It should be protected outside of this function for threadsafe.
- *
- * This function should NOT be called in the fast path.
- */
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+int rte_power_uncore_freq_max(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -177,8 +167,20 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
* It should be protected outside of this function for threadsafe.
*
* This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+int rte_power_uncore_freq_min(unsigned int pkg, unsigned int die);
/**
* Return the list of available frequencies in the index array.
@@ -200,11 +202,10 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+__rte_experimental
+int rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
uint32_t *freqs, uint32_t num);
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
-
/**
* Return the list length of available frequencies in the index array.
*
@@ -221,9 +222,7 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+int rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +234,7 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+unsigned int rte_power_uncore_get_num_pkgs(void);
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +250,7 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
-
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+unsigned int rte_power_uncore_get_num_dies(unsigned int pkg);
#ifdef __cplusplus
}
diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
new file mode 100644
index 0000000000..f91994d3c1
--- /dev/null
+++ b/lib/power/rte_power_uncore_ops.h
@@ -0,0 +1,230 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef RTE_POWER_UNCORE_OPS_H
+#define RTE_POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs; /**< Number of packages. */
+ rte_power_uncore_get_num_dies_t get_num_dies; /**< Number of dies in a package. */
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - >=0: Success; return the index of the ops struct in the table.
+ * - -EINVAL - error while registering ops struct.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_POWER_UNCORE_OPS_H */
diff --git a/lib/power/version.map b/lib/power/version.map
index 016e599e90..d9dd4145b7 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -59,6 +59,7 @@ INTERNAL {
global:
rte_power_register_ops;
+ rte_power_register_uncore_ops;
cpufreq_check_scaling_driver;
power_get_lcore_mapped_cpu_id;
power_set_governor;
--
2.34.1
* [PATCH v7 3/5] test/power: removed function pointer validations
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (4 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
After refactoring the power library, power management operations are
consistently available regardless of the operating environment. The
function pointer checks are therefore unnecessary and have been removed
from the tests and applications.
v2:
- removed function pointer validation in l3fwd-power app.
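To illustrate why the NULL checks become redundant, here is a minimal sketch
of the wrapper pattern. It is not the library code: every `demo_*` name is
hypothetical, and returning -ENOTSUP when no environment is set is an
assumption made for the example, not documented library behaviour. The point
is only that the public entry point is always a real function that dispatches
to a driver ops table internally, so callers have no pointer to test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; illustrative only, not the DPDK definition. */
struct demo_power_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* NULL until an environment has been selected. */
static struct demo_power_ops *demo_active_ops;

/*
 * Public wrapper: always callable. The -ENOTSUP result for the
 * "no driver selected" case is an assumption of this sketch.
 */
int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_active_ops == NULL || demo_active_ops->freq_down == NULL)
		return -ENOTSUP;
	return demo_active_ops->freq_down(lcore_id);
}

/* Stand-in driver callback: pretend the frequency dropped one step. */
static int
demo_driver_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static struct demo_power_ops demo_driver_ops = {
	.freq_down = demo_driver_freq_down,
};

void
demo_power_set_env(void)
{
	demo_active_ops = &demo_driver_ops;
}

void
demo_power_unset_env(void)
{
	demo_active_ops = NULL;
}

/* Walk through the unset -> set -> unset sequence used by the tests. */
void
demo_self_check(void)
{
	assert(demo_power_freq_down(0) == -ENOTSUP);
	demo_power_set_env();
	assert(demo_power_freq_down(0) == 1);
	demo_power_unset_env();
	assert(demo_power_freq_down(0) == -ENOTSUP);
}
```

With this shape, a caller such as the l3fwd-power timer callback can invoke
the wrapper unconditionally and inspect the return value if it cares about
the unsupported case.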
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index edbd34424e..f4522747d5 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -534,58 +534,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v7 4/5] drivers/power: uncore support for AMD EPYC processors
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-21 4:07 ` [PATCH v7 3/5] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 5/5] maintainers: update for drivers/power Sivaprasad Tummala
` (3 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
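For readers unfamiliar with how such a driver plugs into the library, the
following sketch shows the general load-time registration pattern. It is an
illustration under stated assumptions, not the DPDK implementation: all
`demo_*` names are hypothetical, the list is a plain singly linked list, and
the macro relies on the GCC/Clang `constructor` attribute to run before
main().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down ops structure for the sketch. */
struct demo_uncore_ops {
	struct demo_uncore_ops *next;   /* next registered driver */
	const char *name;               /* driver name used for lookup */
	unsigned int (*get_num_dies)(unsigned int pkg);
};

static struct demo_uncore_ops *demo_ops_head;

/* Add a driver to the list; returns 0 on success, -1 on bad input. */
int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->get_num_dies == NULL)
		return -1;
	ops->next = demo_ops_head;
	demo_ops_head = ops;
	return 0;
}

/* Look a driver up by name; returns NULL if none matches. */
struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	for (ops = demo_ops_head; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Register an ops structure automatically at program load time. */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
static void __attribute__((constructor)) demo_init_##ops(void) \
{ \
	int rc = demo_register_uncore_ops(&ops); \
	assert(rc == 0); \
	(void)rc; \
}

/* Stand-in callback: report a single die per package. */
static unsigned int
demo_get_num_dies(unsigned int pkg)
{
	(void)pkg;
	return 1;
}

static struct demo_uncore_ops demo_amd_ops = {
	.name = "demo-amd",
	.get_num_dies = demo_get_num_dies,
};

DEMO_REGISTER_UNCORE_OPS(demo_amd_ops)
```

The design choice here is that a driver only needs to define its ops table
and invoke the macro once; the generic layer discovers it by name without
any direct link-time reference to the driver's functions.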
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 576 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* Record the DF P-state index that was just applied. */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1; /* converts to ~0 in uint32_t, the invalid frequency index */
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(ui, ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..60e0e64d27
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,226 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "rte_power.h"
+#include "rte_power_uncore.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * This function is not thread-safe; the caller must provide any locking.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system,
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * This driver currently reports one die per valid package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v7 5/5] maintainers: update for drivers/power
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-21 4:07 ` [PATCH v7 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 0/5] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 6ea7850093..7e29931be9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1744,6 +1744,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v7 0/5] power: refactor power management library
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-21 4:07 ` [PATCH v7 5/5] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-21 4:07 ` Sivaprasad Tummala
2024-10-22 1:34 ` Stephen Hemminger
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-21 4:07 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (5):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
MAINTAINERS | 1 +
app/test/test_power.c | 95 -----
app/test/test_power_cpufreq.c | 52 ---
app/test/test_power_kvm_vm.c | 36 --
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 226 +++++++++++
drivers/power/amd_uncore/meson.build | 20 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/l3fwd-power/main.c | 12 +-
lib/power/meson.build | 9 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
lib/power/rte_power_uncore.c | 256 +++++++------
lib/power/rte_power_uncore.h | 61 ++-
lib/power/rte_power_uncore_ops.h | 230 ++++++++++++
lib/power/version.map | 16 +
40 files changed, 1666 insertions(+), 623 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_cpufreq_api.h
create mode 100644 lib/power/rte_power_uncore_ops.h
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 2/5] power: refactor uncore power management library
2024-10-21 4:07 ` [PATCH v7 2/5] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-22 1:18 ` Stephen Hemminger
2024-10-22 6:45 ` Tummala, Sivaprasad
2024-10-22 3:17 ` lihuisong (C)
1 sibling, 1 reply; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-22 1:18 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Mon, 21 Oct 2024 04:07:20 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
> new file mode 100644
> index 0000000000..f91994d3c1
> --- /dev/null
> +++ b/lib/power/rte_power_uncore_ops.h
> @@ -0,0 +1,230 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> + */
> +
> +#ifndef RTE_POWER_UNCORE_OPS_H
> +#define RTE_POWER_UNCORE_OPS_H
> +
> +/**
> + * @file
> + * RTE Uncore Frequency Management
> + */
> +
> +#include <rte_compat.h>
> +#include <rte_common.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
Since this is all internal, it doesn't really require C++ guards.
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 1/5] power: refactor core power management library
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
@ 2024-10-22 1:20 ` Stephen Hemminger
2024-10-22 6:45 ` Tummala, Sivaprasad
2024-10-22 3:03 ` lihuisong (C)
1 sibling, 1 reply; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-22 1:20 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Mon, 21 Oct 2024 04:07:19 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> diff --git a/lib/power/version.map b/lib/power/version.map
> index c9a226614e..016e599e90 100644
> --- a/lib/power/version.map
> +++ b/lib/power/version.map
> @@ -51,4 +51,19 @@ EXPERIMENTAL {
> rte_power_set_uncore_env;
> rte_power_uncore_freqs;
> rte_power_unset_uncore_env;
> + # added in 24.11
> + rte_power_logtype;
> +};
The logtype should be kept internal, not part of public API
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 0/5] power: refactor power management library
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-21 4:07 ` [PATCH v7 0/5] power: refactor power management library Sivaprasad Tummala
@ 2024-10-22 1:34 ` Stephen Hemminger
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-22 1:34 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Mon, 21 Oct 2024 04:07:18 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
Looks good, a couple of minor things you could address later in other comments.
One other thing in the power internals would be to change:
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
to be similar to existing fopen()
FILE *fopen_sysfs_file(const char *mode, const char *format, ...)
__rte_format_printf(2, 3) __rte_malloc __rte_dealloc(fclose, 1)
That would let the compiler catch cases where the file pointer is not handled correctly.
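As a rough illustration of the fopen()-style shape suggested above, here is a minimal standalone sketch. It is not part of the patch: the SKETCH_* macros are stand-ins for the DPDK attribute macros and assume GCC or Clang (the deallocator form of the malloc attribute needs GCC 11 or later), and the buffer length is an arbitrary assumption.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-ins for __rte_format_printf / __rte_malloc / __rte_dealloc.
 * Assumption: GCC or Clang; the macros expand to nothing elsewhere. */
#if defined(__GNUC__)
#define SKETCH_FORMAT(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define SKETCH_FORMAT(fmt, args)
#endif
#if defined(__GNUC__) && __GNUC__ >= 11
#define SKETCH_DEALLOC(fn, pos) __attribute__((malloc(fn, pos)))
#else
#define SKETCH_DEALLOC(fn, pos)
#endif

#define SKETCH_PATH_LEN 512 /* arbitrary limit chosen for this sketch */

/* Hypothetical fopen()-like helper: returns the stream, or NULL on failure. */
FILE *fopen_sysfs_file(const char *mode, const char *format, ...)
	SKETCH_FORMAT(2, 3) SKETCH_DEALLOC(fclose, 1);

FILE *
fopen_sysfs_file(const char *mode, const char *format, ...)
{
	char path[SKETCH_PATH_LEN];
	va_list ap;
	int len;

	/* Build the path from the printf-style arguments. */
	va_start(ap, format);
	len = vsnprintf(path, sizeof(path), format, ap);
	va_end(ap);

	/* Reject formatting errors and truncated paths. */
	if (len < 0 || (size_t)len >= sizeof(path))
		return NULL;

	return fopen(path, mode);
}
```

A caller would then write something like `FILE *f = fopen_sysfs_file("r", "/some/dir/cpu%u/file", 3);` (path made up for the example), check for NULL, and release the stream with fclose(). Returning the pointer instead of filling a `FILE **` is what lets the deallocator attribute pair the result with fclose().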
Series-Acked-by: Stephen Hemminger <stephen@networkplumber.org>
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v2 2/4] power: refactor uncore power management library
2024-10-08 6:19 ` Tummala, Sivaprasad
@ 2024-10-22 2:05 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-10-22 2:05 UTC (permalink / raw)
To: Tummala, Sivaprasad
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
Hi Sivaprasad,
I have an inline question, please take a look.
On 2024/10/8 14:19, Tummala, Sivaprasad wrote:
> Hi Lihuisong,
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Tuesday, August 27, 2024 6:33 PM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>> radu.nicolau@intel.com; jerinj@marvell.com; cristian.dumitrescu@intel.com;
>> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
>> gakhil@marvell.com
>> Subject: Re: [PATCH v2 2/4] power: refactor uncore power management library
>>
>>
>> Hi Sivaprasad,
>>
>> I suggest splitting this patch into two patches to make it easier to review:
>> patch-1: abstract a file for uncore dvfs core level, namely, the
>> rte_power_uncore_ops.c you did.
>> patch-2: move and rename, lib/power/power_intel_uncore.c =>
>> drivers/power/intel_uncore/intel_uncore.c
>>
>> patch[1/4] is also too big and hard to review.
>>
>> In addition, I have some questions and am not sure whether we can adjust the uncore init
>> process.
>>
>> /Huisong
>>
>>
>> On 2024/8/26 21:06, Sivaprasad Tummala wrote:
>>> This patch refactors the power management library, addressing uncore
>>> power management. The primary changes involve the creation of
>>> dedicated directories for each driver within 'drivers/power/uncore/*'.
>>> The adjustment of meson.build files enables the selective activation
>>> of individual drivers.
>>>
>>> This refactor significantly improves code organization, enhances
>>> clarity and boosts maintainability. It lays the foundation for more
>>> focused development on individual drivers and facilitates seamless
>>> integration of future enhancements, particularly the AMD uncore driver.
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>>> .../power/intel_uncore/intel_uncore.c | 18 +-
>>> .../power/intel_uncore/intel_uncore.h | 8 +-
>>> drivers/power/intel_uncore/meson.build | 6 +
>>> drivers/power/meson.build | 3 +-
>>> lib/power/meson.build | 2 +-
>>> lib/power/rte_power_uncore.c | 205 ++++++---------
>>> lib/power/rte_power_uncore.h | 87 ++++---
>>> lib/power/rte_power_uncore_ops.h | 239 ++++++++++++++++++
>>> lib/power/version.map | 1 +
>>> 9 files changed, 405 insertions(+), 164 deletions(-)
>>> rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
>>> rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
>>> create mode 100644 drivers/power/intel_uncore/meson.build
>>> create mode 100644 lib/power/rte_power_uncore_ops.h
>>>
>>> diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
>>> similarity index 95%
>>> rename from lib/power/power_intel_uncore.c
>>> rename to drivers/power/intel_uncore/intel_uncore.c
>>> index 4eb9c5900a..804ad5d755 100644
>>> --- a/lib/power/power_intel_uncore.c
>>> +++ b/drivers/power/intel_uncore/intel_uncore.c
>>> @@ -8,7 +8,7 @@
>>>
>>> #include <rte_memcpy.h>
>>>
>>> -#include "power_intel_uncore.h"
>>> +#include "intel_uncore.h"
>>> #include "power_common.h"
>>>
>>> #define MAX_NUMA_DIE 8
>>> @@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
>>>
>>> return count;
>>> }
>> <...>
>>> -#endif /* POWER_INTEL_UNCORE_H */
>>> +#endif /* INTEL_UNCORE_H */
>>> diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
>>> new file mode 100644
>>> index 0000000000..876df8ad14
>>> --- /dev/null
>>> +++ b/drivers/power/intel_uncore/meson.build
>>> @@ -0,0 +1,6 @@
>>> +# SPDX-License-Identifier: BSD-3-Clause
>>> +# Copyright(c) 2017 Intel Corporation
>>> +# Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> +
>>> +sources = files('intel_uncore.c')
>>> +deps += ['power']
>>> diff --git a/drivers/power/meson.build b/drivers/power/meson.build
>>> index 8c7215c639..c83047af94 100644
>>> --- a/drivers/power/meson.build
>>> +++ b/drivers/power/meson.build
>>> @@ -6,7 +6,8 @@ drivers = [
>>> 'amd_pstate',
>>> 'cppc',
>>> 'kvm_vm',
>>> - 'pstate'
>>> + 'pstate',
>>> + 'intel_uncore'
>> cppc, amd_pstate and so on belong to the cpufreq scope,
>> while intel_uncore belongs to the uncore DVFS scope.
>> They are not at the same level, so I propose creating a directory
>> named something like cpufreq or core.
>> The name 'intel_uncore' doesn't seem appropriate. What do you think of the following
>> directory structure:
>> drivers/power/uncore/intel_uncore.c
>> drivers/power/uncore/amd_uncore.c (according to patch [4/4]).
> At present, Meson does not support detecting an additional level of subdirectories within drivers/*.
> All the drivers maintain a consistent subdirectory structure.
>>> ]
>>> std_deps = ['power']
>>> diff --git a/lib/power/meson.build b/lib/power/meson.build
>>> index f3e3451cdc..9b13d98810 100644
>>> --- a/lib/power/meson.build
>>> +++ b/lib/power/meson.build
>>> @@ -13,7 +13,6 @@ if not is_linux
>>> endif
>>> sources = files(
>>> 'power_common.c',
>>> - 'power_intel_uncore.c',
>>> 'rte_power.c',
>>> 'rte_power_uncore.c',
>>> 'rte_power_pmd_mgmt.c',
>>> @@ -24,6 +23,7 @@ headers = files(
>>> 'rte_power_guest_channel.h',
>>> 'rte_power_pmd_mgmt.h',
>>> 'rte_power_uncore.h',
>>> + 'rte_power_uncore_ops.h',
>>> )
>>> if cc.has_argument('-Wno-cast-qual')
>>> cflags += '-Wno-cast-qual'
>>> diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
>>> index 48c75a5da0..9f8771224f 100644
>>> --- a/lib/power/rte_power_uncore.c
>>> +++ b/lib/power/rte_power_uncore.c
>>> @@ -1,6 +1,7 @@
>>> /* SPDX-License-Identifier: BSD-3-Clause
>>> * Copyright(c) 2010-2014 Intel Corporation
>>> * Copyright(c) 2023 AMD Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> */
>>>
>>> #include <errno.h>
>>> @@ -12,98 +13,50 @@
>>> #include "rte_power_uncore.h"
>>> #include "power_intel_uncore.h"
>>>
>>> -enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>> +static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>> +static struct rte_power_uncore_ops *global_uncore_ops;
>>>
>>> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
>>> +static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
>>> + TAILQ_HEAD_INITIALIZER(uncore_ops_list);
>>>
>>> -static uint32_t
>>> -power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
>>> - unsigned int die __rte_unused)
>>> -{
>>> - return 0;
>>> -}
>>> -
>>> -static int
>>> -power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
>>> - unsigned int die __rte_unused, uint32_t index __rte_unused)
>>> -{
>>> - return 0;
>>> -}
>>> +const char *uncore_env_str[] = {
>>> + "not set",
>>> + "auto-detect",
>>> + "intel-uncore",
>>> + "amd-hsmp"
>>> +};
>> Why expose the "auto-detect" mode to the user?
>> Why not set this automatically at framework initialization?
>> After all, the uncore driver is fixed for a given platform.
> The auto-detection feature has been implemented to enable seamless migration across platforms
> without requiring any changes to the application
>>> -static int
>>> -power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
>>> - unsigned int die __rte_unused)
>>> -{
>>> - return 0;
>>> -}
>>> -
>> <...>
>>> -static int
>>> -power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
>>> - unsigned int die __rte_unused)
>>> +/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
>>> +int
>>> +rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
>>> {
>>> - return 0;
>>> -}
>>> + if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
>>> + !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
>>> + !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
>>> + !driver_ops->set_freq || !driver_ops->freq_max ||
>>> + !driver_ops->freq_min) {
>>> + POWER_LOG(ERR, "Missing callbacks while registering power ops");
>>> + return -1;
>>> + }
>>> + if (driver_ops->cb)
>>> + driver_ops->cb();
>>>
>>> -static unsigned int
>>> -power_dummy_uncore_get_num_pkgs(void)
>>> -{
>>> - return 0;
>>> -}
>>> + TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
>>>
>>> -static unsigned int
>>> -power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
>>> -{
>>> return 0;
>>> }
>>> -
>>> -/* function pointers */
>>> -rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
>>> -rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
>>> -rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
>>> -rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
>>> -rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
>>> -rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
>>> -rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
>>> -rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
>>> -
>>> -static void
>>> -reset_power_uncore_function_ptrs(void)
>>> -{
>>> - rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
>>> - rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
>>> - rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
>>> - rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
>>> - rte_power_uncore_freqs = power_dummy_uncore_freqs;
>>> - rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
>>> - rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
>>> - rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
>>> -}
>>> -
>>> int
>>> rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
>>> {
>>> - int ret;
>>> + int ret = -1;
>>> + struct rte_power_uncore_ops *ops;
>>>
>>> rte_spinlock_lock(&global_env_cfg_lock);
>>>
>>> - if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
>>> + if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
>>> POWER_LOG(ERR, "Uncore Power Management Env already set.");
>>> - rte_spinlock_unlock(&global_env_cfg_lock);
>>> - return -1;
>>> + goto out;
>>> }
>>>
>> <...>
>>> + if (env < RTE_DIM(uncore_env_str)) {
>>> + RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
>>> + if (strncmp(ops->name, uncore_env_str[env],
>>> + RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
>>> + global_uncore_env = env;
>>> + global_uncore_ops = ops;
>>> + ret = 0;
>>> + goto out;
>>> + }
>>> + POWER_LOG(ERR, "Power Management (%s) not supported",
>>> + uncore_env_str[env]);
>>> + } else
>>> + POWER_LOG(ERR, "Invalid Power Management Environment");
>>>
>>> - default_uncore_env = env;
>>> out:
>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>> return ret;
>>> @@ -139,15 +89,22 @@ void
>>> rte_power_unset_uncore_env(void)
>>> {
>>> rte_spinlock_lock(&global_env_cfg_lock);
>>> - default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>> - reset_power_uncore_function_ptrs();
>>> + global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
>>> rte_spinlock_unlock(&global_env_cfg_lock);
>>> }
>>>
>> How about abstracting an ABI interface to initialize or set the uncore driver for the
>> platform automatically?
>>
>> It could later call power_intel_uncore_init_on_die() for each die on each package.
>>> enum rte_uncore_power_mgmt_env
>>> rte_power_get_uncore_env(void)
>>> {
>>> - return default_uncore_env;
>>> + return global_uncore_env;
>>> +}
>>> +
>>> +struct rte_power_uncore_ops *
>>> +rte_power_get_uncore_ops(void)
>>> +{
>>> + RTE_ASSERT(global_uncore_ops != NULL);
>>> +
>>> + return global_uncore_ops;
>>> }
>>>
>>> int
>>> @@ -155,27 +112,29 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
>> This pkg means the socket ID on the platform, right?
>> If so, I am not sure that the
>> uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE] used in the uncore lib is
>> universal for all uncore drivers.
>> For example, an uncore driver may only support uncore DVFS per socket.
>> What should we do in that case? We may need to think twice.
> Yes, pkg represents a socket id. In platforms with a single uncore controller per socket,
> the die ID should be set to '0' for the corresponding socket ID (pkg).
> .
So we would use only die ID 0 on each socket ID (namely, uncore_info[0][0],
uncore_info[1][0]) to initialize the uncore power info on the sockets, right?
The l3fwd-power implementation sets every die ID on every socket.
For a platform with a single uncore controller per socket, the
uncore driver in DPDK then has to ignore every die ID other than die 0 on a
socket, right?
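To make the question concrete, here is a minimal standalone sketch of the pattern I have in mind. Every name in it is a hypothetical mock, not the DPDK API, and the two-socket topology is an assumption: the application walks every package and every die the driver reports, and a driver with one uncore controller per socket reports a single die (die 0) per package and rejects any other die ID.

```c
#include <stdio.h>

#define MOCK_NUM_PKGS 2 /* assumption: a two-socket system */

/* Mock driver: one uncore controller per socket, so one die per package. */
static unsigned int
mock_uncore_get_num_pkgs(void)
{
	return MOCK_NUM_PKGS;
}

static unsigned int
mock_uncore_get_num_dies(unsigned int pkg)
{
	/* Zero signals an error, as in the header comments of the patch. */
	return pkg < MOCK_NUM_PKGS ? 1 : 0;
}

static int
mock_uncore_init(unsigned int pkg, unsigned int die)
{
	/* Only die 0 is valid when the controller is per socket. */
	if (pkg >= MOCK_NUM_PKGS || die != 0)
		return -1;
	printf("init uncore: pkg %u die %u\n", pkg, die);
	return 0;
}

/* Application-side loop: visit every pkg/die pair the driver reports.
 * Returns the number of pairs initialized, or -1 on the first failure. */
static int
init_all_uncore(void)
{
	unsigned int pkg, die;
	int count = 0;

	for (pkg = 0; pkg < mock_uncore_get_num_pkgs(); pkg++) {
		for (die = 0; die < mock_uncore_get_num_dies(pkg); die++) {
			if (mock_uncore_init(pkg, die) < 0)
				return -1;
			count++;
		}
	}
	return count;
}
```

If the driver reports its own die count like this, an application loop never reaches a die other than 0 on such a platform, so the driver would not have to silently ignore extra die IDs.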
>>> {
>>> int ret = -1;
>>>
>> <...>
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 1/5] power: refactor core power management library
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
2024-10-22 1:20 ` Stephen Hemminger
@ 2024-10-22 3:03 ` lihuisong (C)
2024-10-22 7:13 ` Tummala, Sivaprasad
1 sibling, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-10-22 3:03 UTC (permalink / raw)
To: Sivaprasad Tummala, david.hunt, konstantin.ananyev
Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit
Hi Sivaprasad,
Some comments inline.
On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> v6:
> - fixed compilation error with symbol export in API
> - exported power_get_lcore_mapped_cpu_id as internal API to be
> used in drivers/power/*
>
> v5:
> - fixed code style warning
>
> v4:
> - fixed build error with RTE_ASSERT
>
> v3:
> - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
> - re-worked on auto detection logic
>
> v2:
> - added NULL check for global_core_ops in rte_power_get_core_ops
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> drivers/meson.build | 1 +
> .../power/acpi/acpi_cpufreq.c | 22 +-
> .../power/acpi/acpi_cpufreq.h | 6 +-
> drivers/power/acpi/meson.build | 10 +
> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> .../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
> drivers/power/amd_pstate/meson.build | 10 +
> .../power/cppc/cppc_cpufreq.c | 22 +-
> .../power/cppc/cppc_cpufreq.h | 8 +-
> drivers/power/cppc/meson.build | 10 +
> .../power/kvm_vm}/guest_channel.c | 0
> .../power/kvm_vm}/guest_channel.h | 0
> .../power/kvm_vm/kvm_vm.c | 22 +-
> .../power/kvm_vm/kvm_vm.h | 6 +-
> drivers/power/kvm_vm/meson.build | 14 +
> drivers/power/meson.build | 12 +
> drivers/power/pstate/meson.build | 10 +
> .../power/pstate/pstate_cpufreq.c | 22 +-
> .../power/pstate/pstate_cpufreq.h | 6 +-
> lib/power/meson.build | 7 +-
> lib/power/power_common.c | 2 +-
> lib/power/power_common.h | 18 +-
> lib/power/rte_power.c | 355 ++++++++----------
> lib/power/rte_power.h | 116 +++---
> lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
> lib/power/version.map | 15 +
> 26 files changed, 665 insertions(+), 269 deletions(-)
> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> create mode 100644 drivers/power/acpi/meson.build
> rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
> create mode 100644 drivers/power/amd_pstate/meson.build
> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> create mode 100644 drivers/power/cppc/meson.build
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> create mode 100644 drivers/power/kvm_vm/meson.build
> create mode 100644 drivers/power/meson.build
> create mode 100644 drivers/power/pstate/meson.build
> rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> create mode 100644 lib/power/rte_power_cpufreq_api.h
>
> diff --git a/drivers/meson.build b/drivers/meson.build
> index 2733306698..7ef4f581a0 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -29,6 +29,7 @@ subdirs = [
> 'event', # depends on common, bus, mempool and net.
> 'baseband', # depends on common and bus.
> 'gpu', # depends on common and bus.
> + 'power', # depends on common (in future).
> ]
>
> if meson.is_cross_build()
> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
> similarity index 95%
> rename from lib/power/power_acpi_cpufreq.c
> rename to drivers/power/acpi/acpi_cpufreq.c
> index ae809fbb60..974fbb7ba8 100644
> --- a/lib/power/power_acpi_cpufreq.c
> +++ b/drivers/power/acpi/acpi_cpufreq.c
> @@ -10,7 +10,7 @@
> #include <rte_stdatomic.h>
> #include <rte_string_fns.h>
>
> -#include "power_acpi_cpufreq.h"
> +#include "acpi_cpufreq.h"
> #include "power_common.h"
>
<...>
> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> index b47c63a5f1..e482f71c64 100644
> --- a/lib/power/power_common.c
> +++ b/lib/power/power_common.c
> @@ -13,7 +13,7 @@
>
> #include "power_common.h"
>
> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>
> #define POWER_SYSFILE_SCALING_DRIVER \
> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> index 82fb94d0c0..c294f561bb 100644
> --- a/lib/power/power_common.h
> +++ b/lib/power/power_common.h
> @@ -6,12 +6,13 @@
> #define _POWER_COMMON_H_
>
> #include <rte_common.h>
> +#include <rte_compat.h>
> #include <rte_log.h>
>
> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>
> -extern int power_logtype;
> -#define RTE_LOGTYPE_POWER power_logtype
> +extern int rte_power_logtype;
> +#define RTE_LOGTYPE_POWER rte_power_logtype
> #define POWER_LOG(level, ...) \
> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>
> @@ -23,14 +24,27 @@ extern int power_logtype;
> #endif
>
> /* check if scaling driver matches one we want */
> +__rte_internal
> int cpufreq_check_scaling_driver(const char *driver);
> +
> +__rte_internal
> int power_set_governor(unsigned int lcore_id, const char *new_governor,
> char *orig_governor, size_t orig_governor_len);
cpufreq_check_scaling_driver and power_set_governor are only used for
cpufreq, so they shouldn't be put in this common header file.
We came to an agreement on this in patch v2 1/4.
I guess you forgot it😁
I suggest moving these two APIs to rte_power_cpufreq_api.h.
> +
> +__rte_internal
> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> __rte_format_printf(3, 4);
> +
> +__rte_internal
> int read_core_sysfs_u32(FILE *f, uint32_t *val);
> +
> +__rte_internal
> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> +
> +__rte_internal
> int write_core_sysfs_s(FILE *f, const char *str);
> +
> +__rte_internal
> int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
>
> #endif /* _POWER_COMMON_H_ */
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> index 36c3f3da98..416f0148a3 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -6,155 +6,88 @@
>
> #include <rte_errno.h>
> #include <rte_spinlock.h>
> +#include <rte_debug.h>
>
> #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
> #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>
> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static struct rte_power_core_ops *global_power_core_ops;
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> -
> -/* function pointers */
> -rte_power_freqs_t rte_power_freqs = NULL;
> -rte_power_get_freq_t rte_power_get_freq = NULL;
> -rte_power_set_freq_t rte_power_set_freq = NULL;
> -rte_power_freq_change_t rte_power_freq_up = NULL;
> -rte_power_freq_change_t rte_power_freq_down = NULL;
> -rte_power_freq_change_t rte_power_freq_max = NULL;
> -rte_power_freq_change_t rte_power_freq_min = NULL;
> -rte_power_freq_change_t rte_power_turbo_status;
> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> -rte_power_get_capabilities_t rte_power_get_capabilities;
> -
> -static void
> -reset_power_function_ptrs(void)
> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> + TAILQ_HEAD_INITIALIZER(core_ops_list);
> +
> +const char *power_env_str[] = {
> + "not set",
> + "acpi",
> + "kvm-vm",
> + "pstate",
> + "cppc",
> + "amd-pstate"
> +};
> +
<...>
> +uint32_t
> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
> +}
> +
> +uint32_t
> +rte_power_get_freq(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->get_freq(lcore_id);
> +}
> +
> +uint32_t
> +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->set_freq(lcore_id, index);
> +}
> +
> +int
> +rte_power_freq_up(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->freq_up(lcore_id);
> +}
> +
> +int
> +rte_power_freq_down(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->freq_down(lcore_id);
> +}
> +
> +int
> +rte_power_freq_max(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->freq_max(lcore_id);
> +}
> +
> +int
> +rte_power_freq_min(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->freq_min(lcore_id);
> +}
>
> +int
> +rte_power_turbo_status(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->turbo_status(lcore_id);
> +}
> +
> +int
> +rte_power_freq_enable_turbo(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->enable_turbo(lcore_id);
> +}
> +
> +int
> +rte_power_freq_disable_turbo(unsigned int lcore_id)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->disable_turbo(lcore_id);
> +}
> +
> +int
> +rte_power_get_capabilities(unsigned int lcore_id,
> + struct rte_power_core_capabilities *caps)
> +{
> + RTE_ASSERT(global_power_core_ops != NULL);
> + return global_power_core_ops->get_caps(lcore_id, caps);
> }
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> index 4fa4afe399..e9a72b92ad 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power.h
> @@ -1,5 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
> #ifndef _RTE_POWER_H
> @@ -14,14 +15,21 @@
> #include <rte_log.h>
> #include <rte_power_guest_channel.h>
>
> +#include "rte_power_cpufreq_api.h"
Judging by their names, rte_power.c and rte_power.h are supposed to work
for all power libraries, as I also proposed in the previous version.
But rte_power.* currently only works for the cpufreq lib. A generic
rte_power.* can be created if we need to put all power components together.
Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
how about directly renaming rte_power.c to rte_power_cpufreq_api.c and
rte_power.h to rte_power_cpufreq_api.h?
There will be ABI changes, but they are allowed in 24.11. If we plan
to do it later, we'll have to wait another year.
> +
> #ifdef __cplusplus
> extern "C" {
> #endif
>
> /* Power Management Environment State */
> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> - PM_ENV_AMD_PSTATE_CPUFREQ};
> +enum power_management_env {
> + PM_ENV_NOT_SET = 0,
> + PM_ENV_ACPI_CPUFREQ,
> + PM_ENV_KVM_VM,
> + PM_ENV_PSTATE_CPUFREQ,
> + PM_ENV_CPPC_CPUFREQ,
> + PM_ENV_AMD_PSTATE_CPUFREQ
> +};
>
<...>
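For readers following the review, the dispatch pattern the patch moves to
(drivers register an ops structure on a list, the library selects one and
forwards each public call to it) can be sketched as below. This is a minimal
standalone illustration, not the code from the patch: the names pm_env,
core_ops, pm_register_ops, pm_set_env, pm_get_freq and the demo driver are
hypothetical, the ops table is cut down to one callback, and an error return
stands in for the RTE_ASSERT used in the quoted wrappers.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-ins for the types used in the patch. */
enum pm_env { ENV_NOT_SET = 0, ENV_ACPI, ENV_CPPC };

struct core_ops {
	TAILQ_ENTRY(core_ops) next;                  /* list linkage */
	enum pm_env env;                             /* environment this driver serves */
	uint32_t (*get_freq)(unsigned int lcore_id); /* one callback, for brevity */
};

/* All registered drivers, plus the one currently selected. */
static TAILQ_HEAD(, core_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct core_ops *active_ops;

/* Each driver calls this once to add its ops to the list. */
static int
pm_register_ops(struct core_ops *ops)
{
	if (ops == NULL || ops->get_freq == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Select the driver that matches the requested environment. */
static int
pm_set_env(enum pm_env env)
{
	struct core_ops *ops;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (ops->env == env) {
			active_ops = ops;
			return 0;
		}
	}
	return -1; /* no driver registered for this environment */
}

/* Public wrapper: forwards to the selected driver, fails if none is set. */
static int
pm_get_freq(unsigned int lcore_id, uint32_t *freq)
{
	if (active_ops == NULL || freq == NULL)
		return -1;
	*freq = active_ops->get_freq(lcore_id);
	return 0;
}

/* A made-up driver used only to exercise the sketch. */
static uint32_t
demo_get_freq(unsigned int lcore_id)
{
	return 1000u + lcore_id;
}

static struct core_ops demo_ops = {
	.env = ENV_CPPC,
	.get_freq = demo_get_freq,
};
```

With this shape the library no longer needs to include any driver header: a
caller registers demo_ops, selects ENV_CPPC with pm_set_env, and then
pm_get_freq reaches the driver only through the ops pointer.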
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 2/5] power: refactor uncore power management library
2024-10-21 4:07 ` [PATCH v7 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-22 1:18 ` Stephen Hemminger
@ 2024-10-22 3:17 ` lihuisong (C)
2024-10-22 6:46 ` Tummala, Sivaprasad
1 sibling, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-10-22 3:17 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil
LGTM except for one typo,
Acked-by: Huisong Li <lihuisong@huawei.com>
On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> iThis patch refactors the power management library, addressing uncore
iThis --> This
> power management. The primary changes involve the creation of dedicated
> directories for each driver within 'drivers/power/uncore/*'. The
> adjustment of meson.build files enables the selective activation
> of individual drivers.
>
> This refactor significantly improves code organization, enhances
> clarity and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> v7:
> - fixed build error with aarch32 gcc cross compilation
>
> v6:
> - fixed compilation error with symbol export in API
>
> v5:
> - fixed build errors for risc-v/ppc targets
>
> v4:
> - fixed build error with RTE_ASSERT
>
> v3:
> - fixed typo in header file inclusion
<...>
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v7 1/5] power: refactor core power management library
2024-10-22 1:20 ` Stephen Hemminger
@ 2024-10-22 6:45 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-22 6:45 UTC (permalink / raw)
To: Stephen Hemminger
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh, konstantin.ananyev,
lihuisong, dev
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Stephen,
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, October 22, 2024 6:50 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com;
> lihuisong@huawei.com; dev@dpdk.org
> Subject: Re: [PATCH v7 1/5] power: refactor core power management library
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On Mon, 21 Oct 2024 04:07:19 +0000
> Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
>
> > diff --git a/lib/power/version.map b/lib/power/version.map index
> > c9a226614e..016e599e90 100644
> > --- a/lib/power/version.map
> > +++ b/lib/power/version.map
> > @@ -51,4 +51,19 @@ EXPERIMENTAL {
> > rte_power_set_uncore_env;
> > rte_power_uncore_freqs;
> > rte_power_unset_uncore_env;
> > + # added in 24.11
> > + rte_power_logtype;
> > +};
>
> The logtype should be kept internal, not part of public API
ACK!
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v7 2/5] power: refactor uncore power management library
2024-10-22 1:18 ` Stephen Hemminger
@ 2024-10-22 6:45 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-22 6:45 UTC (permalink / raw)
To: Stephen Hemminger
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh, konstantin.ananyev,
lihuisong, dev
Hi Stephen,
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, October 22, 2024 6:49 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com;
> lihuisong@huawei.com; dev@dpdk.org
> Subject: Re: [PATCH v7 2/5] power: refactor uncore power management library
>
>
> On Mon, 21 Oct 2024 04:07:20 +0000
> Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
>
> > diff --git a/lib/power/rte_power_uncore_ops.h b/lib/power/rte_power_uncore_ops.h
> > new file mode 100644
> > index 0000000000..f91994d3c1
> > --- /dev/null
> > +++ b/lib/power/rte_power_uncore_ops.h
> > @@ -0,0 +1,230 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2022 Intel Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > + */
> > +
> > +#ifndef RTE_POWER_UNCORE_OPS_H
> > +#define RTE_POWER_UNCORE_OPS_H
> > +
> > +/**
> > + * @file
> > + * RTE Uncore Frequency Management
> > + */
> > +
> > +#include <rte_compat.h>
> > +#include <rte_common.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
>
> Since this all internal doesn't really require C++ guards.
ACK!
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v7 2/5] power: refactor uncore power management library
2024-10-22 3:17 ` lihuisong (C)
@ 2024-10-22 6:46 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-22 6:46 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, radu.nicolau, jerinj,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
Hi Huisong,
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Tuesday, October 22, 2024 8:47 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> radu.nicolau@intel.com; jerinj@marvell.com; cristian.dumitrescu@intel.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [PATCH v7 2/5] power: refactor uncore power management library
>
>
> LGTM except for one typo,
> Acked-by: Huisong Li <lihuisong@huawei.com>
>
> On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> > iThis patch refactors the power management library, addressing uncore
> iThis --> This
> > power management. The primary changes involve the creation of
> > dedicated directories for each driver within 'drivers/power/uncore/*'.
> > The adjustment of meson.build files enables the selective activation
> > of individual drivers.
> >
> > This refactor significantly improves code organization, enhances
> > clarity and boosts maintainability. It lays the foundation for more
> > focused development on individual drivers and facilitates seamless
> > integration of future enhancements, particularly the AMD uncore driver.
> >
> > v7:
> > - fixed build error with aarch32 gcc cross compilation
> >
> > v6:
> > - fixed compilation error with symbol export in API
> >
> > v5:
> > - fixed build errors for risc-v/ppc targets
> >
> > v4:
> > - fixed build error with RTE_ASSERT
> >
> > v3:
> > - fixed typo in header file inclusion
> <...>
ACK! Will fix this in next version
^ permalink raw reply [flat|nested] 139+ messages in thread
* RE: [PATCH v7 1/5] power: refactor core power management library
2024-10-22 3:03 ` lihuisong (C)
@ 2024-10-22 7:13 ` Tummala, Sivaprasad
2024-10-22 8:36 ` lihuisong (C)
0 siblings, 1 reply; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-22 7:13 UTC (permalink / raw)
To: lihuisong (C), david.hunt, konstantin.ananyev
Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh
Hi Huisong,
Please find my comments inline.
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Tuesday, October 22, 2024 8:33 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
> david.hunt@intel.com; konstantin.ananyev@huawei.com
> Cc: dev@dpdk.org; anatoly.burakov@intel.com; jerinj@marvell.com;
> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
> Ferruh <Ferruh.Yigit@amd.com>
> Subject: Re: [PATCH v7 1/5] power: refactor core power management library
>
>
> Hi Sivaprasad,
>
> Some comments inline.
>
> On 2024/10/21 12:07, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> >
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
> >
> > v6:
> > - fixed compilation error with symbol export in API
> > - exported power_get_lcore_mapped_cpu_id as internal API to be
> > used in drivers/power/*
> >
> > v5:
> > - fixed code style warning
> >
> > v4:
> > - fixed build error with RTE_ASSERT
> >
> > v3:
> > - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
> > - re-worked on auto detection logic
> >
> > v2:
> > - added NULL check for global_core_ops in rte_power_get_core_ops
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> > drivers/meson.build | 1 +
> > .../power/acpi/acpi_cpufreq.c | 22 +-
> > .../power/acpi/acpi_cpufreq.h | 6 +-
> > drivers/power/acpi/meson.build | 10 +
> > .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> > .../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
> > drivers/power/amd_pstate/meson.build | 10 +
> > .../power/cppc/cppc_cpufreq.c | 22 +-
> > .../power/cppc/cppc_cpufreq.h | 8 +-
> > drivers/power/cppc/meson.build | 10 +
> > .../power/kvm_vm}/guest_channel.c | 0
> > .../power/kvm_vm}/guest_channel.h | 0
> > .../power/kvm_vm/kvm_vm.c | 22 +-
> > .../power/kvm_vm/kvm_vm.h | 6 +-
> > drivers/power/kvm_vm/meson.build | 14 +
> > drivers/power/meson.build | 12 +
> > drivers/power/pstate/meson.build | 10 +
> > .../power/pstate/pstate_cpufreq.c | 22 +-
> > .../power/pstate/pstate_cpufreq.h | 6 +-
> > lib/power/meson.build | 7 +-
> > lib/power/power_common.c | 2 +-
> > lib/power/power_common.h | 18 +-
> > lib/power/rte_power.c | 355 ++++++++----------
> > lib/power/rte_power.h | 116 +++---
> > lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
> > lib/power/version.map | 15 +
> > 26 files changed, 665 insertions(+), 269 deletions(-)
> > rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
> > rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
> > create mode 100644 drivers/power/acpi/meson.build
> > rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> > rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
> > create mode 100644 drivers/power/amd_pstate/meson.build
> > rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
> > rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
> > create mode 100644 drivers/power/cppc/meson.build
> > rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
> > rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
> > rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
> > rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
> > create mode 100644 drivers/power/kvm_vm/meson.build
> > create mode 100644 drivers/power/meson.build
> > create mode 100644 drivers/power/pstate/meson.build
> > rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
> > rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
> > create mode 100644 lib/power/rte_power_cpufreq_api.h
> >
> > diff --git a/drivers/meson.build b/drivers/meson.build
> > index 2733306698..7ef4f581a0 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -29,6 +29,7 @@ subdirs = [
> > 'event', # depends on common, bus, mempool and net.
> > 'baseband', # depends on common and bus.
> > 'gpu', # depends on common and bus.
> > + 'power', # depends on common (in future).
> > ]
> >
> > if meson.is_cross_build()
> > diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
> > similarity index 95%
> > rename from lib/power/power_acpi_cpufreq.c
> > rename to drivers/power/acpi/acpi_cpufreq.c
> > index ae809fbb60..974fbb7ba8 100644
> > --- a/lib/power/power_acpi_cpufreq.c
> > +++ b/drivers/power/acpi/acpi_cpufreq.c
> > @@ -10,7 +10,7 @@
> > #include <rte_stdatomic.h>
> > #include <rte_string_fns.h>
> >
> > -#include "power_acpi_cpufreq.h"
> > +#include "acpi_cpufreq.h"
> > #include "power_common.h"
> >
> <...>
> > diff --git a/lib/power/power_common.c b/lib/power/power_common.c
> > index b47c63a5f1..e482f71c64 100644
> > --- a/lib/power/power_common.c
> > +++ b/lib/power/power_common.c
> > @@ -13,7 +13,7 @@
> >
> > #include "power_common.h"
> >
> > -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
> > +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
> >
> > #define POWER_SYSFILE_SCALING_DRIVER \
> > "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
> > diff --git a/lib/power/power_common.h b/lib/power/power_common.h
> > index 82fb94d0c0..c294f561bb 100644
> > --- a/lib/power/power_common.h
> > +++ b/lib/power/power_common.h
> > @@ -6,12 +6,13 @@
> > #define _POWER_COMMON_H_
> >
> > #include <rte_common.h>
> > +#include <rte_compat.h>
> > #include <rte_log.h>
> >
> > #define RTE_POWER_INVALID_FREQ_INDEX (~0)
> >
> > -extern int power_logtype;
> > -#define RTE_LOGTYPE_POWER power_logtype
> > +extern int rte_power_logtype;
> > +#define RTE_LOGTYPE_POWER rte_power_logtype
> > #define POWER_LOG(level, ...) \
> > RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
> >
> > @@ -23,14 +24,27 @@ extern int power_logtype;
> > #endif
> >
> > /* check if scaling driver matches one we want */
> > +__rte_internal
> > int cpufreq_check_scaling_driver(const char *driver);
> > +
> > +__rte_internal
> > int power_set_governor(unsigned int lcore_id, const char *new_governor,
> > char *orig_governor, size_t orig_governor_len);
> cpufreq_check_scaling_driver and power_set_governor are only used by cpufreq,
> so they shouldn't be put in this common header file.
> We came to an agreement on this in patch V2 1/4.
> I guess you forgot it😁
> I suggest moving these two APIs to rte_power_cpufreq_api.h.
OK!
> > +
> > +__rte_internal
> > int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
> > __rte_format_printf(3, 4);
> > +
> > +__rte_internal
> > int read_core_sysfs_u32(FILE *f, uint32_t *val);
> > +
> > +__rte_internal
> > int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
> > +
> > +__rte_internal
> > int write_core_sysfs_s(FILE *f, const char *str);
> > +
> > +__rte_internal
> > int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
> >
> > #endif /* _POWER_COMMON_H_ */
> > diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> > index 36c3f3da98..416f0148a3 100644
> > --- a/lib/power/rte_power.c
> > +++ b/lib/power/rte_power.c
> > @@ -6,155 +6,88 @@
> >
> > #include <rte_errno.h>
> > #include <rte_spinlock.h>
> > +#include <rte_debug.h>
> >
> > #include "rte_power.h"
> > -#include "power_acpi_cpufreq.h"
> > -#include "power_cppc_cpufreq.h"
> > #include "power_common.h"
> > -#include "power_kvm_vm.h"
> > -#include "power_pstate_cpufreq.h"
> > -#include "power_amd_pstate_cpufreq.h"
> >
> > -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static struct rte_power_core_ops *global_power_core_ops;
> >
> > static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> > -
> > -/* function pointers */
> > -rte_power_freqs_t rte_power_freqs = NULL;
> > -rte_power_get_freq_t rte_power_get_freq = NULL;
> > -rte_power_set_freq_t rte_power_set_freq = NULL;
> > -rte_power_freq_change_t rte_power_freq_up = NULL;
> > -rte_power_freq_change_t rte_power_freq_down = NULL;
> > -rte_power_freq_change_t rte_power_freq_max = NULL;
> > -rte_power_freq_change_t rte_power_freq_min = NULL;
> > -rte_power_freq_change_t rte_power_turbo_status;
> > -rte_power_freq_change_t rte_power_freq_enable_turbo;
> > -rte_power_freq_change_t rte_power_freq_disable_turbo;
> > -rte_power_get_capabilities_t rte_power_get_capabilities;
> > -
> > -static void
> > -reset_power_function_ptrs(void)
> > +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
> > + TAILQ_HEAD_INITIALIZER(core_ops_list);
> > +
> > +const char *power_env_str[] = {
> > + "not set",
> > + "acpi",
> > + "kvm-vm",
> > + "pstate",
> > + "cppc",
> > + "amd-pstate"
> > +};
> > +
>
> <...>
> > +uint32_t
> > +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
> > +}
> > +
> > +uint32_t
> > +rte_power_get_freq(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->get_freq(lcore_id);
> > +}
> > +
> > +uint32_t
> > +rte_power_set_freq(unsigned int lcore_id, uint32_t index)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->set_freq(lcore_id, index);
> > +}
> > +
> > +int
> > +rte_power_freq_up(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->freq_up(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_down(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->freq_down(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_max(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->freq_max(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_min(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->freq_min(lcore_id);
> > +}
> >
> > +int
> > +rte_power_turbo_status(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->turbo_status(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_enable_turbo(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->enable_turbo(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_freq_disable_turbo(unsigned int lcore_id)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->disable_turbo(lcore_id);
> > +}
> > +
> > +int
> > +rte_power_get_capabilities(unsigned int lcore_id,
> > + struct rte_power_core_capabilities *caps)
> > +{
> > + RTE_ASSERT(global_power_core_ops != NULL);
> > + return global_power_core_ops->get_caps(lcore_id, caps);
> > }
> > diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
> > index 4fa4afe399..e9a72b92ad 100644
> > --- a/lib/power/rte_power.h
> > +++ b/lib/power/rte_power.h
> > @@ -1,5 +1,6 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2024 Advanced Micro Devices, Inc.
> > */
> >
> > #ifndef _RTE_POWER_H
> > @@ -14,14 +15,21 @@
> > #include <rte_log.h>
> > #include <rte_power_guest_channel.h>
> >
> > +#include "rte_power_cpufreq_api.h"
> Judging by their names, rte_power.c and rte_power.h are supposed to work for all
> power libraries, as I also proposed in the previous version.
> But rte_power.* currently only works for the cpufreq lib. A generic rte_power.*
> can be created if we need to put all power components together.
> Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
> how about directly renaming rte_power.c to rte_power_cpufreq_api.c and rte_power.h
> to rte_power_cpufreq_api.h?
> There will be ABI changes, but they are allowed in 24.11. If we plan to do it later, we'll
> have to wait another year.
Yes, I split rte_power.h as part of the refactor to avoid exposing internal functions.
Renaming rte_power.* to rte_power_cpufreq.* can be considered, but not merging it with rte_power_cpufreq_api.h.
> > +
> > #ifdef __cplusplus
> > extern "C" {
> > #endif
> >
> > /* Power Management Environment State */
> > -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
> > - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
> > - PM_ENV_AMD_PSTATE_CPUFREQ};
> > +enum power_management_env {
> > + PM_ENV_NOT_SET = 0,
> > + PM_ENV_ACPI_CPUFREQ,
> > + PM_ENV_KVM_VM,
> > + PM_ENV_PSTATE_CPUFREQ,
> > + PM_ENV_CPPC_CPUFREQ,
> > + PM_ENV_AMD_PSTATE_CPUFREQ
> > +};
> >
> <...>
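The quoted patch pairs the power_management_env enum with a power_env_str[]
name table indexed by it. The sketch below shows one way such a table can be
kept in step with the enum and looked up safely. It is an illustration under
stated assumptions rather than the patch's code: the enum values and strings
are copied from the quoted diff, while env_to_str, env_from_str, ENV_STR_COUNT
and the "unknown" fallback are hypothetical names introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Enum as it appears in the quoted rte_power.h hunk. */
enum power_management_env {
	PM_ENV_NOT_SET = 0,
	PM_ENV_ACPI_CPUFREQ,
	PM_ENV_KVM_VM,
	PM_ENV_PSTATE_CPUFREQ,
	PM_ENV_CPPC_CPUFREQ,
	PM_ENV_AMD_PSTATE_CPUFREQ
};

/* Designated initializers tie each string to its enum value explicitly. */
static const char * const power_env_str[] = {
	[PM_ENV_NOT_SET] = "not set",
	[PM_ENV_ACPI_CPUFREQ] = "acpi",
	[PM_ENV_KVM_VM] = "kvm-vm",
	[PM_ENV_PSTATE_CPUFREQ] = "pstate",
	[PM_ENV_CPPC_CPUFREQ] = "cppc",
	[PM_ENV_AMD_PSTATE_CPUFREQ] = "amd-pstate",
};

#define ENV_STR_COUNT (sizeof(power_env_str) / sizeof(power_env_str[0]))

/* Hypothetical helper: bounds-checked enum-to-name lookup. */
static const char *
env_to_str(enum power_management_env env)
{
	if ((size_t)env >= ENV_STR_COUNT || power_env_str[env] == NULL)
		return "unknown";
	return power_env_str[env];
}

/* Hypothetical helper: name-to-enum lookup, PM_ENV_NOT_SET if not found. */
static enum power_management_env
env_from_str(const char *name)
{
	size_t i;

	if (name == NULL)
		return PM_ENV_NOT_SET;
	for (i = 0; i < ENV_STR_COUNT; i++) {
		if (power_env_str[i] != NULL && strcmp(power_env_str[i], name) == 0)
			return (enum power_management_env)i;
	}
	return PM_ENV_NOT_SET;
}
```

Because the table is indexed by the enum, an out-of-range value would read past
the array without the bounds check; the explicit `= 0` on the first enumerator
in the patch makes that indexing assumption visible.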
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v7 1/5] power: refactor core power management library
2024-10-22 7:13 ` Tummala, Sivaprasad
@ 2024-10-22 8:36 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-10-22 8:36 UTC (permalink / raw)
To: Tummala, Sivaprasad, david.hunt, konstantin.ananyev
Cc: dev, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, Yigit, Ferruh
On 2024/10/22 15:13, Tummala, Sivaprasad wrote:
> Hi Huisong,
>
> Please find my comments inline.
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Tuesday, October 22, 2024 8:33 AM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>;
>> david.hunt@intel.com; konstantin.ananyev@huawei.com
>> Cc: dev@dpdk.org; anatoly.burakov@intel.com; jerinj@marvell.com;
>> radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; Yigit,
>> Ferruh <Ferruh.Yigit@amd.com>
>> Subject: Re: [PATCH v7 1/5] power: refactor core power management library
>>
>>
>> Hi Sivaprasad,
>>
>> Some comments inline.
>>
>> On 2024/10/21 12:07, Sivaprasad Tummala wrote:
>>> This patch introduces a comprehensive refactor to the core power
>>> management library. The primary focus is on improving modularity and
>>> organization by relocating specific driver implementations from the
>>> 'lib/power' directory to dedicated directories within
>>> 'drivers/power/core/*'. The adjustment of meson.build files enables
>>> the selective activation of individual drivers.
>>>
>>> These changes contribute to a significant enhancement in code
>>> organization, providing a clearer structure for driver implementations.
>>> The refactor aims to improve overall code clarity and boost
>>> maintainability. Additionally, it establishes a foundation for future
>>> development, allowing for more focused work on individual drivers and
>>> seamless integration of forthcoming enhancements.
>>>
>>> v6:
>>> - fixed compilation error with symbol export in API
>>> - exported power_get_lcore_mapped_cpu_id as internal API to be
>>> used in drivers/power/*
>>>
>>> v5:
>>> - fixed code style warning
>>>
>>> v4:
>>> - fixed build error with RTE_ASSERT
>>>
>>> v3:
>>> - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
>>> - re-worked on auto detection logic
>>>
>>> v2:
>>> - added NULL check for global_core_ops in rte_power_get_core_ops
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>>> drivers/meson.build | 1 +
>>> .../power/acpi/acpi_cpufreq.c | 22 +-
>>> .../power/acpi/acpi_cpufreq.h | 6 +-
>>> drivers/power/acpi/meson.build | 10 +
>>> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
>>> .../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
>>> drivers/power/amd_pstate/meson.build | 10 +
>>> .../power/cppc/cppc_cpufreq.c | 22 +-
>>> .../power/cppc/cppc_cpufreq.h | 8 +-
>>> drivers/power/cppc/meson.build | 10 +
>>> .../power/kvm_vm}/guest_channel.c | 0
>>> .../power/kvm_vm}/guest_channel.h | 0
>>> .../power/kvm_vm/kvm_vm.c | 22 +-
>>> .../power/kvm_vm/kvm_vm.h | 6 +-
>>> drivers/power/kvm_vm/meson.build | 14 +
>>> drivers/power/meson.build | 12 +
>>> drivers/power/pstate/meson.build | 10 +
>>> .../power/pstate/pstate_cpufreq.c | 22 +-
>>> .../power/pstate/pstate_cpufreq.h | 6 +-
>>> lib/power/meson.build | 7 +-
>>> lib/power/power_common.c | 2 +-
>>> lib/power/power_common.h | 18 +-
>>> lib/power/rte_power.c | 355 ++++++++----------
>>> lib/power/rte_power.h | 116 +++---
>>> lib/power/rte_power_cpufreq_api.h | 206 ++++++++++
>>> lib/power/version.map | 15 +
>>> 26 files changed, 665 insertions(+), 269 deletions(-)
>>> rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
>>> rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
>>> create mode 100644 drivers/power/acpi/meson.build
>>> rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>>> rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
>>> create mode 100644 drivers/power/amd_pstate/meson.build
>>> rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
>>> rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
>>> create mode 100644 drivers/power/cppc/meson.build
>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>>> rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>>> rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
>>> rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
>>> create mode 100644 drivers/power/kvm_vm/meson.build
>>> create mode 100644 drivers/power/meson.build
>>> create mode 100644 drivers/power/pstate/meson.build
>>> rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
>>> rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
>>> create mode 100644 lib/power/rte_power_cpufreq_api.h
>>>
>>> diff --git a/drivers/meson.build b/drivers/meson.build
>>> index 2733306698..7ef4f581a0 100644
>>> --- a/drivers/meson.build
>>> +++ b/drivers/meson.build
>>> @@ -29,6 +29,7 @@ subdirs = [
>>> 'event', # depends on common, bus, mempool and net.
>>> 'baseband', # depends on common and bus.
>>> 'gpu', # depends on common and bus.
>>> + 'power', # depends on common (in future).
>>> ]
>>>
>>> if meson.is_cross_build()
>>> diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
>>> similarity index 95%
>>> rename from lib/power/power_acpi_cpufreq.c
>>> rename to drivers/power/acpi/acpi_cpufreq.c
>>> index ae809fbb60..974fbb7ba8 100644
>>> --- a/lib/power/power_acpi_cpufreq.c
>>> +++ b/drivers/power/acpi/acpi_cpufreq.c
>>> @@ -10,7 +10,7 @@
>>> #include <rte_stdatomic.h>
>>> #include <rte_string_fns.h>
>>>
>>> -#include "power_acpi_cpufreq.h"
>>> +#include "acpi_cpufreq.h"
>>> #include "power_common.h"
>>>
>> <...>
>>> diff --git a/lib/power/power_common.c b/lib/power/power_common.c
>>> index b47c63a5f1..e482f71c64 100644
>>> --- a/lib/power/power_common.c
>>> +++ b/lib/power/power_common.c
>>> @@ -13,7 +13,7 @@
>>>
>>> #include "power_common.h"
>>>
>>> -RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
>>> +RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
>>>
>>> #define POWER_SYSFILE_SCALING_DRIVER \
>>> "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
>>> diff --git a/lib/power/power_common.h b/lib/power/power_common.h
>>> index 82fb94d0c0..c294f561bb 100644
>>> --- a/lib/power/power_common.h
>>> +++ b/lib/power/power_common.h
>>> @@ -6,12 +6,13 @@
>>> #define _POWER_COMMON_H_
>>>
>>> #include <rte_common.h>
>>> +#include <rte_compat.h>
>>> #include <rte_log.h>
>>>
>>> #define RTE_POWER_INVALID_FREQ_INDEX (~0)
>>>
>>> -extern int power_logtype;
>>> -#define RTE_LOGTYPE_POWER power_logtype
>>> +extern int rte_power_logtype;
>>> +#define RTE_LOGTYPE_POWER rte_power_logtype
>>> #define POWER_LOG(level, ...) \
>>> RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
>>>
>>> @@ -23,14 +24,27 @@ extern int power_logtype;
>>> #endif
>>>
>>> /* check if scaling driver matches one we want */
>>> +__rte_internal
>>> int cpufreq_check_scaling_driver(const char *driver);
>>> +
>>> +__rte_internal
>>> int power_set_governor(unsigned int lcore_id, const char *new_governor,
>>> char *orig_governor, size_t orig_governor_len);
>> cpufreq_check_scaling_driver and power_set_governor are used only by cpufreq,
>> so they shouldn't be put in this common header file.
>> We came to an agreement on this in patch V2 1/4.
>> I guess you forgot it😁
>> I suggest moving these two APIs to rte_power_cpufreq_api.h.
> OK!
>>> +
>>> +__rte_internal
>>> int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
>>> __rte_format_printf(3, 4);
>>> +
>>> +__rte_internal
>>> int read_core_sysfs_u32(FILE *f, uint32_t *val);
>>> +
>>> +__rte_internal
>>> int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
>>> +
>>> +__rte_internal
>>> int write_core_sysfs_s(FILE *f, const char *str);
>>> +
>>> +__rte_internal
>>> int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
>>>
>>> #endif /* _POWER_COMMON_H_ */
>>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>>> index 36c3f3da98..416f0148a3 100644
>>> --- a/lib/power/rte_power.c
>>> +++ b/lib/power/rte_power.c
>>> @@ -6,155 +6,88 @@
>>>
>>> #include <rte_errno.h>
>>> #include <rte_spinlock.h>
>>> +#include <rte_debug.h>
>>>
>>> #include "rte_power.h"
>>> -#include "power_acpi_cpufreq.h"
>>> -#include "power_cppc_cpufreq.h"
>>> #include "power_common.h"
>>> -#include "power_kvm_vm.h"
>>> -#include "power_pstate_cpufreq.h"
>>> -#include "power_amd_pstate_cpufreq.h"
>>>
>>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static struct rte_power_core_ops *global_power_core_ops;
>>>
>>> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
>>> -
>>> -/* function pointers */
>>> -rte_power_freqs_t rte_power_freqs = NULL;
>>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>>> -rte_power_freq_change_t rte_power_turbo_status;
>>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>>> -
>>> -static void
>>> -reset_power_function_ptrs(void)
>>> +static RTE_TAILQ_HEAD(, rte_power_core_ops) core_ops_list =
>>> + TAILQ_HEAD_INITIALIZER(core_ops_list);
>>> +
>>> +const char *power_env_str[] = {
>>> + "not set",
>>> + "acpi",
>>> + "kvm-vm",
>>> + "pstate",
>>> + "cppc",
>>> + "amd-pstate"
>>> +};
>>> +
>> <...>
>>> +uint32_t
>>> +rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->get_avail_freqs(lcore_id, freqs, n);
>>> +}
>>> +
>>> +uint32_t
>>> +rte_power_get_freq(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->get_freq(lcore_id);
>>> +}
>>> +
>>> +uint32_t
>>> +rte_power_set_freq(unsigned int lcore_id, uint32_t index) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->set_freq(lcore_id, index);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_up(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->freq_up(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_down(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->freq_down(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_max(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->freq_max(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_min(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->freq_min(lcore_id);
>>> +}
>>>
>>> +int
>>> +rte_power_turbo_status(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->turbo_status(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_enable_turbo(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->enable_turbo(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_freq_disable_turbo(unsigned int lcore_id) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->disable_turbo(lcore_id);
>>> +}
>>> +
>>> +int
>>> +rte_power_get_capabilities(unsigned int lcore_id,
>>> + struct rte_power_core_capabilities *caps) {
>>> + RTE_ASSERT(global_power_core_ops != NULL);
>>> + return global_power_core_ops->get_caps(lcore_id, caps);
>>> }
>>> diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
>>> index 4fa4afe399..e9a72b92ad 100644
>>> --- a/lib/power/rte_power.h
>>> +++ b/lib/power/rte_power.h
>>> @@ -1,5 +1,6 @@
>>> /* SPDX-License-Identifier: BSD-3-Clause
>>> * Copyright(c) 2010-2014 Intel Corporation
>>> + * Copyright(c) 2024 Advanced Micro Devices, Inc.
>>> */
>>>
>>> #ifndef _RTE_POWER_H
>>> @@ -14,14 +15,21 @@
>>> #include <rte_log.h>
>>> #include <rte_power_guest_channel.h>
>>>
>>> +#include "rte_power_cpufreq_api.h"
>> Judging by their names, rte_power.c and rte_power.h are supposed to work for all
>> power libraries, as I also proposed for the previous version.
>> But rte_power.* currently works only for the cpufreq lib. If we later need a file that puts
>> all power components together, we can create it then.
>> Now that rte_power_cpufreq_api.h has been created for the cpufreq library,
>> how about directly renaming rte_power.c to rte_power_cpufreq_api.c and rte_power.h
>> to rte_power_cpufreq_api.h?
>> There will be ABI changes, but that is allowed in this 24.11 release. If we plan to do it later, we'll
>> have to wait another year.
> Yes, I split rte_power.h as part of the refactor to avoid exposing internal functions.
> Renaming rte_power.* to rte_power_cpufreq.* can be considered, but not merging it with rte_power_cpufreq_api.h.
What is your plan? I feel it is not very hard; it is just a file rename.
>>> +
>>> #ifdef __cplusplus
>>> extern "C" {
>>> #endif
>>>
>>> /* Power Management Environment State */
>>> -enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
>>> - PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
>>> - PM_ENV_AMD_PSTATE_CPUFREQ};
>>> +enum power_management_env {
>>> + PM_ENV_NOT_SET = 0,
>>> + PM_ENV_ACPI_CPUFREQ,
>>> + PM_ENV_KVM_VM,
>>> + PM_ENV_PSTATE_CPUFREQ,
>>> + PM_ENV_CPPC_CPUFREQ,
>>> + PM_ENV_AMD_PSTATE_CPUFREQ
>>> +};
>>>
>> <...>
* [PATCH v8 0/6] power: refactor power management library
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-22 1:34 ` Stephen Hemminger
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 1/6] power: refactor core " Sivaprasad Tummala
` (8 more replies)
7 siblings, 9 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of a dedicated directory for each driver under
'drivers/power/', for example 'drivers/power/acpi' and 'drivers/power/amd_uncore'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (6):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
power: rename library sources for cpu frequency management
MAINTAINERS | 1 +
app/test/test_power.c | 97 +-----
app/test/test_power_cpufreq.c | 54 +--
app/test/test_power_kvm_vm.c | 38 +-
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 14 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
.../guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 13 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/power_uncore_ops.h | 244 +++++++++++++
lib/power/rte_power.c | 257 --------------
lib/power/rte_power_cpufreq.c | 230 ++++++++++++
.../{rte_power.h => rte_power_cpufreq.h} | 120 ++++---
lib/power/rte_power_pmd_mgmt.h | 2 +-
lib/power/rte_power_uncore.c | 256 +++++++-------
lib/power/rte_power_uncore.h | 61 ++--
lib/power/version.map | 15 +
49 files changed, 1746 insertions(+), 707 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
create mode 100644 lib/power/power_uncore_ops.h
delete mode 100644 lib/power/rte_power.c
create mode 100644 lib/power/rte_power_cpufreq.c
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (73%)
--
2.34.1
* [PATCH v8 1/6] power: refactor core power management library
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 2/6] power: refactor uncore " Sivaprasad Tummala
` (7 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories under
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
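For readers new to this pattern, the driver split described above can be illustrated with a minimal, self-contained sketch. It is not the code from this patch: every name prefixed with demo_ or fake_ is hypothetical, the ops table is trimmed to two callbacks, and the definitions of struct rte_power_cpufreq_ops and RTE_POWER_REGISTER_CPUFREQ_OPS in lib/power/power_cpufreq.h are not reproduced here. The sketch only assumes that each driver registers its ops table from a constructor and that the library picks the first driver whose environment check succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down ops table (the real one has more callbacks). */
struct demo_cpufreq_ops {
	TAILQ_ENTRY(demo_cpufreq_ops) next;
	const char *name;
	int (*check_env_support)(void);        /* non-zero if usable here */
	int (*freq_up)(unsigned int lcore_id);
};

/* List of all registered drivers and the one currently selected. */
static TAILQ_HEAD(, demo_cpufreq_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);
static struct demo_cpufreq_ops *demo_active_ops;

/* Library side: reject incomplete tables, otherwise append to the list. */
static int
demo_register_ops(struct demo_cpufreq_ops *ops)
{
	assert(ops != NULL);
	if (ops->name == NULL || ops->check_env_support == NULL ||
			ops->freq_up == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Driver side: register the table before main() runs (GCC/Clang attribute). */
#define DEMO_REGISTER_CPUFREQ_OPS(ops) \
	static void __attribute__((constructor)) demo_init_##ops(void) \
	{ \
		(void)demo_register_ops(&ops); \
	}

/* Two fake drivers: "acpi" reports unsupported, "cppc" reports supported. */
static int fake_acpi_supported(void) { return 0; }
static int fake_acpi_freq_up(unsigned int lcore_id) { (void)lcore_id; return 1; }
static struct demo_cpufreq_ops fake_acpi_ops = {
	.name = "acpi",
	.check_env_support = fake_acpi_supported,
	.freq_up = fake_acpi_freq_up,
};
DEMO_REGISTER_CPUFREQ_OPS(fake_acpi_ops)

static int fake_cppc_supported(void) { return 1; }
static int fake_cppc_freq_up(unsigned int lcore_id) { (void)lcore_id; return 2; }
static struct demo_cpufreq_ops fake_cppc_ops = {
	.name = "cppc",
	.check_env_support = fake_cppc_supported,
	.freq_up = fake_cppc_freq_up,
};
DEMO_REGISTER_CPUFREQ_OPS(fake_cppc_ops)

/* Look up a registered driver by name; NULL if none matches. */
static struct demo_cpufreq_ops *
demo_find_ops(const char *name)
{
	struct demo_cpufreq_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Auto-detection: select the first driver whose environment check passes. */
static const char *
demo_autodetect(void)
{
	struct demo_cpufreq_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next) {
		if (ops->check_env_support()) {
			demo_active_ops = ops;
			return ops->name;
		}
	}
	return NULL;
}

/* Public wrapper: forward to the selected driver, fail if none is set. */
static int
demo_freq_up(unsigned int lcore_id)
{
	if (demo_active_ops == NULL)
		return -1;
	return demo_active_ops->freq_up(lcore_id);
}
```

In this sketch the library never names a concrete driver: adding or removing a driver only changes which constructors are linked in, which is the property that lets the meson.build files enable drivers selectively.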
v8:
- marked rte_power_logtype as internal
- removed c++ guards for internal header files
- renamed rte_power_cpufreq_api.h to power_cpufreq.h for naming convention
- renamed rte_power_register_ops for naming convention
v6:
- fixed compilation error with symbol export in API
- exported power_get_lcore_mapped_cpu_id as internal API to be
used in drivers/power/*
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- re-worked on auto detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/version.map | 14 +
26 files changed, 650 insertions(+), 270 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 2733306698..7ef4f581a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index ae809fbb60..81a5e3f6ea 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -587,3 +587,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..e18a3e6af8 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 2b728eca18..95495bff7d 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -710,3 +710,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 96%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..5c273df4d7 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if amd p-state power management is supported.
@@ -216,4 +216,4 @@ int power_amd_pstate_disable_turbo(unsigned int lcore_id);
int power_amd_pstate_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_AMD_PSTATET_CPUFREQ_H */
+#endif /* _AMD_PSTATE_CPUFREQ_H */
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index cc9305bdfe..3cd4165c83 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -695,3 +695,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..d637f53dcc 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 99%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
index bc3f55b6bf..35cd4cfe6f 100644
--- a/lib/power/guest_channel.c
+++ b/drivers/power/kvm_vm/guest_channel.c
@@ -13,7 +13,7 @@
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_guest_channel.h>
#include "guest_channel.h"
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..5754a441cd 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_cpufreq_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..4fabe4c6a5 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..fe11179ab3
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 4755909466..f117ff3d17 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -898,3 +898,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..b18a1ac9bc 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..dd8e4393ac 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,19 +12,14 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
+ 'power_cpufreq.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index b47c63a5f1..e482f71c64 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -13,7 +13,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 82fb94d0c0..c294f561bb 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,14 +24,27 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
+
+__rte_internal
int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/power_cpufreq.h b/lib/power/power_cpufreq.h
new file mode 100644
index 0000000000..e33d9fe18c
--- /dev/null
+++ b/lib/power/power_cpufreq.h
@@ -0,0 +1,191 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_CPUFREQ_H
+#define _POWER_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations */
+struct rte_power_cpufreq_ops {
+ RTE_TAILQ_ENTRY(rte_power_cpufreq_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success; the ops struct was appended to the list.
+ * - -EINVAL: a mandatory callback is missing from the ops struct.
+ */
+__rte_internal
+int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_cpufreq_ops(&ops); \
+}
+
+#endif
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..3168b6d301 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,88 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_cpufreq_ops *global_cpufreq_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
-
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
+ TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_cpufreq_ops, return 0 on success. */
+int
+rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_cpufreq_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_cpufreq_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
- }
-
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ goto out;
}
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_cpufreq_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,7 +97,7 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_cpufreq_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -176,82 +109,122 @@ rte_power_get_env(void) {
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
-
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ struct rte_power_cpufreq_ops *ops;
+ uint8_t env;
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->exit(lcore_id);
+
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
- }
return -1;
+}
+
+uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_avail_freqs(lcore_id, freqs, n);
+}
+
+uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_freq(lcore_id);
+}
+
+uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->set_freq(lcore_id, index);
+}
+
+int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_up(lcore_id);
+}
+
+int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_down(lcore_id);
+}
+
+int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_max(lcore_id);
+}
+
+int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_min(lcore_id);
+}
+int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->turbo_status(lcore_id);
+}
+
+int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->enable_turbo(lcore_id);
+}
+
+int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->disable_turbo(lcore_id);
+}
+
+int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_caps(lcore_id, caps);
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..7d566551bd 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "power_cpufreq.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -108,10 +116,7 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
-
-extern rte_power_freqs_t rte_power_freqs;
+uint32_t rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num);
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +129,7 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
-
-extern rte_power_get_freq_t rte_power_get_freq;
+uint32_t rte_power_get_freq(unsigned int lcore_id);
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,13 +147,12 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+uint32_t rte_power_set_freq(unsigned int lcore_id, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * Review each environment's specific documentation for usage.
*
* @param lcore_id
* lcore id.
@@ -160,66 +162,92 @@ extern rte_power_set_freq_t rte_power_set_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
-
-/**
- * Scale up the frequency of a specific lcore according to the available
- * frequencies.
- * Review each environments specific documentation for usage.
- */
-extern rte_power_freq_change_t rte_power_freq_up;
+int rte_power_freq_up(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+int rte_power_freq_down(unsigned int lcore_id);
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+int rte_power_freq_max(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+int rte_power_freq_min(unsigned int lcore_id);
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 turbo boost enabled.
+ * - 0 turbo boost disabled.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+int rte_power_turbo_status(unsigned int lcore_id);
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+int rte_power_freq_enable_turbo(unsigned int lcore_id);
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
-
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+int rte_power_freq_disable_turbo(unsigned int lcore_id);
/**
* Returns power capabilities for a specific lcore.
@@ -235,11 +263,9 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+int rte_power_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
-
#ifdef __cplusplus
}
#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..9c1ed4d9d6 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -52,3 +52,17 @@ EXPERIMENTAL {
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
};
+
+INTERNAL {
+ global:
+
+ rte_power_register_cpufreq_ops;
+ rte_power_logtype;
+ cpufreq_check_scaling_driver;
+ power_get_lcore_mapped_cpu_id;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
+};
--
2.34.1
* [PATCH v8 2/6] power: refactor uncore power management library
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 1/6] power: refactor core " Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 3/6] test/power: removed function pointer validations Sivaprasad Tummala
` (6 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
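
For readers new to the series, the registration pattern that the uncore
drivers follow can be sketched in plain C as below. This is only an
illustration and not code from the patch: it uses <sys/queue.h> and a GCC/Clang
constructor attribute in place of the DPDK RTE_TAILQ_* and RTE_INIT macros,
and every demo_* name, the trimmed two-callback ops table and the
"demo-uncore" driver are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, trimmed-down ops table; a real driver exposes more callbacks. */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;	/* link in the registry list */
	char name[DEMO_NAMESZ];			/* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

/* Registry owned by the library; drivers append themselves at load time. */
static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Reject incomplete ops tables, otherwise append to the registry. */
static int
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	if (ops->init == NULL || ops->exit == NULL)
		return -EINVAL;

	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Stand-in for a registration macro: runs before main() via a constructor. */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
static void __attribute__((constructor)) demo_uncore_init_##ops(void) \
{ \
	demo_register_uncore_ops(&ops); \
}

/* Library side: find a registered driver by name, or NULL if absent. */
static struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;

	return NULL;
}

/* Driver side: trivial callbacks and a static ops instance. */
static int demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int demo_exit(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_ops = {
	.name = "demo-uncore",
	.init = demo_init,
	.exit = demo_exit,
};

/* No trailing semicolon: the macro expands to a full function definition. */
DEMO_REGISTER_UNCORE_OPS(demo_ops)
```

With this shape the library never names a driver directly; a caller would
do something like demo_find_uncore_ops("demo-uncore") and invoke the returned
callbacks, so whether a driver is present depends only on whether its object
file was built and linked.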
v8:
- removed C++ guards for internal header files
- renamed rte_power_uncore_ops.h to power_uncore_ops.h for naming convention
v7:
- fixed build error with aarch32 gcc cross compilation
v6:
- fixed compilation error with symbol export in API
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/power_uncore_ops.h | 244 +++++++++++++++++
lib/power/rte_power_uncore.c | 256 +++++++++---------
lib/power/rte_power_uncore.h | 61 ++---
lib/power/version.map | 1 +
9 files changed, 435 insertions(+), 165 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/power_uncore_ops.h
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..b9343bd2ea 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,16 +2,15 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef _INTEL_UNCORE_H
+#define _INTEL_UNCORE_H
/**
* @file
* RTE Intel Uncore Frequency Management
*/
-#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +222,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* _INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index dd8e4393ac..5fa5d062e3 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,13 +13,13 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
+ 'power_uncore_ops.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_uncore_ops.h b/lib/power/power_uncore_ops.h
new file mode 100644
index 0000000000..12ed9d6205
--- /dev/null
+++ b/lib/power/power_uncore_ops.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_UNCORE_OPS_H
+#define _POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for a specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - Negative: error while registering ops struct (missing callbacks).
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+#endif /* _POWER_UNCORE_OPS_H */
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..e59458d7a7 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -7,103 +7,57 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
-#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
+#include "power_common.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +67,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,43 +90,43 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +134,69 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
+
+uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_freq(pkg, die);
+}
+
+int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->set_freq(pkg, die, index);
+}
+
+int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_max(pkg, die);
+}
+
+int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_min(pkg, die);
+}
+
+int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_avail_freqs(pkg, die, freqs, num);
+}
+
+int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_freqs(pkg, die);
+}
+
+unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_pkgs();
+}
+
+unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_dies(pkg);
+}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..67d55cbf96 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +11,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +115,7 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+uint32_t rte_power_get_uncore_freq(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,12 +138,14 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+int rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions.
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
*
* @param pkg
* Package number.
@@ -160,16 +159,7 @@ extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
-
-/**
- * Set minimum and maximum uncore frequency for specified die on a package
- * to maximum value according to the available frequencies.
- * It should be protected outside of this function for threadsafe.
- *
- * This function should NOT be called in the fast path.
- */
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+int rte_power_uncore_freq_max(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -177,8 +167,20 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
* It should be protected outside of this function for threadsafe.
*
* This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+int rte_power_uncore_freq_min(unsigned int pkg, unsigned int die);
/**
* Return the list of available frequencies in the index array.
@@ -200,11 +202,10 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+__rte_experimental
+int rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
uint32_t *freqs, uint32_t num);
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
-
/**
* Return the list length of available frequencies in the index array.
*
@@ -221,9 +222,7 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+int rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +234,7 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+unsigned int rte_power_uncore_get_num_pkgs(void);
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +250,7 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
-
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+unsigned int rte_power_uncore_get_num_dies(unsigned int pkg);
#ifdef __cplusplus
}
diff --git a/lib/power/version.map b/lib/power/version.map
index 9c1ed4d9d6..f442329bbc 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -57,6 +57,7 @@ INTERNAL {
global:
rte_power_register_cpufreq_ops;
+ rte_power_register_uncore_ops;
rte_power_logtype;
cpufreq_check_scaling_driver;
power_get_lcore_mapped_cpu_id;
--
2.34.1
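
Note on the registration model used by this patch: each uncore driver fills in an ops structure, registers it at constructor time, and the library later selects one registered entry by name and forwards the public calls to it. The following is a minimal, self-contained sketch of that pattern only. Every name beginning with demo_ or toy_ is hypothetical, a plain singly linked list stands in for RTE_TAILQ, and the wrapper returns -1 when no driver is selected, whereas the patch uses RTE_ASSERT in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct rte_power_uncore_ops. */
struct demo_uncore_ops {
	struct demo_uncore_ops *next;	/* simple list link, not RTE_TAILQ */
	const char *name;		/* driver name used for selection */
	int (*init)(unsigned int pkg, unsigned int die);
	int (*freq_max)(unsigned int pkg, unsigned int die);
};

static struct demo_uncore_ops *demo_ops_head;	/* registered drivers */
static struct demo_uncore_ops *demo_active_ops;	/* currently selected driver */

/* Reject incomplete drivers at registration time so callers need no checks. */
static int
demo_register_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL ||
			ops->init == NULL || ops->freq_max == NULL)
		return -1;
	ops->next = demo_ops_head;
	demo_ops_head = ops;
	return 0;
}

/* Select the active driver by name; returns -1 if it was never registered. */
static int
demo_select_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	for (ops = demo_ops_head; ops != NULL; ops = ops->next) {
		if (strcmp(ops->name, name) == 0) {
			demo_active_ops = ops;
			return 0;
		}
	}
	return -1;
}

/* Public entry point: a real function that forwards to the selected driver. */
static int
demo_uncore_freq_max(unsigned int pkg, unsigned int die)
{
	if (demo_active_ops == NULL)
		return -1;	/* assumption: report an error instead of asserting */
	return demo_active_ops->freq_max(pkg, die);
}

/* Toy driver used only to exercise the sketch. */
static int
toy_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int
toy_freq_max(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 1;	/* "frequency changed" */
}

static struct demo_uncore_ops toy_ops = {
	.name = "toy-uncore",
	.init = toy_init,
	.freq_max = toy_freq_max,
};
```

A caller would register toy_ops once, select it with demo_select_ops("toy-uncore"), and then use demo_uncore_freq_max() without knowing which driver is behind it.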
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v8 3/6] test/power: removed function pointer validations
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 1/6] power: refactor core " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 2/6] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (5 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment. Function
pointer checks are therefore unnecessary and have been removed from the
tests and applications.
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
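
Illustration of the call-site change described in the commit message: when the public symbol is a function pointer, every caller has to guard against NULL; when it is a real function, the "no driver" case can be handled once inside the library. The sketch below is a hedged, self-contained example. All demo_ names are hypothetical, and returning -1 from the wrapper when no driver is bound is an assumption made for the sketch, not the library's documented behaviour.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_freq_change_t)(unsigned int lcore_id);

/* Old model: the public symbol is a pointer, NULL until an environment is set. */
static demo_freq_change_t demo_old_freq_down;

static int
demo_old_caller(unsigned int lcore_id)
{
	if (demo_old_freq_down)		/* guard needed at every call site */
		return demo_old_freq_down(lcore_id);
	return 0;
}

/* New model: the driver pointer is private; the public symbol is a function. */
static demo_freq_change_t demo_driver_freq_down;

static int
demo_new_freq_down(unsigned int lcore_id)
{
	if (demo_driver_freq_down == NULL)
		return -1;	/* assumption: error return when no driver is bound */
	return demo_driver_freq_down(lcore_id);
}

static int
demo_new_caller(unsigned int lcore_id)
{
	/* No guard: a function name can never be NULL. */
	return demo_new_freq_down(lcore_id);
}

/* Toy driver callback used to exercise both models. */
static int
demo_step_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;	/* "frequency changed" */
}
```

With this shape, a test such as `if (rte_power_freq_down == NULL)` no longer carries information, which is why the checks in the tests and in l3fwd-power can be dropped.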
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index edbd34424e..f4522747d5 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -534,58 +534,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v8 4/6] drivers/power: uncore support for AMD EPYC processors
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-22 18:41 ` [PATCH v8 3/6] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 5/6] maintainers: update for drivers/power Sivaprasad Tummala
` (4 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
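Illustrative note, not part of the commit: the driver exposes DF P-states
through an index into a frequency table sorted from highest to lowest.
Below is a minimal sketch, assuming a caller wants to map a target
frequency onto such an index. demo_pick_uncore_index() and
demo_ver5_freqs[] are hypothetical names; the table only mirrors the
HSMP protocol version 5 values from this patch and is not part of the
driver or of DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the HSMP protocol version 5 table in this patch. */
static const uint32_t demo_ver5_freqs[] = { 1800000, 1440000, 1200000 };

/*
 * Hypothetical helper: given a table sorted from highest to lowest
 * frequency, return the index of the lowest entry that is still
 * >= target. Falls back to index 0 (the highest frequency) when no
 * entry satisfies the target, and returns UINT32_MAX for a NULL or
 * empty table.
 */
static uint32_t
demo_pick_uncore_index(const uint32_t *freqs, uint32_t nb_freqs,
		uint32_t target)
{
	uint32_t i, best = 0;

	if (freqs == NULL || nb_freqs == 0)
		return UINT32_MAX;

	for (i = 0; i < nb_freqs; i++) {
		if (freqs[i] < target)
			break;		/* every later entry is lower still */
		best = i;
	}

	return best;
}
```

The returned index could then be passed to the set_freq callback
(power_set_amd_uncore_freq() in this patch), which validates it against
nb_freqs before applying the P-state.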
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
4 files changed, 575 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the index of the DF P-state that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..a142034479
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,225 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "power_uncore_ops.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * Callers must serialize access externally to ensure thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * Callers must serialize access externally to ensure thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * Callers must serialize access externally to ensure thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * Callers must serialize access externally to ensure thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system,
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * This driver currently reports one die per valid package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi64"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
* [PATCH v8 5/6] maintainers: update for drivers/power
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-22 18:41 ` [PATCH v8 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
` (3 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index cd78bc7db1..91742f2261 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1743,6 +1743,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
* [PATCH v8 6/6] power: rename library sources for cpu frequency management
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-22 18:41 ` [PATCH v8 5/6] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 0/6] power: refactor power management library Sivaprasad Tummala
` (2 subsequent siblings)
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch renames the existing core power library source files
from rte_power.* to rte_power_cpufreq.* for better clarity.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
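Illustrative note, not part of the commit: every in-tree user below is
switched to the new header name in one step. For an out-of-tree
application that has to build against trees from both before and after
this rename, a shim along the following lines might help. This is only
a sketch: it assumes a compiler that provides __has_include (C23, or
GCC/Clang as an extension), and the APP_POWER_HEADER macro is a
hypothetical name that is not defined anywhere in DPDK.

```c
/*
 * Hypothetical compatibility shim for an application source file.
 * Prefer the renamed header and fall back to the old name.
 */
#if defined(__has_include)
#  if __has_include(<rte_power_cpufreq.h>)
#    include <rte_power_cpufreq.h>
#    define APP_POWER_HEADER "rte_power_cpufreq.h"
#  elif __has_include(<rte_power.h>)
#    include <rte_power.h>
#    define APP_POWER_HEADER "rte_power.h"
#  endif
#endif

/* Neither header was found, or the compiler lacks __has_include. */
#ifndef APP_POWER_HEADER
#  define APP_POWER_HEADER "none"
#endif
```

APP_POWER_HEADER only records which branch was taken, which can be
handy in a build-time diagnostic.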
app/test/test_power.c | 2 +-
app/test/test_power_cpufreq.c | 2 +-
app/test/test_power_kvm_vm.c | 2 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 2 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 4 ++--
lib/power/{rte_power.c => rte_power_cpufreq.c} | 2 +-
lib/power/{rte_power.h => rte_power_cpufreq.h} | 4 ++--
lib/power/rte_power_pmd_mgmt.h | 2 +-
15 files changed, 17 insertions(+), 17 deletions(-)
rename lib/power/{rte_power.c => rte_power_cpufreq.c} (99%)
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (99%)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 5df5848c70..38507411bd 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -22,7 +22,7 @@ test_power(void)
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
static int
test_power(void)
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index f4522747d5..0331b37fe0 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -30,7 +30,7 @@ test_power_caps(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_LCORE_ID 2U
#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index a7d104e973..1c72ba5a4e 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -20,7 +20,7 @@ test_power_kvm_vm(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_VM_LCORE_ID 0U
#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index ddbc387c20..ea44939fba 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -17,7 +17,7 @@
#include <rte_prefetch.h>
#include <rte_distributor.h>
#include <rte_pause.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define RX_RING_SIZE 1024
#define TX_RING_SIZE 1024
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 6bd76515e6..272e069207 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -41,7 +41,7 @@
#include <rte_udp.h>
#include <rte_string_fns.h>
#include <rte_timer.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include <rte_metrics.h>
#include <rte_telemetry.h>
diff --git a/examples/l3fwd-power/perf_core.c b/examples/l3fwd-power/perf_core.c
index 6c0f7ea213..1b5419119a 100644
--- a/examples/l3fwd-power/perf_core.c
+++ b/examples/l3fwd-power/perf_core.c
@@ -10,7 +10,7 @@
#include <rte_common.h>
#include <rte_memory.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_string_fns.h>
#include "perf_core.h"
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index f21556e27d..d4e0d685c1 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -31,7 +31,7 @@
#ifdef RTE_NET_I40E
#include <rte_pmd_i40e.h>
#endif
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <libvirt/libvirt.h>
#include "channel_monitor.h"
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
index ab69524af5..a9a257abd3 100644
--- a/examples/vm_power_manager/channel_monitor.h
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -5,7 +5,7 @@
#ifndef CHANNEL_MONITOR_H_
#define CHANNEL_MONITOR_H_
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "channel_manager.h"
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index 9da50020ac..6246cbd6b4 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -9,7 +9,7 @@
#include <string.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_log.h>
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 5eddb47847..803b6d1f82 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -18,7 +18,7 @@
#include <rte_lcore.h>
#include <rte_ethdev.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "vm_power_cli_guest.h"
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 0355a7f4bc..522c713ff4 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -15,7 +15,7 @@
#include <sys/types.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include "channel_manager.h"
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 5fa5d062e3..4f4dc19687 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,14 +13,14 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'rte_power.c',
+ 'rte_power_cpufreq.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
'power_uncore_ops.h',
- 'rte_power.h',
+ 'rte_power_cpufreq.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/rte_power.c b/lib/power/rte_power_cpufreq.c
similarity index 99%
rename from lib/power/rte_power.c
rename to lib/power/rte_power_cpufreq.c
index 3168b6d301..d91c530e34 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_spinlock.h>
#include <rte_debug.h>
-#include "rte_power.h"
+#include "rte_power_cpufreq.h"
#include "power_common.h"
static enum power_management_env global_default_env = PM_ENV_NOT_SET;
diff --git a/lib/power/rte_power.h b/lib/power/rte_power_cpufreq.h
similarity index 99%
rename from lib/power/rte_power.h
rename to lib/power/rte_power_cpufreq.h
index 7d566551bd..b68d8c0bbc 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power_cpufreq.h
@@ -3,8 +3,8 @@
* Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _RTE_POWER_H
-#define _RTE_POWER_H
+#ifndef _RTE_POWER_CPUFREQ_H
+#define _RTE_POWER_CPUFREQ_H
/**
* @file
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 807e454096..58c25bc3ff 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -13,7 +13,7 @@
#include <stdint.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#ifdef __cplusplus
extern "C" {
--
2.34.1
* [PATCH v8 0/6] power: refactor power management library
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-22 18:41 ` [PATCH v8 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
@ 2024-10-22 18:41 ` Sivaprasad Tummala
2024-10-23 1:40 ` Stephen Hemminger
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
8 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-22 18:41 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
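Illustrative note: each driver in this series fills an ops table and
registers it through a macro such as RTE_POWER_REGISTER_UNCORE_OPS (see
patch 4/6). The following is a simplified sketch of that registration
pattern, assuming a constructor-based linked-list registry as supported
by GCC and Clang. All demo_* and DEMO_* names are hypothetical, and the
actual implementation in lib/power may differ.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for a driver ops table. */
struct demo_uncore_ops {
	const char *name;		/* driver name used for lookup */
	int (*init)(unsigned int pkg, unsigned int die);
	struct demo_uncore_ops *next;	/* intrusive list link */
};

static struct demo_uncore_ops *demo_ops_head;

/* Add a driver to the registry; returns -1 on invalid input. */
static int
demo_register_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->name == NULL || ops->init == NULL)
		return -1;

	ops->next = demo_ops_head;
	demo_ops_head = ops;
	return 0;
}

/* Find a registered driver by name, or return NULL if it is absent. */
static struct demo_uncore_ops *
demo_find_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	if (name == NULL)
		return NULL;

	for (ops = demo_ops_head; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;

	return NULL;
}

/* Register an ops table at load time, before main() runs. */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
	static void __attribute__((constructor)) demo_ctor_##ops(void) \
	{ \
		demo_register_ops(&ops); \
	}

/* A driver only has to provide its callbacks and one macro call. */
static int
demo_vendor_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_vendor_ops = {
	.name = "demo-vendor",
	.init = demo_vendor_init,
};

DEMO_REGISTER_UNCORE_OPS(demo_vendor_ops)
```

With this shape the library core never refers to a driver by symbol; it
looks the ops table up by name and calls through the function pointers,
which is what lets each driver live in its own directory.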
Sivaprasad Tummala (6):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
power: rename library sources for cpu frequency management
MAINTAINERS | 1 +
app/test/test_power.c | 97 +-----
app/test/test_power_cpufreq.c | 54 +--
app/test/test_power_kvm_vm.c | 38 +-
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 14 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
.../guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 13 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/power_uncore_ops.h | 244 +++++++++++++
lib/power/rte_power.c | 257 --------------
lib/power/rte_power_cpufreq.c | 230 ++++++++++++
.../{rte_power.h => rte_power_cpufreq.h} | 120 ++++---
lib/power/rte_power_pmd_mgmt.h | 2 +-
lib/power/rte_power_uncore.c | 256 +++++++-------
lib/power/rte_power_uncore.h | 61 ++--
lib/power/version.map | 15 +
49 files changed, 1746 insertions(+), 707 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
create mode 100644 lib/power/power_uncore_ops.h
delete mode 100644 lib/power/rte_power.c
create mode 100644 lib/power/rte_power_cpufreq.c
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (73%)
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v8 0/6] power: refactor power management library
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-22 18:41 ` [PATCH v8 0/6] power: refactor power management library Sivaprasad Tummala
@ 2024-10-23 1:40 ` Stephen Hemminger
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
8 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-10-23 1:40 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong,
dev
On Tue, 22 Oct 2024 18:41:26 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
>
> Sivaprasad Tummala (6):
> power: refactor core power management library
> power: refactor uncore power management library
> test/power: removed function pointer validations
> drivers/power: uncore support for AMD EPYC processors
> maintainers: update for drivers/power
> power: rename library sources for cpu frequency management
>
> [...]
>
This has some issues with documentation.
$ ninja -C build doc
ninja: Entering directory `build'
[3/6] Generating doc/api/doxygen-html with a custom command
/home/shemminger/DPDK/power/doc/api/doxy-api-index.md:105: warning: unable to resolve reference to 'rte_power.h' for \ref command
[5/6] Running external command doc (wrapped by meson to set env)
Building docs: Doxygen_API(HTML) Doxygen_API(Manpage) DTS_API_HTML HTML_Guides
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v9 0/6] power: refactor power management library
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
` (7 preceding siblings ...)
2024-10-23 1:40 ` Stephen Hemminger
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 1/6] power: refactor core " Sivaprasad Tummala
` (7 more replies)
8 siblings, 8 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, covering both
core and uncore power management. The main change moves each driver
into its own directory under 'drivers/power/*': acpi, amd_pstate,
cppc, kvm_vm and pstate for core, and intel_uncore and amd_uncore
for uncore.
The new layout separates the library API in 'lib/power' from the
driver implementations, which makes individual drivers easier to
maintain and simplifies adding new ones, in particular the AMD
uncore driver introduced in this series.
The series also consolidates common functions for cpufreq and cppc
across the core drivers to reduce code duplication.
Sivaprasad Tummala (6):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
power: rename library sources for cpu frequency management
MAINTAINERS | 1 +
app/test/test_power.c | 97 +-----
app/test/test_power_cpufreq.c | 54 +--
app/test/test_power_kvm_vm.c | 38 +-
doc/api/doxy-api-index.md | 2 +-
doc/guides/prog_guide/power_man.rst | 24 +-
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 14 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
.../guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 13 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/power_uncore_ops.h | 244 +++++++++++++
lib/power/rte_power.c | 257 --------------
lib/power/rte_power_cpufreq.c | 230 ++++++++++++
.../{rte_power.h => rte_power_cpufreq.h} | 120 ++++---
lib/power/rte_power_pmd_mgmt.h | 2 +-
lib/power/rte_power_uncore.c | 256 +++++++-------
lib/power/rte_power_uncore.h | 61 ++--
lib/power/version.map | 15 +
51 files changed, 1766 insertions(+), 713 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
create mode 100644 lib/power/power_uncore_ops.h
delete mode 100644 lib/power/rte_power.c
create mode 100644 lib/power/rte_power_cpufreq.c
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (73%)
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v9 1/6] power: refactor core power management library
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-26 3:06 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 2/6] power: refactor uncore " Sivaprasad Tummala
` (6 subsequent siblings)
7 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch refactors the core power management library. It improves
modularity by moving the driver implementations out of 'lib/power'
and into dedicated directories under 'drivers/power/*' (acpi,
amd_pstate, cppc, kvm_vm and pstate). Each driver gets its own
meson.build file, so individual drivers can be enabled or disabled
selectively.
Each driver now describes itself with a 'struct rte_power_cpufreq_ops'
and registers it through RTE_POWER_REGISTER_CPUFREQ_OPS(), which gives
the driver implementations a clearer structure and keeps them separate
from the library API. This also provides a base for future work on
individual drivers and for integrating new ones.
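
For readers new to this pattern, the registration model used by the
drivers in the diff below (an ops structure plus a registration macro)
can be sketched roughly as follows. This is a minimal standalone
illustration, not the code in this patch: the demo_* names, the
trimmed-down structure and the constructor-based macro are all
assumptions made for the example, and the sketch relies on the
GCC/Clang 'constructor' attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down ops table (the real one has more callbacks). */
struct demo_cpufreq_ops {
	const char *name;
	int (*check_env_support)(void);
	int (*init)(unsigned int lcore_id);
	struct demo_cpufreq_ops *next;   /* link in the registry list */
};

/* Head of the list of registered drivers. */
static struct demo_cpufreq_ops *demo_ops_list;

/* Push a driver onto the registry; called from each driver's constructor. */
static void
demo_register_ops(struct demo_cpufreq_ops *ops)
{
	assert(ops != NULL && ops->name != NULL);
	ops->next = demo_ops_list;
	demo_ops_list = ops;
}

/* Hypothetical macro: emits a constructor that registers the given ops. */
#define DEMO_REGISTER_OPS(ops) \
	static void __attribute__((constructor)) demo_ctor_##ops(void) \
	{ demo_register_ops(&ops); }

/* Look a driver up by name; returns NULL if it was never registered. */
static struct demo_cpufreq_ops *
demo_find_ops(const char *name)
{
	struct demo_cpufreq_ops *ops;

	for (ops = demo_ops_list; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* A stand-in driver: it fills the table and registers itself. */
static int demo_acpi_check(void) { return 1; }
static int demo_acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_cpufreq_ops demo_acpi_ops = {
	.name = "acpi",
	.check_env_support = demo_acpi_check,
	.init = demo_acpi_init,
};

DEMO_REGISTER_OPS(demo_acpi_ops)
```

Because the constructor runs before main(), a lookup such as
demo_find_ops("acpi") finds the driver without the library having to
name it at compile time. That is the property that lets driver code
live outside the library.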
v8:
- marked rte_power_logtype as internal
- removed c++ guards for internal header files
- renamed rte_power_cpufreq_api.h for naming convention
- renamed rte_power_register_ops for naming convention
v6:
- fixed compilation error with symbol export in API
- exported power_get_lcore_mapped_cpu_id as internal API to be
used in drivers/power/*
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- reworked auto-detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
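
The auto detection and NULL check mentioned in the notes above are
not visible in this part of the diff. As a rough sketch of the
general idea only, a library can walk its registered drivers and pick
the first one whose environment check succeeds, skipping missing
entries. The demo_* names and the array-based table below are
hypothetical and chosen for brevity. They are not the data structures
used by this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal driver descriptor. */
struct demo_ops {
	const char *name;
	int (*check_env_support)(void);  /* non-zero if the driver can run here */
};

/*
 * Return the first driver that reports support, or NULL if none does.
 * NULL slots and drivers without a check callback are skipped.
 */
static const struct demo_ops *
demo_auto_detect(const struct demo_ops *const *table, size_t n)
{
	size_t i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++) {
		const struct demo_ops *ops = table[i];

		if (ops == NULL || ops->check_env_support == NULL)
			continue;
		if (ops->check_env_support())
			return ops;
	}
	return NULL;
}

/* Stand-in drivers: the first is unsupported, the second is supported. */
static int demo_no(void) { return 0; }
static int demo_yes(void) { return 1; }

static const struct demo_ops demo_acpi = { "acpi", demo_no };
static const struct demo_ops demo_pstate = { "pstate", demo_yes };

static const struct demo_ops *const demo_table[] = {
	&demo_acpi, NULL, &demo_pstate,
};
```

Callers of such a helper still have to handle a NULL result, which
corresponds to the case where no driver supports the running
environment.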
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/rte_power.c | 355 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/version.map | 14 +
26 files changed, 650 insertions(+), 270 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 2733306698..7ef4f581a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index ae809fbb60..81a5e3f6ea 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -587,3 +587,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..e18a3e6af8 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 2b728eca18..95495bff7d 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -710,3 +710,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 96%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..5c273df4d7 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if amd p-state power management is supported.
@@ -216,4 +216,4 @@ int power_amd_pstate_disable_turbo(unsigned int lcore_id);
int power_amd_pstate_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_AMD_PSTATET_CPUFREQ_H */
+#endif /* _AMD_PSTATE_CPUFREQ_H */
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index cc9305bdfe..3cd4165c83 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -695,3 +695,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..d637f53dcc 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 99%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
index bc3f55b6bf..35cd4cfe6f 100644
--- a/lib/power/guest_channel.c
+++ b/drivers/power/kvm_vm/guest_channel.c
@@ -13,7 +13,7 @@
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_guest_channel.h>
#include "guest_channel.h"
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..5754a441cd 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_cpufreq_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..4fabe4c6a5 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..fe11179ab3
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 4755909466..f117ff3d17 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -898,3 +898,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..b18a1ac9bc 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..dd8e4393ac 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,19 +12,14 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
+ 'power_cpufreq.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index b47c63a5f1..e482f71c64 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -13,7 +13,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 82fb94d0c0..c294f561bb 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,14 +24,27 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
+
+__rte_internal
int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/power_cpufreq.h b/lib/power/power_cpufreq.h
new file mode 100644
index 0000000000..e33d9fe18c
--- /dev/null
+++ b/lib/power/power_cpufreq.h
@@ -0,0 +1,191 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_CPUFREQ_H
+#define _POWER_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power operations */
+struct rte_power_cpufreq_ops {
+ RTE_TAILQ_ENTRY(rte_power_cpufreq_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - -EINVAL: Error while registering ops struct (a mandatory callback is missing).
+ */
+__rte_internal
+int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_cpufreq_ops(&ops); \
+}
+
+#endif
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..3168b6d301 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -6,155 +6,88 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_cpufreq_ops *global_cpufreq_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
-
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
+ TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* Register the ops struct in the cpufreq ops list; return 0 on success, -EINVAL if a callback is missing. */
+int
+rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
+ return -EINVAL;
+ }
+
+ TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_cpufreq_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_cpufreq_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
- }
-
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ goto out;
}
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_cpufreq_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,7 +97,7 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_cpufreq_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -176,82 +109,122 @@ rte_power_get_env(void) {
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
-
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
+ struct rte_power_cpufreq_ops *ops;
+ uint8_t env;
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->exit(lcore_id);
+
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
- }
return -1;
+}
+
+uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_avail_freqs(lcore_id, freqs, n);
+}
+
+uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_freq(lcore_id);
+}
+
+uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->set_freq(lcore_id, index);
+}
+
+int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_up(lcore_id);
+}
+
+int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_down(lcore_id);
+}
+
+int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_max(lcore_id);
+}
+
+int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_min(lcore_id);
+}
+int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->turbo_status(lcore_id);
+}
+
+int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->enable_turbo(lcore_id);
+}
+
+int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->disable_turbo(lcore_id);
+}
+
+int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_caps(lcore_id, caps);
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..7d566551bd 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "power_cpufreq.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -108,10 +116,7 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
-
-extern rte_power_freqs_t rte_power_freqs;
+uint32_t rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num);
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +129,7 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
-
-extern rte_power_get_freq_t rte_power_get_freq;
+uint32_t rte_power_get_freq(unsigned int lcore_id);
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,13 +147,12 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+uint32_t rte_power_set_freq(unsigned int lcore_id, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * Review each environment's specific documentation for usage.
*
* @param lcore_id
* lcore id.
@@ -160,66 +162,92 @@ extern rte_power_set_freq_t rte_power_set_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
-
-/**
- * Scale up the frequency of a specific lcore according to the available
- * frequencies.
- * Review each environments specific documentation for usage.
- */
-extern rte_power_freq_change_t rte_power_freq_up;
+int rte_power_freq_up(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+int rte_power_freq_down(unsigned int lcore_id);
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+int rte_power_freq_max(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+int rte_power_freq_min(unsigned int lcore_id);
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 turbo boost enabled.
+ * - 0 turbo boost disabled.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+int rte_power_turbo_status(unsigned int lcore_id);
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+int rte_power_freq_enable_turbo(unsigned int lcore_id);
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
-
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+int rte_power_freq_disable_turbo(unsigned int lcore_id);
/**
* Returns power capabilities for a specific lcore.
@@ -235,11 +263,9 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+int rte_power_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
-
#ifdef __cplusplus
}
#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..9c1ed4d9d6 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -52,3 +52,17 @@ EXPERIMENTAL {
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
};
+
+INTERNAL {
+ global:
+
+ rte_power_register_cpufreq_ops;
+ rte_power_logtype;
+ cpufreq_check_scaling_driver;
+ power_get_lcore_mapped_cpu_id;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
+};
--
2.34.1
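
For readers skimming the thread, the registration scheme that the patch above
introduces can be sketched outside DPDK roughly as follows. This is only an
illustration, not the library code: it assumes a GCC/Clang toolchain (the
constructor attribute stands in for RTE_INIT) and the BSD <sys/queue.h>
macros (standing in for the RTE_TAILQ_* wrappers), the ops table is cut down
to two callbacks, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24

/* Hypothetical, cut-down ops table; the struct in the patch has many more callbacks. */
struct demo_cpufreq_ops {
	TAILQ_ENTRY(demo_cpufreq_ops) next;	/* link to the next registered driver */
	char name[DEMO_NAMESZ];			/* driver name used for lookup */
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

TAILQ_HEAD(demo_ops_list, demo_cpufreq_ops);
static struct demo_ops_list demo_ops = TAILQ_HEAD_INITIALIZER(demo_ops);

/* Reject incomplete tables, then append the driver to the list. */
static int
demo_register_ops(struct demo_cpufreq_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops, ops, next);
	return 0;
}

/* Look a driver up by name; returns NULL when nothing matches. */
static struct demo_cpufreq_ops *
demo_find_ops(const char *name)
{
	struct demo_cpufreq_ops *ops;

	assert(name != NULL);
	TAILQ_FOREACH(ops, &demo_ops, next)
		if (strncmp(ops->name, name, DEMO_NAMESZ) == 0)
			return ops;
	return NULL;
}

/* Register a driver before main() runs (stand-in for RTE_INIT). */
#define DEMO_REGISTER_OPS(ops) \
static void __attribute__((constructor)) demo_ops_init_##ops(void) \
{ \
	demo_register_ops(&ops); \
}

/* A dummy "driver" whose callbacks always succeed. */
static int demo_pstate_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_pstate_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct demo_cpufreq_ops demo_pstate_ops = {
	.name = "pstate",
	.init = demo_pstate_init,
	.exit = demo_pstate_exit,
};

DEMO_REGISTER_OPS(demo_pstate_ops)
```

Because each driver registers itself from a constructor, the library side only
needs the name lookup (for example demo_find_ops("pstate")) and no longer has
to include any per-driver header.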
* [PATCH v9 2/6] power: refactor uncore power management library
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 1/6] power: refactor core " Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-26 3:12 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 3/6] test/power: removed function pointer validations Sivaprasad Tummala
` (5 subsequent siblings)
7 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch refactors the uncore part of the power management library.
Each uncore driver now lives in its own directory under 'drivers/power/'
(for example 'drivers/power/intel_uncore/'), and the adjusted meson.build
files allow individual drivers to be enabled selectively.
Separating the drivers from the library improves code organization and
maintainability, lets development focus on individual drivers, and
prepares for future additions, in particular the AMD uncore driver.
v9:
- documentation update
v8:
- removed c++ guards for internal header files
- renamed rte_power_uncore_ops.h for naming convention
v7:
- fixed build error with aarch32 gcc cross compilation
v6:
- fixed compilation error with symbol export in API
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
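As a reading aid, the dispatch style this series moves to (public wrapper
functions forwarding to a selected ops table, instead of exported function
pointers) can be sketched as below. This is a hedged illustration only: the
demo_* and fake_* names, the reduced ops table, and the in-memory backend are
all hypothetical. The return values follow the 1/0/negative convention
documented for the core set_freq callback, and the set wrapper returns
-ENOTSUP when no backend is selected, which is a choice made for this sketch
rather than the behaviour of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced uncore ops table (the patch's struct has more callbacks). */
struct demo_uncore_ops {
	const char *name;
	uint32_t (*get_freq)(unsigned int pkg, unsigned int die);
	int (*set_freq)(unsigned int pkg, unsigned int die, uint32_t index);
};

/* Backend currently selected; NULL until an environment is chosen. */
static const struct demo_uncore_ops *demo_active_ops;

static void
demo_uncore_select(const struct demo_uncore_ops *ops)
{
	demo_active_ops = ops;
}

/* Wrapper: a real function takes the place of an exported function pointer. */
static int
demo_uncore_set_freq(unsigned int pkg, unsigned int die, uint32_t index)
{
	if (demo_active_ops == NULL)
		return -ENOTSUP;	/* sketch-only choice, see lead-in */
	return demo_active_ops->set_freq(pkg, die, index);
}

static uint32_t
demo_uncore_get_freq(unsigned int pkg, unsigned int die)
{
	assert(demo_active_ops != NULL);
	return demo_active_ops->get_freq(pkg, die);
}

/* Fake backend keeping one frequency index per (pkg, die) in memory. */
#define DEMO_MAX_PKG 2
#define DEMO_MAX_DIE 2
static uint32_t demo_cur_idx[DEMO_MAX_PKG][DEMO_MAX_DIE];

static uint32_t
fake_get_freq(unsigned int pkg, unsigned int die)
{
	if (pkg >= DEMO_MAX_PKG || die >= DEMO_MAX_DIE)
		return UINT32_MAX;	/* hypothetical "invalid index" marker */
	return demo_cur_idx[pkg][die];
}

static int
fake_set_freq(unsigned int pkg, unsigned int die, uint32_t index)
{
	if (pkg >= DEMO_MAX_PKG || die >= DEMO_MAX_DIE)
		return -EINVAL;
	if (demo_cur_idx[pkg][die] == index)
		return 0;		/* success, nothing changed */
	demo_cur_idx[pkg][die] = index;
	return 1;			/* success, index changed */
}

static const struct demo_uncore_ops fake_uncore_ops = {
	.name = "fake-uncore",
	.get_freq = fake_get_freq,
	.set_freq = fake_set_freq,
};
```

With this shape, callers keep using plain function calls and the backend can
be swapped by calling demo_uncore_select() with a different table.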
doc/guides/prog_guide/power_man.rst | 10 +-
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/power_uncore_ops.h | 244 +++++++++++++++++
lib/power/rte_power_uncore.c | 256 +++++++++---------
lib/power/rte_power_uncore.h | 61 ++---
lib/power/version.map | 1 +
10 files changed, 440 insertions(+), 170 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/power_uncore_ops.h
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..1810ecf93b 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -191,8 +191,8 @@ API Overview for Ethernet PMD Power Management
* **Set Scaling Max Freq**: Set the maximum frequency (kHz) to be used in Frequency
Scaling mode.
-Intel Uncore API
-----------------
+Uncore API
+----------
Abstract
~~~~~~~~
@@ -211,10 +211,10 @@ which was added in 5.6.
This manipulates the context of MSR 0x620,
which sets min/max of the uncore for the SKU.
-API Overview for Intel Uncore
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Uncore API Overview
+~~~~~~~~~~~~~~~~~~~
-Overview of each function in the Intel Uncore API,
+Overview of each function in the Uncore API,
with explanation of what they do.
Each function should not be called in the fast path.
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..b9343bd2ea 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,16 +2,15 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef _INTEL_UNCORE_H
+#define _INTEL_UNCORE_H
/**
* @file
* RTE Intel Uncore Frequency Management
*/
-#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +222,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* _INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index dd8e4393ac..5fa5d062e3 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,13 +13,13 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
+ 'power_uncore_ops.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_uncore_ops.h b/lib/power/power_uncore_ops.h
new file mode 100644
index 0000000000..12ed9d6205
--- /dev/null
+++ b/lib/power/power_uncore_ops.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_UNCORE_OPS_H
+#define _POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for a specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< Power mgmt driver name. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0 on success.
+ * - Negative on error (a mandatory callback is missing).
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+#endif /* _POWER_UNCORE_OPS_H */
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..e59458d7a7 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -7,103 +7,57 @@
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
-#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
+#include "power_common.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +67,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,43 +90,43 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +134,69 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
+
+uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_freq(pkg, die);
+}
+
+int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->set_freq(pkg, die, index);
+}
+
+int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_max(pkg, die);
+}
+
+int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_min(pkg, die);
+}
+
+int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_avail_freqs(pkg, die, freqs, num);
+}
+
+int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_freqs(pkg, die);
+}
+
+unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_pkgs();
+}
+
+unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_dies(pkg);
+}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..67d55cbf96 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +11,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +115,7 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+uint32_t rte_power_get_uncore_freq(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,12 +138,14 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+int rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions.
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
*
* @param pkg
* Package number.
@@ -160,16 +159,7 @@ extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
-
-/**
- * Set minimum and maximum uncore frequency for specified die on a package
- * to maximum value according to the available frequencies.
- * It should be protected outside of this function for threadsafe.
- *
- * This function should NOT be called in the fast path.
- */
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+int rte_power_uncore_freq_max(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -177,8 +167,20 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
* It should be protected outside of this function for threadsafe.
*
* This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+int rte_power_uncore_freq_min(unsigned int pkg, unsigned int die);
/**
* Return the list of available frequencies in the index array.
@@ -200,11 +202,10 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+__rte_experimental
+int rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
uint32_t *freqs, uint32_t num);
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
-
/**
* Return the list length of available frequencies in the index array.
*
@@ -221,9 +222,7 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+int rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +234,7 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+unsigned int rte_power_uncore_get_num_pkgs(void);
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +250,7 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
-
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+unsigned int rte_power_uncore_get_num_dies(unsigned int pkg);
#ifdef __cplusplus
}
diff --git a/lib/power/version.map b/lib/power/version.map
index 9c1ed4d9d6..f442329bbc 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -57,6 +57,7 @@ INTERNAL {
global:
rte_power_register_cpufreq_ops;
+ rte_power_register_uncore_ops;
rte_power_logtype;
cpufreq_check_scaling_driver;
power_get_lcore_mapped_cpu_id;
--
2.34.1
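
For readers following the registration flow in this patch, here is a minimal, standalone sketch of the same idea: each driver fills in an ops structure, a constructor appends it to a tail queue, and the library later looks a driver up by the name associated with the requested environment. All demo_* names are hypothetical and only loosely mirror rte_power_uncore_ops. The sketch uses plain <sys/queue.h> and a GCC/Clang constructor attribute rather than the RTE_TAILQ_* and RTE_INIT macros, and only two callbacks are kept for brevity. Note the strict `<` bound on the environment index, so the name table is never read past its last entry.

```c
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_NAMESZ 24
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down ops structure (not the real rte_power_uncore_ops). */
struct demo_uncore_ops {
	TAILQ_ENTRY(demo_uncore_ops) next;	/* Next driver in the list. */
	char name[DEMO_NAMESZ];			/* Driver name used for lookup. */
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

/* List of registered drivers; statically initialized, so constructors can use it. */
static TAILQ_HEAD(, demo_uncore_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Environment index to driver name, in the same order as the enum in the patch. */
static const char * const demo_env_str[] = {
	"not set", "auto-detect", "intel-uncore", "amd-hsmp",
};

/* Reject incomplete drivers, otherwise append to the list. Returns 0 or -1. */
static int
demo_register_ops(struct demo_uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Find the driver registered for an environment index, or NULL if none. */
static struct demo_uncore_ops *
demo_find_ops(unsigned int env)
{
	struct demo_uncore_ops *ops;

	if (env >= DEMO_DIM(demo_env_str))	/* valid indexes are 0..DIM-1 */
		return NULL;
	TAILQ_FOREACH(ops, &demo_ops_list, next)
		if (strncmp(ops->name, demo_env_str[env], DEMO_NAMESZ) == 0)
			return ops;
	return NULL;
}

/* A stand-in driver that does nothing but report success. */
static int demo_init(unsigned int pkg, unsigned int die) { (void)pkg; (void)die; return 0; }
static int demo_exit(unsigned int pkg, unsigned int die) { (void)pkg; (void)die; return 0; }

static struct demo_uncore_ops demo_intel_ops = {
	.name = "intel-uncore",
	.init = demo_init,
	.exit = demo_exit,
};

/* Runs before main(), playing the role of RTE_POWER_REGISTER_UNCORE_OPS. */
__attribute__((constructor))
static void
demo_register_intel(void)
{
	demo_register_ops(&demo_intel_ops);
}
```

With only the one driver registered, demo_find_ops(2) returns &demo_intel_ops, while indexes 0, 3 and anything from 4 upwards return NULL: the first two because no driver carries that name, the last because the index is out of range.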
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v9 3/6] test/power: removed function pointer validations
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 1/6] power: refactor core " Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 2/6] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (4 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
After refactoring the power library, the power management operations are
regular functions that exist regardless of the operating environment, so
the function pointer checks are unnecessary and are removed from the tests
and from the l3fwd-power application.
v2:
- removed function pointer validation in l3fwd-power app.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
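
A small, standalone sketch of why the checks can go: once the public entry point is an ordinary function that dispatches through a driver ops pointer, a test such as `if (rte_power_freq_down)` compares a function address with NULL and is always true, so it guards nothing. All demo_* names below are hypothetical. Returning -1 when no driver is selected is an assumption made for this sketch only; the uncore wrappers in this series use RTE_ASSERT instead.

```c
#include <stddef.h>

/* Hypothetical per-driver operations table. */
struct demo_core_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* Driver selected at init time; NULL until an environment is set. */
static const struct demo_core_ops *demo_global_ops;

/* Records the last lcore the stand-in driver was asked to scale down. */
static int demo_last_lcore = -1;

/* Stand-in driver callback: report "frequency changed". */
static int
demo_drv_freq_down(unsigned int lcore_id)
{
	demo_last_lcore = (int)lcore_id;
	return 1;
}

static const struct demo_core_ops demo_drv_ops = {
	.freq_down = demo_drv_freq_down,
};

/* Select the active driver (stands in for the set-env step). */
static void
demo_power_set_ops(const struct demo_core_ops *ops)
{
	demo_global_ops = ops;
}

/*
 * Public API as a real function. Its address is never NULL, so callers
 * no longer need (or benefit from) a pointer check before calling it.
 */
static int
demo_power_freq_down(unsigned int lcore_id)
{
	if (demo_global_ops == NULL)	/* assumption: fail softly in this sketch */
		return -1;
	return demo_global_ops->freq_down(lcore_id);
}
```

Calling demo_power_freq_down(3) before a driver is selected returns -1; after demo_power_set_ops(&demo_drv_ops) the same call is forwarded to the driver callback.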
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index edbd34424e..f4522747d5 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -534,58 +534,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v9 4/6] drivers/power: uncore support for AMD EPYC processors
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-23 5:11 ` [PATCH v9 3/6] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 5/6] maintainers: update for drivers/power Sivaprasad Tummala
` (3 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v9:
- documentation update
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
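For reviewers, the index handling in amd_uncore.c can be illustrated with a small standalone sketch. This is not the driver code: the `sketch_` names and SKETCH_MAX_FREQS are made up for illustration, and the call into the E-SMI library is replaced by a comment. It only shows the assumption that the frequency table is ordered from highest to lowest, so index 0 selects the maximum and nb_freqs - 1 selects the minimum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit; stands in for RTE_MAX_UNCORE_FREQS. */
#define SKETCH_MAX_FREQS 8

struct sketch_uncore_info {
	uint32_t freqs[SKETCH_MAX_FREQS]; /* descending; index 0 is the highest */
	uint32_t nb_freqs;                /* number of valid entries in freqs[] */
	uint32_t curr_idx;                /* index of the last selected entry */
};

/* Fill the table with the three default-case values from the patch. */
static void
sketch_fill_freqs(struct sketch_uncore_info *ui)
{
	assert(ui != NULL);
	ui->freqs[0] = 1600000;
	ui->freqs[1] = 1333000;
	ui->freqs[2] = 1200000;
	ui->nb_freqs = 3;
	ui->curr_idx = 0;
}

/* Select a frequency by index; reject anything outside the table. */
static int
sketch_set_freq(struct sketch_uncore_info *ui, uint32_t idx)
{
	assert(ui != NULL);
	if (idx >= SKETCH_MAX_FREQS || idx >= ui->nb_freqs)
		return -1;
	/* The real driver would ask the E-SMI library to apply the state here. */
	ui->curr_idx = idx;
	return 0;
}

/* Index 0 is the highest frequency in the table. */
static int
sketch_freq_max(struct sketch_uncore_info *ui)
{
	return sketch_set_freq(ui, 0);
}

/* The last valid index is the lowest frequency in the table. */
static int
sketch_freq_min(struct sketch_uncore_info *ui)
{
	return sketch_set_freq(ui, ui->nb_freqs - 1);
}
```

With an empty table, nb_freqs - 1 wraps around to UINT32_MAX and the bounds check rejects it, so the minimum-frequency helper fails cleanly instead of indexing out of range.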
doc/guides/prog_guide/power_man.rst | 14 ++
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
5 files changed, 589 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 1810ecf93b..b06cc36438 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -203,6 +203,8 @@ to achieve high performance: L3 cache, on-die memory controller, etc.
Significant power savings can be achieved by reducing the uncore frequency
to its lowest value.
+Intel Uncore
+~~~~~~~~~~~~
The Linux kernel provides the driver "intel-uncore-frequency"
to control the uncore frequency limits for x86 platform.
The driver is available from kernel version 5.6 and above.
@@ -211,6 +213,18 @@ which was added in 5.6.
This manipulates the context of MSR 0x620,
which sets min/max of the uncore for the SKU.
+AMD EPYC Uncore
+~~~~~~~~~~~~~~~
+On AMD EPYC platforms, the Host System Management Port (HSMP) kernel module
+facilitates user-level access to HSMP mailboxes, which are implemented by
+the firmware in the System Management Unit (SMU).
+The AMD HSMP driver is available starting from kernel version 5.18.
+Please ensure that CONFIG_AMD_HSMP is enabled in your kernel configuration.
+
+Additionally, the EPYC System Management Interface In-band Library for Linux
+offers essential APIs, enabling user-space software to effectively manage
+system functions.
+
Uncore API Overview
~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* record the newly selected DF P-state index */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..a142034479
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,225 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "power_uncore_ops.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to minimum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ unsigned int *freqs, unsigned int num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * The driver currently reports a single die per package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_AMD_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
* [PATCH v9 5/6] maintainers: update for drivers/power
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-23 5:11 ` [PATCH v9 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
` (2 subsequent siblings)
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index cd78bc7db1..91742f2261 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1743,6 +1743,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
* [PATCH v9 6/6] power: rename library sources for cpu frequency management
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-23 5:11 ` [PATCH v9 5/6] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-26 4:09 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 0/6] power: refactor power management library Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
7 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patch renames the existing core power library source files
from rte_power.* to rte_power_cpufreq.* for better clarity.
v9:
- documentation update
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 2 +-
app/test/test_power_cpufreq.c | 2 +-
app/test/test_power_kvm_vm.c | 2 +-
doc/api/doxy-api-index.md | 2 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 2 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 4 ++--
lib/power/{rte_power.c => rte_power_cpufreq.c} | 2 +-
lib/power/{rte_power.h => rte_power_cpufreq.h} | 4 ++--
lib/power/rte_power_pmd_mgmt.h | 2 +-
16 files changed, 18 insertions(+), 18 deletions(-)
rename lib/power/{rte_power.c => rte_power_cpufreq.c} (99%)
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (99%)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 5df5848c70..38507411bd 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -22,7 +22,7 @@ test_power(void)
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
static int
test_power(void)
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index f4522747d5..0331b37fe0 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -30,7 +30,7 @@ test_power_caps(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_LCORE_ID 2U
#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index a7d104e973..1c72ba5a4e 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -20,7 +20,7 @@ test_power_kvm_vm(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_VM_LCORE_ID 0U
#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 266c8b90dc..f82570b093 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -102,7 +102,7 @@ The public API headers are grouped by topics:
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
- [power/freq](@ref rte_power.h),
+ [power/freq](@ref rte_power_cpufreq.h),
[power/uncore](@ref rte_power_uncore.h),
[PMD power](@ref rte_power_pmd_mgmt.h)
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index ddbc387c20..ea44939fba 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -17,7 +17,7 @@
#include <rte_prefetch.h>
#include <rte_distributor.h>
#include <rte_pause.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define RX_RING_SIZE 1024
#define TX_RING_SIZE 1024
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 6bd76515e6..272e069207 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -41,7 +41,7 @@
#include <rte_udp.h>
#include <rte_string_fns.h>
#include <rte_timer.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include <rte_metrics.h>
#include <rte_telemetry.h>
diff --git a/examples/l3fwd-power/perf_core.c b/examples/l3fwd-power/perf_core.c
index 6c0f7ea213..1b5419119a 100644
--- a/examples/l3fwd-power/perf_core.c
+++ b/examples/l3fwd-power/perf_core.c
@@ -10,7 +10,7 @@
#include <rte_common.h>
#include <rte_memory.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_string_fns.h>
#include "perf_core.h"
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index f21556e27d..d4e0d685c1 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -31,7 +31,7 @@
#ifdef RTE_NET_I40E
#include <rte_pmd_i40e.h>
#endif
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <libvirt/libvirt.h>
#include "channel_monitor.h"
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
index ab69524af5..a9a257abd3 100644
--- a/examples/vm_power_manager/channel_monitor.h
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -5,7 +5,7 @@
#ifndef CHANNEL_MONITOR_H_
#define CHANNEL_MONITOR_H_
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "channel_manager.h"
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index 9da50020ac..6246cbd6b4 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -9,7 +9,7 @@
#include <string.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_log.h>
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 5eddb47847..803b6d1f82 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -18,7 +18,7 @@
#include <rte_lcore.h>
#include <rte_ethdev.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "vm_power_cli_guest.h"
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 0355a7f4bc..522c713ff4 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -15,7 +15,7 @@
#include <sys/types.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include "channel_manager.h"
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 5fa5d062e3..4f4dc19687 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,14 +13,14 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'rte_power.c',
+ 'rte_power_cpufreq.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
'power_uncore_ops.h',
- 'rte_power.h',
+ 'rte_power_cpufreq.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/rte_power.c b/lib/power/rte_power_cpufreq.c
similarity index 99%
rename from lib/power/rte_power.c
rename to lib/power/rte_power_cpufreq.c
index 3168b6d301..d91c530e34 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_spinlock.h>
#include <rte_debug.h>
-#include "rte_power.h"
+#include "rte_power_cpufreq.h"
#include "power_common.h"
static enum power_management_env global_default_env = PM_ENV_NOT_SET;
diff --git a/lib/power/rte_power.h b/lib/power/rte_power_cpufreq.h
similarity index 99%
rename from lib/power/rte_power.h
rename to lib/power/rte_power_cpufreq.h
index 7d566551bd..b68d8c0bbc 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power_cpufreq.h
@@ -3,8 +3,8 @@
* Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _RTE_POWER_H
-#define _RTE_POWER_H
+#ifndef _RTE_POWER_CPUFREQ_H
+#define _RTE_POWER_CPUFREQ_H
/**
* @file
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 807e454096..58c25bc3ff 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -13,7 +13,7 @@
#include <stdint.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#ifdef __cplusplus
extern "C" {
--
2.34.1
* [PATCH v9 0/6] power: refactor power management library
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-23 5:11 ` [PATCH v9 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
@ 2024-10-23 5:11 ` Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
7 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-23 5:11 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, ferruh.yigit, konstantin.ananyev, lihuisong
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
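
As a rough illustration of the driver split described above, the sketch below shows one way a library can dispatch through a table of driver callbacks. It is only a sketch under assumptions: every `sketch_` and `toy_` name is hypothetical, the table has far fewer callbacks than the series' struct rte_power_uncore_ops, and registration is an explicit call here, which may differ from what RTE_POWER_REGISTER_UNCORE_OPS does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down ops table for illustration only. */
struct sketch_uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
	int (*freq_max)(unsigned int pkg, unsigned int die);
};

#define SKETCH_MAX_DRIVERS 4

static const struct sketch_uncore_ops *sketch_drivers[SKETCH_MAX_DRIVERS];
static size_t sketch_nb_drivers;
static const struct sketch_uncore_ops *sketch_active;

/* Each driver hands its ops table to the library once. */
static int
sketch_register(const struct sketch_uncore_ops *ops)
{
	assert(ops != NULL && ops->name != NULL);
	if (sketch_nb_drivers >= SKETCH_MAX_DRIVERS)
		return -1;
	sketch_drivers[sketch_nb_drivers++] = ops;
	return 0;
}

/* Pick the driver whose name matches; leave the selection untouched if none does. */
static int
sketch_select(const char *name)
{
	size_t i;

	for (i = 0; i < sketch_nb_drivers; i++) {
		if (strcmp(sketch_drivers[i]->name, name) == 0) {
			sketch_active = sketch_drivers[i];
			return 0;
		}
	}
	return -1;
}

/* Library entry point: a plain function that forwards to the active driver. */
static int
sketch_uncore_freq_max(unsigned int pkg, unsigned int die)
{
	if (sketch_active == NULL || sketch_active->freq_max == NULL)
		return -1;
	return sketch_active->freq_max(pkg, die);
}

/* Toy driver that accepts every request. */
static int
toy_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static int
toy_freq_max(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static const struct sketch_uncore_ops toy_ops = {
	.name = "toy",
	.init = toy_init,
	.freq_max = toy_freq_max,
};
```

Because the entry point is an ordinary function that checks for a missing driver itself, callers do not need to test a function pointer for NULL before calling it.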
Sivaprasad Tummala (6):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
power: rename library sources for cpu frequency management
MAINTAINERS | 1 +
app/test/test_power.c | 97 +-----
app/test/test_power_cpufreq.c | 54 +--
app/test/test_power_kvm_vm.c | 38 +-
doc/api/doxy-api-index.md | 2 +-
doc/guides/prog_guide/power_man.rst | 24 +-
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 14 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
.../guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 13 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/power_uncore_ops.h | 244 +++++++++++++
lib/power/rte_power.c | 257 --------------
lib/power/rte_power_cpufreq.c | 230 ++++++++++++
.../{rte_power.h => rte_power_cpufreq.h} | 120 ++++---
lib/power/rte_power_pmd_mgmt.h | 2 +-
lib/power/rte_power_uncore.c | 256 +++++++-------
lib/power/rte_power_uncore.h | 61 ++--
lib/power/version.map | 15 +
51 files changed, 1766 insertions(+), 713 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
create mode 100644 lib/power/power_uncore_ops.h
delete mode 100644 lib/power/rte_power.c
create mode 100644 lib/power/rte_power_cpufreq.c
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (73%)
--
2.34.1
* Re: [PATCH v9 1/6] power: refactor core power management library
2024-10-23 5:11 ` [PATCH v9 1/6] power: refactor core " Sivaprasad Tummala
@ 2024-10-26 3:06 ` lihuisong (C)
2024-10-26 5:22 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: lihuisong (C) @ 2024-10-26 3:06 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil
Hi Sivaprasad,
LGTM except for some trivial comments inline.
With the changes below, you can add
Acked-by: Huisong Li <lihuisong@huawei.com>
/Huisong
On 2024/10/23 13:11, Sivaprasad Tummala wrote:
> This patch introduces a comprehensive refactor to the core power
> management library. The primary focus is on improving modularity
> and organization by relocating specific driver implementations
> from the 'lib/power' directory to dedicated directories within
> 'drivers/power/core/*'. The adjustment of meson.build files
> enables the selective activation of individual drivers.
>
> These changes contribute to a significant enhancement in code
> organization, providing a clearer structure for driver implementations.
> The refactor aims to improve overall code clarity and boost
> maintainability. Additionally, it establishes a foundation for
> future development, allowing for more focused work on individual
> drivers and seamless integration of forthcoming enhancements.
>
> v8:
> - marked rte_power_logtype as internal
> - removed c++ guards for internal header files
> - renamed rte_power_cpufreq_api.h for naming convention
> - renamed rte_power_register_ops for naming convention
>
> v6:
> - fixed compilation error with symbol export in API
> - exported power_get_lcore_mapped_cpu_id as internal API to be
> used in drivers/power/*
>
> v5:
> - fixed code style warning
>
> v4:
> - fixed build error with RTE_ASSERT
>
> v3:
> - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
> - re-worked on auto detection logic
>
> v2:
> - added NULL check for global_core_ops in rte_power_get_core_ops
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
<snip>
> +
> +/**
> + * Register power cpu frequency operations.
> + *
> + * @param ops
> + * Pointer to an ops structure to register.
> + * @return
> + * - >=0: Success; return the index of the ops struct in the table.
> + * - -EINVAL - error while registering ops struct.
This is not the index in the table; the comment needs fixing.
BTW, this API always succeeds now, so it does not need a return value.
> + */
> +__rte_internal
> +int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
> +
> +/**
> + * Macro to statically register the ops of a cpufreq driver.
> + */
> +#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
> +RTE_INIT(power_hdlr_init_##ops) \
> +{ \
> + rte_power_register_cpufreq_ops(&ops); \
> +}
> +
> +#endif
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> index 36c3f3da98..3168b6d301 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power.c
> @@ -6,155 +6,88 @@
>
> #include <rte_errno.h>
> #include <rte_spinlock.h>
> +#include <rte_debug.h>
>
> #include "rte_power.h"
> -#include "power_acpi_cpufreq.h"
> -#include "power_cppc_cpufreq.h"
> #include "power_common.h"
> -#include "power_kvm_vm.h"
> -#include "power_pstate_cpufreq.h"
> -#include "power_amd_pstate_cpufreq.h"
>
> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> +static struct rte_power_cpufreq_ops *global_cpufreq_ops;
>
> static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> -
> -/* function pointers */
> -rte_power_freqs_t rte_power_freqs = NULL;
> -rte_power_get_freq_t rte_power_get_freq = NULL;
> -rte_power_set_freq_t rte_power_set_freq = NULL;
> -rte_power_freq_change_t rte_power_freq_up = NULL;
> -rte_power_freq_change_t rte_power_freq_down = NULL;
> -rte_power_freq_change_t rte_power_freq_max = NULL;
> -rte_power_freq_change_t rte_power_freq_min = NULL;
> -rte_power_freq_change_t rte_power_turbo_status;
> -rte_power_freq_change_t rte_power_freq_enable_turbo;
> -rte_power_freq_change_t rte_power_freq_disable_turbo;
> -rte_power_get_capabilities_t rte_power_get_capabilities;
> -
> -static void
> -reset_power_function_ptrs(void)
> +static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
> + TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
> +
> +const char *power_env_str[] = {
> + "not set",
> + "acpi",
> + "kvm-vm",
> + "pstate",
> + "cppc",
> + "amd-pstate"
> +};
How is "not set" used? I don't see what it is for. Do we need to
consider removing it later?
> +
> +/* register the ops struct in rte_power_cpufreq_ops, return 0 on success. */
> +int
> +rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
> {
> - rte_power_freqs = NULL;
> - rte_power_get_freq = NULL;
> - rte_power_set_freq = NULL;
> - rte_power_freq_up = NULL;
> - rte_power_freq_down = NULL;
> - rte_power_freq_max = NULL;
> - rte_power_freq_min = NULL;
> - rte_power_turbo_status = NULL;
> - rte_power_freq_enable_turbo = NULL;
> - rte_power_freq_disable_turbo = NULL;
> - rte_power_get_capabilities = NULL;
> + if (!driver_ops->init || !driver_ops->exit ||
> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
> + !driver_ops->get_freq || !driver_ops->set_freq ||
> + !driver_ops->freq_up || !driver_ops->freq_down ||
> + !driver_ops->freq_max || !driver_ops->freq_min ||
> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
> + POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
> + return -EINVAL;
> + }
> +
> + TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
> +
> + return 0;
> }
I suggest changing the function's return value as mentioned above.
>
> int
> rte_power_check_env_supported(enum power_management_env env)
> {
> - switch (env) {
> - case PM_ENV_ACPI_CPUFREQ:
> - return power_acpi_cpufreq_check_supported();
> - case PM_ENV_PSTATE_CPUFREQ:
> - return power_pstate_cpufreq_check_supported();
> - case PM_ENV_KVM_VM:
> - return power_kvm_vm_check_supported();
> - case PM_ENV_CPPC_CPUFREQ:
> - return power_cppc_cpufreq_check_supported();
> - case PM_ENV_AMD_PSTATE_CPUFREQ:
> - return power_amd_pstate_cpufreq_check_supported();
> - default:
> - rte_errno = EINVAL;
> - return -1;
> - }
> + struct rte_power_cpufreq_ops *ops;
> +
> + if (env >= RTE_DIM(power_env_str))
> + return 0;
> +
> + RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
> + if (strncmp(ops->name, power_env_str[env],
> + RTE_POWER_DRIVER_NAMESZ) == 0)
> + return ops->check_env_support();
> +
> + return 0;
> }
>
<snip>
* Re: [PATCH v9 2/6] power: refactor uncore power management library
2024-10-23 5:11 ` [PATCH v9 2/6] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-26 3:12 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-10-26 3:12 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil
Already reviewed before.
Acked-by: Huisong Li <lihuisong@huawei.com>
On 2024/10/23 13:11, Sivaprasad Tummala wrote:
> This patch refactors the power management library, addressing uncore
> power management. The primary changes involve the creation of dedicated
> directories for each driver within 'drivers/power/uncore/*'. The
> adjustment of meson.build files enables the selective activation
> of individual drivers.
>
> This refactor significantly improves code organization, enhances
> clarity and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> v9:
> - documentation update
>
> v8:
> - removed c++ guards for internal header files
> - renamed rte_power_uncore_ops.h for naming convention
>
> v7:
> - fixed build error with aarch32 gcc cross compilation
>
> v6:
> - fixed compilation error with symbol export in API
>
> v5:
> - fixed build errors for risc-v/ppc targets
>
> v4:
> - fixed build error with RTE_ASSERT
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> doc/guides/prog_guide/power_man.rst | 10 +-
> .../power/intel_uncore/intel_uncore.c | 18 +-
> .../power/intel_uncore/intel_uncore.h | 9 +-
> drivers/power/intel_uncore/meson.build | 6 +
> drivers/power/meson.build | 3 +-
> lib/power/meson.build | 2 +-
> lib/power/power_uncore_ops.h | 244 +++++++++++++++++
> lib/power/rte_power_uncore.c | 256 +++++++++---------
> lib/power/rte_power_uncore.h | 61 ++---
> lib/power/version.map | 1 +
> 10 files changed, 440 insertions(+), 170 deletions(-)
> rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
> rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
> create mode 100644 drivers/power/intel_uncore/meson.build
> create mode 100644 lib/power/power_uncore_ops.h
<snip>
* Re: [PATCH v9 6/6] power: rename library sources for cpu frequency management
2024-10-23 5:11 ` [PATCH v9 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
@ 2024-10-26 4:09 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-10-26 4:09 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: dev, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
cristian.dumitrescu, konstantin.ananyev, ferruh.yigit, gakhil,
Stephen Hemminger
Hi Sivaprasad,
On 2024/10/23 13:11, Sivaprasad Tummala wrote:
> This patch renames the existing core power library source files
> from rte_power.* to rte_power_cpufreq.* for better clarity
The split between cpufreq and uncore is much clearer now. Thank you for this work.
But I have another question: the pmd_mgmt component also depends on the
initialization of the cpufreq layer.
So I guess pmd_mgmt is also related to core DVFS.
As Stephen mentioned elsewhere, it would be very convenient for applications
if all core-related configuration could be gathered behind one API.
I am still not sure whether we can do that. But that's okay, we can do this
later if possible.
I am just worried about whether the current name of rte_power_init(), which
is an external interface, still fits what it does.
Now that the core power library source files are renamed, the
rte_power_init() in rte_power_cpufreq.c may need to be renamed too.
How about the following points?
1> rename rte_power_init() to rte_power_cpufreq_init(), used only for the
initialization of the cpufreq library.
2> create a new file rte_power_core.c with rte_power_core_init(), which
does what Stephen suggested.
3> applications use rte_power_core_init() in place of
rte_power_init().
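To make the three points concrete, here is a rough sketch of the layering I have in mind. It is only an illustration: every demo_* name is a hypothetical stand-in, nothing here is existing DPDK code, and the "other per-core setup" step is a placeholder for whatever the combined API would eventually cover.

```c
#include <assert.h>

#define DEMO_MAX_LCORE 4

/* Per-lcore flags standing in for real library state (hypothetical). */
static int demo_cpufreq_ready[DEMO_MAX_LCORE];
static int demo_core_ready[DEMO_MAX_LCORE];

/* Point 1: cpufreq-only init, standing in for a renamed rte_power_init(). */
static int
demo_power_cpufreq_init(unsigned int lcore_id)
{
	if (lcore_id >= DEMO_MAX_LCORE)
		return -1;
	demo_cpufreq_ready[lcore_id] = 1;
	return 0;
}

/* Point 2: one per-core entry point that wraps the cpufreq init. */
static int
demo_power_core_init(unsigned int lcore_id)
{
	int ret = demo_power_cpufreq_init(lcore_id);

	if (ret != 0)
		return ret;
	/* Placeholder: other core-related setup would be added here. */
	demo_core_ready[lcore_id] = 1;
	return 0;
}
```

With this shape, point 3 means an application calls only demo_power_core_init() per lcore and never touches the cpufreq init directly.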
/Huisong
>
> v9:
> - documentation update
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> app/test/test_power.c | 2 +-
> app/test/test_power_cpufreq.c | 2 +-
> app/test/test_power_kvm_vm.c | 2 +-
> doc/api/doxy-api-index.md | 2 +-
> examples/distributor/main.c | 2 +-
> examples/l3fwd-power/main.c | 2 +-
> examples/l3fwd-power/perf_core.c | 2 +-
> examples/vm_power_manager/channel_monitor.c | 2 +-
> examples/vm_power_manager/channel_monitor.h | 2 +-
> examples/vm_power_manager/guest_cli/main.c | 2 +-
> examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 2 +-
> examples/vm_power_manager/power_manager.c | 2 +-
> lib/power/meson.build | 4 ++--
> lib/power/{rte_power.c => rte_power_cpufreq.c} | 2 +-
> lib/power/{rte_power.h => rte_power_cpufreq.h} | 4 ++--
> lib/power/rte_power_pmd_mgmt.h | 2 +-
> 16 files changed, 18 insertions(+), 18 deletions(-)
> rename lib/power/{rte_power.c => rte_power_cpufreq.c} (99%)
> rename lib/power/{rte_power.h => rte_power_cpufreq.h} (99%)
>
> diff --git a/app/test/test_power.c b/app/test/test_power.c
> index 5df5848c70..38507411bd 100644
> --- a/app/test/test_power.c
> +++ b/app/test/test_power.c
> @@ -22,7 +22,7 @@ test_power(void)
>
> #else
>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> static int
> test_power(void)
> diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
> index f4522747d5..0331b37fe0 100644
> --- a/app/test/test_power_cpufreq.c
> +++ b/app/test/test_power_cpufreq.c
> @@ -30,7 +30,7 @@ test_power_caps(void)
> }
>
> #else
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #define TEST_POWER_LCORE_ID 2U
> #define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
> diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
> index a7d104e973..1c72ba5a4e 100644
> --- a/app/test/test_power_kvm_vm.c
> +++ b/app/test/test_power_kvm_vm.c
> @@ -20,7 +20,7 @@ test_power_kvm_vm(void)
> }
>
> #else
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #define TEST_POWER_VM_LCORE_ID 0U
> #define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 266c8b90dc..f82570b093 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -102,7 +102,7 @@ The public API headers are grouped by topics:
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> - [power/freq](@ref rte_power.h),
> + [power/freq](@ref rte_power_cpufreq.h),
> [power/uncore](@ref rte_power_uncore.h),
> [PMD power](@ref rte_power_pmd_mgmt.h)
>
> diff --git a/examples/distributor/main.c b/examples/distributor/main.c
> index ddbc387c20..ea44939fba 100644
> --- a/examples/distributor/main.c
> +++ b/examples/distributor/main.c
> @@ -17,7 +17,7 @@
> #include <rte_prefetch.h>
> #include <rte_distributor.h>
> #include <rte_pause.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #define RX_RING_SIZE 1024
> #define TX_RING_SIZE 1024
> diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
> index 6bd76515e6..272e069207 100644
> --- a/examples/l3fwd-power/main.c
> +++ b/examples/l3fwd-power/main.c
> @@ -41,7 +41,7 @@
> #include <rte_udp.h>
> #include <rte_string_fns.h>
> #include <rte_timer.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
> #include <rte_spinlock.h>
> #include <rte_metrics.h>
> #include <rte_telemetry.h>
> diff --git a/examples/l3fwd-power/perf_core.c b/examples/l3fwd-power/perf_core.c
> index 6c0f7ea213..1b5419119a 100644
> --- a/examples/l3fwd-power/perf_core.c
> +++ b/examples/l3fwd-power/perf_core.c
> @@ -10,7 +10,7 @@
> #include <rte_common.h>
> #include <rte_memory.h>
> #include <rte_lcore.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
> #include <rte_string_fns.h>
>
> #include "perf_core.h"
> diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
> index f21556e27d..d4e0d685c1 100644
> --- a/examples/vm_power_manager/channel_monitor.c
> +++ b/examples/vm_power_manager/channel_monitor.c
> @@ -31,7 +31,7 @@
> #ifdef RTE_NET_I40E
> #include <rte_pmd_i40e.h>
> #endif
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #include <libvirt/libvirt.h>
> #include "channel_monitor.h"
> diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
> index ab69524af5..a9a257abd3 100644
> --- a/examples/vm_power_manager/channel_monitor.h
> +++ b/examples/vm_power_manager/channel_monitor.h
> @@ -5,7 +5,7 @@
> #ifndef CHANNEL_MONITOR_H_
> #define CHANNEL_MONITOR_H_
>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #include "channel_manager.h"
>
> diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
> index 9da50020ac..6246cbd6b4 100644
> --- a/examples/vm_power_manager/guest_cli/main.c
> +++ b/examples/vm_power_manager/guest_cli/main.c
> @@ -9,7 +9,7 @@
> #include <string.h>
>
> #include <rte_lcore.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
> #include <rte_debug.h>
> #include <rte_eal.h>
> #include <rte_log.h>
> diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
> index 5eddb47847..803b6d1f82 100644
> --- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
> +++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
> @@ -18,7 +18,7 @@
> #include <rte_lcore.h>
> #include <rte_ethdev.h>
>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #include "vm_power_cli_guest.h"
>
> diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
> index 0355a7f4bc..522c713ff4 100644
> --- a/examples/vm_power_manager/power_manager.c
> +++ b/examples/vm_power_manager/power_manager.c
> @@ -15,7 +15,7 @@
> #include <sys/types.h>
>
> #include <rte_log.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
> #include <rte_spinlock.h>
>
> #include "channel_manager.h"
> diff --git a/lib/power/meson.build b/lib/power/meson.build
> index 5fa5d062e3..4f4dc19687 100644
> --- a/lib/power/meson.build
> +++ b/lib/power/meson.build
> @@ -13,14 +13,14 @@ if not is_linux
> endif
> sources = files(
> 'power_common.c',
> - 'rte_power.c',
> + 'rte_power_cpufreq.c',
> 'rte_power_uncore.c',
> 'rte_power_pmd_mgmt.c',
> )
> headers = files(
> 'power_cpufreq.h',
> 'power_uncore_ops.h',
> - 'rte_power.h',
> + 'rte_power_cpufreq.h',
> 'rte_power_guest_channel.h',
> 'rte_power_pmd_mgmt.h',
> 'rte_power_uncore.h',
> diff --git a/lib/power/rte_power.c b/lib/power/rte_power_cpufreq.c
> similarity index 99%
> rename from lib/power/rte_power.c
> rename to lib/power/rte_power_cpufreq.c
> index 3168b6d301..d91c530e34 100644
> --- a/lib/power/rte_power.c
> +++ b/lib/power/rte_power_cpufreq.c
> @@ -8,7 +8,7 @@
> #include <rte_spinlock.h>
> #include <rte_debug.h>
>
> -#include "rte_power.h"
> +#include "rte_power_cpufreq.h"
> #include "power_common.h"
>
> static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> diff --git a/lib/power/rte_power.h b/lib/power/rte_power_cpufreq.h
> similarity index 99%
> rename from lib/power/rte_power.h
> rename to lib/power/rte_power_cpufreq.h
> index 7d566551bd..b68d8c0bbc 100644
> --- a/lib/power/rte_power.h
> +++ b/lib/power/rte_power_cpufreq.h
> @@ -3,8 +3,8 @@
> * Copyright(c) 2024 Advanced Micro Devices, Inc.
> */
>
> -#ifndef _RTE_POWER_H
> -#define _RTE_POWER_H
> +#ifndef _RTE_POWER_CPUFREQ_H
> +#define _RTE_POWER_CPUFREQ_H
>
> /**
> * @file
> diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
> index 807e454096..58c25bc3ff 100644
> --- a/lib/power/rte_power_pmd_mgmt.h
> +++ b/lib/power/rte_power_pmd_mgmt.h
> @@ -13,7 +13,7 @@
> #include <stdint.h>
>
> #include <rte_log.h>
> -#include <rte_power.h>
> +#include <rte_power_cpufreq.h>
>
> #ifdef __cplusplus
> extern "C" {
* RE: [PATCH v9 1/6] power: refactor core power management library
2024-10-26 3:06 ` lihuisong (C)
@ 2024-10-26 5:22 ` Tummala, Sivaprasad
2024-10-26 7:03 ` lihuisong (C)
0 siblings, 1 reply; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-10-26 5:22 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
Hi Huisong,
> -----Original Message-----
> From: lihuisong (C) <lihuisong@huawei.com>
> Sent: Saturday, October 26, 2024 8:37 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
> jerinj@marvell.com; radu.nicolau@intel.com; cristian.dumitrescu@intel.com;
> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> gakhil@marvell.com
> Subject: Re: [PATCH v9 1/6] power: refactor core power management library
>
>
> Hi Sivaprasad,
>
> LGTM except for some trivial comments inline. With the changes below, you can add
> Acked-by: Huisong Li <lihuisong@huawei.com>
>
> /Huisong
> On 2024/10/23 13:11, Sivaprasad Tummala wrote:
> > This patch introduces a comprehensive refactor to the core power
> > management library. The primary focus is on improving modularity and
> > organization by relocating specific driver implementations from the
> > 'lib/power' directory to dedicated directories within
> > 'drivers/power/core/*'. The adjustment of meson.build files enables
> > the selective activation of individual drivers.
> >
> > These changes contribute to a significant enhancement in code
> > organization, providing a clearer structure for driver implementations.
> > The refactor aims to improve overall code clarity and boost
> > maintainability. Additionally, it establishes a foundation for future
> > development, allowing for more focused work on individual drivers and
> > seamless integration of forthcoming enhancements.
> >
> > v8:
> > - marked rte_power_logtype as internal
> > - removed c++ guards for internal header files
> > - renamed rte_power_cpufreq_api.h for naming convention
> > - renamed rte_power_register_ops for naming convention
> >
> > v6:
> > - fixed compilation error with symbol export in API
> > - exported power_get_lcore_mapped_cpu_id as internal API to be
> > used in drivers/power/*
> >
> > v5:
> > - fixed code style warning
> >
> > v4:
> > - fixed build error with RTE_ASSERT
> >
> > v3:
> > - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
> > - re-worked on auto detection logic
> >
> > v2:
> > - added NULL check for global_core_ops in rte_power_get_core_ops
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> <snip>
> > +
> > +/**
> > + * Register power cpu frequency operations.
> > + *
> > + * @param ops
> > + * Pointer to an ops structure to register.
> > + * @return
> > + * - >=0: Success; return the index of the ops struct in the table.
> > + * - -EINVAL - error while registering ops struct.
> This is not the index in the table; the comment needs fixing.
> BTW, this API always succeeds now, so it does not need a return value.
The API now consistently returns 0 on success rather than an index in the table, and a negative value on error.
I'll update the documentation to reflect this and avoid any confusion.
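For illustration only, a minimal sketch of that return convention. It uses a hypothetical, trimmed-down ops struct with just two callbacks; the demo_* names are assumptions and not the code in the patch, which checks the full callback set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down ops struct; the real one has more callbacks. */
struct demo_cpufreq_ops {
	TAILQ_ENTRY(demo_cpufreq_ops) next;
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

static TAILQ_HEAD(, demo_cpufreq_ops) demo_ops_list =
	TAILQ_HEAD_INITIALIZER(demo_ops_list);

/* Return 0 on success, -EINVAL if a mandatory callback is missing. */
static int
demo_register_cpufreq_ops(struct demo_cpufreq_ops *ops)
{
	assert(ops != NULL);
	if (ops->init == NULL || ops->exit == NULL)
		return -EINVAL;
	TAILQ_INSERT_TAIL(&demo_ops_list, ops, next);
	return 0;
}

/* Trivial callbacks so that a complete ops struct can be built. */
static int demo_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }
```

A caller only needs to compare the result against 0; there is no index to keep track of.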
> > + */
> > +__rte_internal
> > +int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
> > +
> > +/**
> > + * Macro to statically register the ops of a cpufreq driver.
> > + */
> > +#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
> > +RTE_INIT(power_hdlr_init_##ops) \
> > +{ \
> > + rte_power_register_cpufreq_ops(&ops); \
> > +}
> > +
> > +#endif
> > diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
> > index 36c3f3da98..3168b6d301 100644
> > --- a/lib/power/rte_power.c
> > +++ b/lib/power/rte_power.c
> > @@ -6,155 +6,88 @@
> >
> > #include <rte_errno.h>
> > #include <rte_spinlock.h>
> > +#include <rte_debug.h>
> >
> > #include "rte_power.h"
> > -#include "power_acpi_cpufreq.h"
> > -#include "power_cppc_cpufreq.h"
> > #include "power_common.h"
> > -#include "power_kvm_vm.h"
> > -#include "power_pstate_cpufreq.h"
> > -#include "power_amd_pstate_cpufreq.h"
> >
> > -enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
> > +static struct rte_power_cpufreq_ops *global_cpufreq_ops;
> >
> > static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
> > -
> > -/* function pointers */
> > -rte_power_freqs_t rte_power_freqs = NULL;
> > -rte_power_get_freq_t rte_power_get_freq = NULL;
> > -rte_power_set_freq_t rte_power_set_freq = NULL;
> > -rte_power_freq_change_t rte_power_freq_up = NULL;
> > -rte_power_freq_change_t rte_power_freq_down = NULL;
> > -rte_power_freq_change_t rte_power_freq_max = NULL;
> > -rte_power_freq_change_t rte_power_freq_min = NULL;
> > -rte_power_freq_change_t rte_power_turbo_status;
> > -rte_power_freq_change_t rte_power_freq_enable_turbo;
> > -rte_power_freq_change_t rte_power_freq_disable_turbo;
> > -rte_power_get_capabilities_t rte_power_get_capabilities;
> > -
> > -static void
> > -reset_power_function_ptrs(void)
> > +static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
> > + TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
> > +
> > +const char *power_env_str[] = {
> > + "not set",
> > + "acpi",
> > + "kvm-vm",
> > + "pstate",
> > + "cppc",
> > + "amd-pstate"
> > +};
>
> How is "not set" used? I don't see what it is for. Do we need to consider
> removing it later?
"not set" is the default state and indicates that no specific cpufreq management driver is active.
If a specific driver (located in drivers/power/*) is disabled at build time,
the API will fail to configure that environment, leaving it in the "not set" state.
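As a rough sketch of that behaviour (illustrative only: the demo_* names and the list of built drivers are assumptions, and the real code walks the registered ops list instead of a fixed array):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index 0 is the "not set" default; the rest mirror power_env_str[]. */
static const char *const demo_env_str[] = {
	"not set", "acpi", "kvm-vm", "pstate", "cppc", "amd-pstate"
};
#define DEMO_ENV_COUNT (sizeof(demo_env_str) / sizeof(demo_env_str[0]))

/* Hypothetical build in which only these two drivers were enabled. */
static const char *const demo_built_drivers[] = { "acpi", "cppc" };
#define DEMO_DRIVER_COUNT \
	(sizeof(demo_built_drivers) / sizeof(demo_built_drivers[0]))

static unsigned int demo_current_env; /* 0 means "not set" */

/* Return 1 if a built driver matches the env name, 0 otherwise. */
static int
demo_env_has_driver(unsigned int env)
{
	size_t i;

	if (env >= DEMO_ENV_COUNT)
		return 0;
	for (i = 0; i < DEMO_DRIVER_COUNT; i++)
		if (strcmp(demo_built_drivers[i], demo_env_str[env]) == 0)
			return 1;
	return 0;
}

/* Select an env; on failure the current state is left unchanged. */
static int
demo_set_env(unsigned int env)
{
	if (!demo_env_has_driver(env))
		return -1;
	demo_current_env = env;
	return 0;
}

/* Name of the currently selected env, "not set" by default. */
static const char *
demo_get_env_name(void)
{
	assert(demo_current_env < DEMO_ENV_COUNT);
	return demo_env_str[demo_current_env];
}
```

Here index 3 ("pstate") has no built driver, so selecting it fails and the name reported stays "not set".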
>
> > +
> > +/* register the ops struct in rte_power_cpufreq_ops, return 0 on success. */
> > +int
> > +rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
> > {
> > - rte_power_freqs = NULL;
> > - rte_power_get_freq = NULL;
> > - rte_power_set_freq = NULL;
> > - rte_power_freq_up = NULL;
> > - rte_power_freq_down = NULL;
> > - rte_power_freq_max = NULL;
> > - rte_power_freq_min = NULL;
> > - rte_power_turbo_status = NULL;
> > - rte_power_freq_enable_turbo = NULL;
> > - rte_power_freq_disable_turbo = NULL;
> > - rte_power_get_capabilities = NULL;
> > + if (!driver_ops->init || !driver_ops->exit ||
> > + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
> > + !driver_ops->get_freq || !driver_ops->set_freq ||
> > + !driver_ops->freq_up || !driver_ops->freq_down ||
> > + !driver_ops->freq_max || !driver_ops->freq_min ||
> > + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
> > + !driver_ops->disable_turbo || !driver_ops->get_caps) {
> > + POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
> > + return -EINVAL;
> > + }
> > +
> > + TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
> > +
> > + return 0;
> > }
> I suggest changing the function's return value as mentioned above.
Same as above.
> >
> > int
> > rte_power_check_env_supported(enum power_management_env env)
> > {
> > - switch (env) {
> > - case PM_ENV_ACPI_CPUFREQ:
> > - return power_acpi_cpufreq_check_supported();
> > - case PM_ENV_PSTATE_CPUFREQ:
> > - return power_pstate_cpufreq_check_supported();
> > - case PM_ENV_KVM_VM:
> > - return power_kvm_vm_check_supported();
> > - case PM_ENV_CPPC_CPUFREQ:
> > - return power_cppc_cpufreq_check_supported();
> > - case PM_ENV_AMD_PSTATE_CPUFREQ:
> > - return power_amd_pstate_cpufreq_check_supported();
> > - default:
> > - rte_errno = EINVAL;
> > - return -1;
> > - }
> > + struct rte_power_cpufreq_ops *ops;
> > +
> > + if (env >= RTE_DIM(power_env_str))
> > + return 0;
> > +
> > + RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
> > + if (strncmp(ops->name, power_env_str[env],
> > + RTE_POWER_DRIVER_NAMESZ) == 0)
> > + return ops->check_env_support();
> > +
> > + return 0;
> > }
> >
> <snip>
* Re: [PATCH v9 1/6] power: refactor core power management library
2024-10-26 5:22 ` Tummala, Sivaprasad
@ 2024-10-26 7:03 ` lihuisong (C)
0 siblings, 0 replies; 139+ messages in thread
From: lihuisong (C) @ 2024-10-26 7:03 UTC (permalink / raw)
To: Tummala, Sivaprasad
Cc: dev, david.hunt, anatoly.burakov, jerinj, radu.nicolau,
cristian.dumitrescu, konstantin.ananyev, Yigit, Ferruh, gakhil
On 2024/10/26 13:22, Tummala, Sivaprasad wrote:
> Hi Huisong,
>
>> -----Original Message-----
>> From: lihuisong (C) <lihuisong@huawei.com>
>> Sent: Saturday, October 26, 2024 8:37 AM
>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>> Cc: dev@dpdk.org; david.hunt@intel.com; anatoly.burakov@intel.com;
>> jerinj@marvell.com; radu.nicolau@intel.com; cristian.dumitrescu@intel.com;
>> konstantin.ananyev@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
>> gakhil@marvell.com
>> Subject: Re: [PATCH v9 1/6] power: refactor core power management library
>>
>>
>> Hi Sivaprasad,
>>
>> LGTM except for some trivial comments inline. With the changes below, you can add
>> Acked-by: Huisong Li <lihuisong@huawei.com>
>>
>> /Huisong
>> On 2024/10/23 13:11, Sivaprasad Tummala wrote:
>>> This patch introduces a comprehensive refactor to the core power
>>> management library. The primary focus is on improving modularity and
>>> organization by relocating specific driver implementations from the
>>> 'lib/power' directory to dedicated directories within
>>> 'drivers/power/core/*'. The adjustment of meson.build files enables
>>> the selective activation of individual drivers.
>>>
>>> These changes contribute to a significant enhancement in code
>>> organization, providing a clearer structure for driver implementations.
>>> The refactor aims to improve overall code clarity and boost
>>> maintainability. Additionally, it establishes a foundation for future
>>> development, allowing for more focused work on individual drivers and
>>> seamless integration of forthcoming enhancements.
>>>
>>> v8:
>>> - marked rte_power_logtype as internal
>>> - removed c++ guards for internal header files
>>> - renamed rte_power_cpufreq_api.h for naming convention
>>> - renamed rte_power_register_ops for naming convention
>>>
>>> v6:
>>> - fixed compilation error with symbol export in API
>>> - exported power_get_lcore_mapped_cpu_id as internal API to be
>>> used in drivers/power/*
>>>
>>> v5:
>>> - fixed code style warning
>>>
>>> v4:
>>> - fixed build error with RTE_ASSERT
>>>
>>> v3:
>>> - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
>>> - re-worked on auto detection logic
>>>
>>> v2:
>>> - added NULL check for global_core_ops in rte_power_get_core_ops
>>>
>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>> ---
>> <snip>
>>> +
>>> +/**
>>> + * Register power cpu frequency operations.
>>> + *
>>> + * @param ops
>>> + * Pointer to an ops structure to register.
>>> + * @return
>>> + * - >=0: Success; return the index of the ops struct in the table.
>>> + * - -EINVAL - error while registering ops struct.
>> This is not the index in the table; it needs to be fixed.
>> BTW, this API always succeeds now, so it needs no return value.
> The API now consistently returns 0 on success rather than an index in the table, and a negative value on error.
> I'll update the documentation to reflect this change and avoid any confusion.
Ack
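The return convention discussed above (0 on success, a negative errno value
when a mandatory callback is missing) and the constructor-based static
registration can be sketched as follows. This is only an illustration under
stated assumptions: struct sketch_cpufreq_ops is a hypothetical, reduced
stand-in for struct rte_power_cpufreq_ops, every sketch_* and dummy_* name is
made up for the example, and the GCC/Clang constructor attribute is used in
place of RTE_INIT.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, reduced stand-in for struct rte_power_cpufreq_ops. */
struct sketch_cpufreq_ops {
	TAILQ_ENTRY(sketch_cpufreq_ops) next;
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

/* Statically initialized, so it is usable from constructors. */
static TAILQ_HEAD(, sketch_cpufreq_ops) sketch_ops_list =
	TAILQ_HEAD_INITIALIZER(sketch_ops_list);

/* Return 0 on success, -EINVAL if the struct or a callback is missing. */
int
sketch_register_cpufreq_ops(struct sketch_cpufreq_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -EINVAL;

	TAILQ_INSERT_TAIL(&sketch_ops_list, ops, next);
	return 0;
}

/* Count the drivers registered so far. */
unsigned int
sketch_registered_count(void)
{
	struct sketch_cpufreq_ops *ops;
	unsigned int n = 0;

	TAILQ_FOREACH(ops, &sketch_ops_list, next)
		n++;
	return n;
}

/* Register an ops struct before main() runs (assumes GCC or Clang). */
#define SKETCH_REGISTER_CPUFREQ_OPS(ops) \
static void __attribute__((constructor)) sketch_init_##ops(void) \
{ \
	(void)sketch_register_cpufreq_ops(&ops); \
}

static int dummy_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int dummy_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct sketch_cpufreq_ops dummy_ops = {
	.name = "dummy",
	.init = dummy_init,
	.exit = dummy_exit,
};

/* The macro expands to a function definition, so no semicolon is needed. */
SKETCH_REGISTER_CPUFREQ_OPS(dummy_ops)
```

Because the constructor discards the return value, a driver with a missing
callback would simply be absent from the list; in this sketch the negative
return only matters to callers that invoke the register function directly.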
>>> + */
>>> +__rte_internal
>>> +int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
>>> +
>>> +/**
>>> + * Macro to statically register the ops of a cpufreq driver.
>>> + */
>>> +#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
>>> +RTE_INIT(power_hdlr_init_##ops) \
>>> +{ \
>>> + rte_power_register_cpufreq_ops(&ops); \
>>> +}
>>> +
>>> +#endif
>>> diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
>>> index 36c3f3da98..3168b6d301 100644
>>> --- a/lib/power/rte_power.c
>>> +++ b/lib/power/rte_power.c
>>> @@ -6,155 +6,88 @@
>>>
>>> #include <rte_errno.h>
>>> #include <rte_spinlock.h>
>>> +#include <rte_debug.h>
>>>
>>> #include "rte_power.h"
>>> -#include "power_acpi_cpufreq.h"
>>> -#include "power_cppc_cpufreq.h"
>>> #include "power_common.h"
>>> -#include "power_kvm_vm.h"
>>> -#include "power_pstate_cpufreq.h"
>>> -#include "power_amd_pstate_cpufreq.h"
>>>
>>> -enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static enum power_management_env global_default_env = PM_ENV_NOT_SET;
>>> +static struct rte_power_cpufreq_ops *global_cpufreq_ops;
>>>
>>> static rte_spinlock_t global_env_cfg_lock =
>>> RTE_SPINLOCK_INITIALIZER;
>>> -
>>> -/* function pointers */
>>> -rte_power_freqs_t rte_power_freqs = NULL;
>>> -rte_power_get_freq_t rte_power_get_freq = NULL;
>>> -rte_power_set_freq_t rte_power_set_freq = NULL;
>>> -rte_power_freq_change_t rte_power_freq_up = NULL;
>>> -rte_power_freq_change_t rte_power_freq_down = NULL;
>>> -rte_power_freq_change_t rte_power_freq_max = NULL;
>>> -rte_power_freq_change_t rte_power_freq_min = NULL;
>>> -rte_power_freq_change_t rte_power_turbo_status;
>>> -rte_power_freq_change_t rte_power_freq_enable_turbo;
>>> -rte_power_freq_change_t rte_power_freq_disable_turbo;
>>> -rte_power_get_capabilities_t rte_power_get_capabilities;
>>> -
>>> -static void
>>> -reset_power_function_ptrs(void)
>>> +static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
>>> + TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
>>> +
>>> +const char *power_env_str[] = {
>>> + "not set",
>>> + "acpi",
>>> + "kvm-vm",
>>> + "pstate",
>>> + "cppc",
>>> + "amd-pstate"
>>> +};
>> How is "not set" used? I don't know what its purpose is. Do we need to consider
>> removing it later?
> "not set" is the default state and indicates that no specific cpufreq management driver is active.
> If the specific driver (located in drivers/power/*) is disabled during the build process,
> the API will fail to configure the environment, leaving it in the "not set" state.
ok
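The "not set" behaviour described above can be sketched as below: if the
requested driver was not built, configuring the environment fails and the
state stays "not set". This is a simplified illustration, not the library
code. All sketch_* names, the reduced name table, the fixed list of built
drivers and the -EINVAL/-ENOTSUP return values are assumptions made for the
example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced counterpart of enum power_management_env. */
enum sketch_env {
	SKETCH_ENV_NOT_SET = 0,
	SKETCH_ENV_ACPI,
	SKETCH_ENV_KVM_VM,
	SKETCH_ENV_MAX
};

/* Driver name for each environment, indexed by the enum value. */
static const char *const sketch_env_str[SKETCH_ENV_MAX] = {
	"not set", "acpi", "kvm-vm"
};

/* Assumption: only the "acpi" driver was enabled at build time. */
static const char *const sketch_built_drivers[] = { "acpi" };

static enum sketch_env sketch_current_env = SKETCH_ENV_NOT_SET;

/* Return 1 if a driver with this name is available, 0 otherwise. */
static int
sketch_driver_built(const char *name)
{
	size_t n = sizeof(sketch_built_drivers) / sizeof(sketch_built_drivers[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(sketch_built_drivers[i], name) == 0)
			return 1;
	return 0;
}

/* Select an environment; on failure the current state is left untouched. */
int
sketch_set_env(enum sketch_env env)
{
	if (env == SKETCH_ENV_NOT_SET || (unsigned int)env >= SKETCH_ENV_MAX)
		return -EINVAL;
	if (!sketch_driver_built(sketch_env_str[env]))
		return -ENOTSUP; /* driver disabled in the build */

	sketch_current_env = env;
	return 0;
}

/* Name of the active environment, "not set" by default. */
const char *
sketch_env_name(void)
{
	return sketch_env_str[sketch_current_env];
}
```

With this layout, requesting SKETCH_ENV_KVM_VM fails and sketch_env_name()
keeps reporting "not set", while requesting SKETCH_ENV_ACPI succeeds.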
>>> +
>>> +/* register the ops struct in rte_power_cpufreq_ops, return 0 on success. */
>>> +int
>>> +rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
>>> {
>>> - rte_power_freqs = NULL;
>>> - rte_power_get_freq = NULL;
>>> - rte_power_set_freq = NULL;
>>> - rte_power_freq_up = NULL;
>>> - rte_power_freq_down = NULL;
>>> - rte_power_freq_max = NULL;
>>> - rte_power_freq_min = NULL;
>>> - rte_power_turbo_status = NULL;
>>> - rte_power_freq_enable_turbo = NULL;
>>> - rte_power_freq_disable_turbo = NULL;
>>> - rte_power_get_capabilities = NULL;
>>> + if (!driver_ops->init || !driver_ops->exit ||
>>> + !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
>>> + !driver_ops->get_freq || !driver_ops->set_freq ||
>>> + !driver_ops->freq_up || !driver_ops->freq_down ||
>>> + !driver_ops->freq_max || !driver_ops->freq_min ||
>>> + !driver_ops->turbo_status || !driver_ops->enable_turbo ||
>>> + !driver_ops->disable_turbo || !driver_ops->get_caps) {
>>> + POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
>>> + return -EINVAL;
>>> + }
>>> +
>>> + TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
>>> +
>>> + return 0;
>>> }
>> Suggest changing the function's return value as mentioned above.
> Same as above.
>>> int
>>> rte_power_check_env_supported(enum power_management_env env)
>>> {
>>> - switch (env) {
>>> - case PM_ENV_ACPI_CPUFREQ:
>>> - return power_acpi_cpufreq_check_supported();
>>> - case PM_ENV_PSTATE_CPUFREQ:
>>> - return power_pstate_cpufreq_check_supported();
>>> - case PM_ENV_KVM_VM:
>>> - return power_kvm_vm_check_supported();
>>> - case PM_ENV_CPPC_CPUFREQ:
>>> - return power_cppc_cpufreq_check_supported();
>>> - case PM_ENV_AMD_PSTATE_CPUFREQ:
>>> - return power_amd_pstate_cpufreq_check_supported();
>>> - default:
>>> - rte_errno = EINVAL;
>>> - return -1;
>>> - }
>>> + struct rte_power_cpufreq_ops *ops;
>>> +
>>> + if (env >= RTE_DIM(power_env_str))
>>> + return 0;
>>> +
>>> + RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
>>> + if (strncmp(ops->name, power_env_str[env],
>>> + RTE_POWER_DRIVER_NAMESZ) == 0)
>>> + return ops->check_env_support();
>>> +
>>> + return 0;
>>> }
>>>
>> <snip>
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v10 0/6] power: refactor power management library
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
` (6 preceding siblings ...)
2024-10-23 5:11 ` [PATCH v9 0/6] power: refactor power management library Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 1/6] power: refactor core " Sivaprasad Tummala
` (6 more replies)
7 siblings, 7 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/*'.
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.
Sivaprasad Tummala (6):
power: refactor core power management library
power: refactor uncore power management library
test/power: removed function pointer validations
drivers/power: uncore support for AMD EPYC processors
maintainers: update for drivers/power
power: rename library sources for cpu frequency management
MAINTAINERS | 1 +
app/test/test_power.c | 97 +-----
app/test/test_power_cpufreq.c | 54 +--
app/test/test_power_kvm_vm.c | 38 +-
doc/api/doxy-api-index.md | 2 +-
doc/guides/prog_guide/power_man.rst | 24 +-
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 14 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 14 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
.../guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 13 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/power_uncore_ops.h | 244 +++++++++++++
lib/power/rte_power.c | 257 --------------
lib/power/rte_power_cpufreq.c | 227 ++++++++++++
.../{rte_power.h => rte_power_cpufreq.h} | 120 ++++---
lib/power/rte_power_pmd_mgmt.h | 2 +-
lib/power/rte_power_uncore.c | 259 +++++++-------
lib/power/rte_power_uncore.h | 61 ++--
lib/power/version.map | 15 +
51 files changed, 1763 insertions(+), 716 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
create mode 100644 lib/power/power_uncore_ops.h
delete mode 100644 lib/power/rte_power.c
create mode 100644 lib/power/rte_power_cpufreq.c
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (73%)
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v10 1/6] power: refactor core power management library
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-10 10:40 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 2/6] power: refactor uncore " Sivaprasad Tummala
` (5 subsequent siblings)
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
v10:
- fixed rte_power_register_cpufreq_ops API description
- removed unused header inclusion
v8:
- marked rte_power_logtype as internal
- removed c++ guards for internal header files
- renamed rte_power_cpufreq_api.h for naming convention
- renamed rte_power_register_ops for naming convention
v6:
- fixed compilation error with symbol export in API
- exported power_get_lcore_mapped_cpu_id as internal API to be
used in drivers/power/*
v5:
- fixed code style warning
v4:
- fixed build error with RTE_ASSERT
v3:
- renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
- re-worked on auto detection logic
v2:
- added NULL check for global_core_ops in rte_power_get_core_ops
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build | 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build | 10 +
.../power/kvm_vm}/guest_channel.c | 2 +-
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 14 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 18 +-
lib/power/power_cpufreq.h | 191 ++++++++++
lib/power/rte_power.c | 358 ++++++++----------
lib/power/rte_power.h | 116 +++---
lib/power/version.map | 14 +
26 files changed, 650 insertions(+), 273 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (96%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (99%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/power_cpufreq.h
diff --git a/drivers/meson.build b/drivers/meson.build
index 5270160c56..70a60ae823 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
'event', # depends on common, bus, mempool and net.
'baseband', # depends on common and bus.
'gpu', # depends on common and bus.
+ 'power', # depends on common (in future).
]
if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index ae809fbb60..81a5e3f6ea 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
#include <rte_stdatomic.h>
#include <rte_string_fns.h>
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
#include "power_common.h"
#define STR_SIZE 1024
@@ -587,3 +587,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops acpi_ops = {
+ .name = "acpi",
+ .init = power_acpi_cpufreq_init,
+ .exit = power_acpi_cpufreq_exit,
+ .check_env_support = power_acpi_cpufreq_check_supported,
+ .get_avail_freqs = power_acpi_cpufreq_freqs,
+ .get_freq = power_acpi_cpufreq_get_freq,
+ .set_freq = power_acpi_cpufreq_set_freq,
+ .freq_down = power_acpi_cpufreq_freq_down,
+ .freq_up = power_acpi_cpufreq_freq_up,
+ .freq_max = power_acpi_cpufreq_freq_max,
+ .freq_min = power_acpi_cpufreq_freq_min,
+ .turbo_status = power_acpi_turbo_status,
+ .enable_turbo = power_acpi_enable_turbo,
+ .disable_turbo = power_acpi_disable_turbo,
+ .get_caps = power_acpi_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(acpi_ops);
diff --git a/lib/power/power_acpi_cpufreq.h b/drivers/power/acpi/acpi_cpufreq.h
similarity index 98%
rename from lib/power/power_acpi_cpufreq.h
rename to drivers/power/acpi/acpi_cpufreq.h
index 682fd9278c..e18a3e6af8 100644
--- a/lib/power/power_acpi_cpufreq.h
+++ b/drivers/power/acpi/acpi_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_ACPI_CPUFREQ_H
-#define _POWER_ACPI_CPUFREQ_H
+#ifndef _ACPI_CPUFREQ_H
+#define _ACPI_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace ACPI cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if ACPI power management is supported.
diff --git a/drivers/power/acpi/meson.build b/drivers/power/acpi/meson.build
new file mode 100644
index 0000000000..f5afc893ce
--- /dev/null
+++ b/drivers/power/acpi/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('acpi_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_amd_pstate_cpufreq.c b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
similarity index 95%
rename from lib/power/power_amd_pstate_cpufreq.c
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.c
index 2b728eca18..95495bff7d 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.c
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#include <stdlib.h>
@@ -9,7 +9,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_amd_pstate_cpufreq.h"
+#include "amd_pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 1000 */
@@ -710,3 +710,23 @@ power_amd_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops amd_pstate_ops = {
+ .name = "amd-pstate",
+ .init = power_amd_pstate_cpufreq_init,
+ .exit = power_amd_pstate_cpufreq_exit,
+ .check_env_support = power_amd_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_amd_pstate_cpufreq_freqs,
+ .get_freq = power_amd_pstate_cpufreq_get_freq,
+ .set_freq = power_amd_pstate_cpufreq_set_freq,
+ .freq_down = power_amd_pstate_cpufreq_freq_down,
+ .freq_up = power_amd_pstate_cpufreq_freq_up,
+ .freq_max = power_amd_pstate_cpufreq_freq_max,
+ .freq_min = power_amd_pstate_cpufreq_freq_min,
+ .turbo_status = power_amd_pstate_turbo_status,
+ .enable_turbo = power_amd_pstate_enable_turbo,
+ .disable_turbo = power_amd_pstate_disable_turbo,
+ .get_caps = power_amd_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(amd_pstate_ops);
diff --git a/lib/power/power_amd_pstate_cpufreq.h b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
similarity index 96%
rename from lib/power/power_amd_pstate_cpufreq.h
rename to drivers/power/amd_pstate/amd_pstate_cpufreq.h
index b02f9f98e4..5c273df4d7 100644
--- a/lib/power/power_amd_pstate_cpufreq.h
+++ b/drivers/power/amd_pstate/amd_pstate_cpufreq.h
@@ -1,18 +1,18 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2021 Intel Corporation
* Copyright(c) 2021 Arm Limited
- * Copyright(c) 2023 Amd Limited
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _POWER_AMD_PSTATE_CPUFREQ_H
-#define _POWER_AMD_PSTATE_CPUFREQ_H
+#ifndef _AMD_PSTATE_CPUFREQ_H
+#define _AMD_PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace AMD pstate cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if amd p-state power management is supported.
@@ -216,4 +216,4 @@ int power_amd_pstate_disable_turbo(unsigned int lcore_id);
int power_amd_pstate_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_AMD_PSTATET_CPUFREQ_H */
+#endif /* _AMD_PSTATE_CPUFREQ_H */
diff --git a/drivers/power/amd_pstate/meson.build b/drivers/power/amd_pstate/meson.build
new file mode 100644
index 0000000000..acaf20b388
--- /dev/null
+++ b/drivers/power/amd_pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('amd_pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_cppc_cpufreq.c b/drivers/power/cppc/cppc_cpufreq.c
similarity index 95%
rename from lib/power/power_cppc_cpufreq.c
rename to drivers/power/cppc/cppc_cpufreq.c
index cc9305bdfe..3cd4165c83 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/drivers/power/cppc/cppc_cpufreq.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
#include <rte_stdatomic.h>
-#include "power_cppc_cpufreq.h"
+#include "cppc_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -695,3 +695,23 @@ power_cppc_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops cppc_ops = {
+ .name = "cppc",
+ .init = power_cppc_cpufreq_init,
+ .exit = power_cppc_cpufreq_exit,
+ .check_env_support = power_cppc_cpufreq_check_supported,
+ .get_avail_freqs = power_cppc_cpufreq_freqs,
+ .get_freq = power_cppc_cpufreq_get_freq,
+ .set_freq = power_cppc_cpufreq_set_freq,
+ .freq_down = power_cppc_cpufreq_freq_down,
+ .freq_up = power_cppc_cpufreq_freq_up,
+ .freq_max = power_cppc_cpufreq_freq_max,
+ .freq_min = power_cppc_cpufreq_freq_min,
+ .turbo_status = power_cppc_turbo_status,
+ .enable_turbo = power_cppc_enable_turbo,
+ .disable_turbo = power_cppc_disable_turbo,
+ .get_caps = power_cppc_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(cppc_ops);
diff --git a/lib/power/power_cppc_cpufreq.h b/drivers/power/cppc/cppc_cpufreq.h
similarity index 97%
rename from lib/power/power_cppc_cpufreq.h
rename to drivers/power/cppc/cppc_cpufreq.h
index f4121b237e..d637f53dcc 100644
--- a/lib/power/power_cppc_cpufreq.h
+++ b/drivers/power/cppc/cppc_cpufreq.h
@@ -3,15 +3,15 @@
* Copyright(c) 2021 Arm Limited
*/
-#ifndef _POWER_CPPC_CPUFREQ_H
-#define _POWER_CPPC_CPUFREQ_H
+#ifndef _CPPC_CPUFREQ_H
+#define _CPPC_CPUFREQ_H
/**
* @file
* RTE Power Management via userspace CPPC cpufreq
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if CPPC power management is supported.
@@ -215,4 +215,4 @@ int power_cppc_disable_turbo(unsigned int lcore_id);
int power_cppc_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-#endif /* _POWER_CPPC_CPUFREQ_H */
+#endif /* _CPPC_CPUFREQ_H */
diff --git a/drivers/power/cppc/meson.build b/drivers/power/cppc/meson.build
new file mode 100644
index 0000000000..f1948cd424
--- /dev/null
+++ b/drivers/power/cppc/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('cppc_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/guest_channel.c b/drivers/power/kvm_vm/guest_channel.c
similarity index 99%
rename from lib/power/guest_channel.c
rename to drivers/power/kvm_vm/guest_channel.c
index bc3f55b6bf..35cd4cfe6f 100644
--- a/lib/power/guest_channel.c
+++ b/drivers/power/kvm_vm/guest_channel.c
@@ -13,7 +13,7 @@
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_guest_channel.h>
#include "guest_channel.h"
diff --git a/lib/power/guest_channel.h b/drivers/power/kvm_vm/guest_channel.h
similarity index 100%
rename from lib/power/guest_channel.h
rename to drivers/power/kvm_vm/guest_channel.h
diff --git a/lib/power/power_kvm_vm.c b/drivers/power/kvm_vm/kvm_vm.c
similarity index 82%
rename from lib/power/power_kvm_vm.c
rename to drivers/power/kvm_vm/kvm_vm.c
index f15be8fac5..5754a441cd 100644
--- a/lib/power/power_kvm_vm.c
+++ b/drivers/power/kvm_vm/kvm_vm.c
@@ -9,7 +9,7 @@
#include "rte_power_guest_channel.h"
#include "guest_channel.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
+#include "kvm_vm.h"
#define FD_PATH "/dev/virtio-ports/virtio.serial.port.poweragent"
@@ -137,3 +137,23 @@ int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
POWER_LOG(ERR, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management");
return -ENOTSUP;
}
+
+static struct rte_power_cpufreq_ops kvm_vm_ops = {
+ .name = "kvm-vm",
+ .init = power_kvm_vm_init,
+ .exit = power_kvm_vm_exit,
+ .check_env_support = power_kvm_vm_check_supported,
+ .get_avail_freqs = power_kvm_vm_freqs,
+ .get_freq = power_kvm_vm_get_freq,
+ .set_freq = power_kvm_vm_set_freq,
+ .freq_down = power_kvm_vm_freq_down,
+ .freq_up = power_kvm_vm_freq_up,
+ .freq_max = power_kvm_vm_freq_max,
+ .freq_min = power_kvm_vm_freq_min,
+ .turbo_status = power_kvm_vm_turbo_status,
+ .enable_turbo = power_kvm_vm_enable_turbo,
+ .disable_turbo = power_kvm_vm_disable_turbo,
+ .get_caps = power_kvm_vm_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(kvm_vm_ops);
diff --git a/lib/power/power_kvm_vm.h b/drivers/power/kvm_vm/kvm_vm.h
similarity index 98%
rename from lib/power/power_kvm_vm.h
rename to drivers/power/kvm_vm/kvm_vm.h
index 303fcc041b..4fabe4c6a5 100644
--- a/lib/power/power_kvm_vm.h
+++ b/drivers/power/kvm_vm/kvm_vm.h
@@ -2,15 +2,15 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#ifndef _POWER_KVM_VM_H
-#define _POWER_KVM_VM_H
+#ifndef _KVM_VM_H
+#define _KVM_VM_H
/**
* @file
* RTE Power Management KVM VM
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if KVM power management is supported.
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
new file mode 100644
index 0000000000..fe11179ab3
--- /dev/null
+++ b/drivers/power/kvm_vm/meson.build
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+sources = files(
+ 'guest_channel.c',
+ 'kvm_vm.c',
+)
+
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
new file mode 100644
index 0000000000..8c7215c639
--- /dev/null
+++ b/drivers/power/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+drivers = [
+ 'acpi',
+ 'amd_pstate',
+ 'cppc',
+ 'kvm_vm',
+ 'pstate'
+]
+
+std_deps = ['power']
diff --git a/drivers/power/pstate/meson.build b/drivers/power/pstate/meson.build
new file mode 100644
index 0000000000..9cd47833fb
--- /dev/null
+++ b/drivers/power/pstate/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+endif
+sources = files('pstate_cpufreq.c')
+
+deps += ['power']
diff --git a/lib/power/power_pstate_cpufreq.c b/drivers/power/pstate/pstate_cpufreq.c
similarity index 96%
rename from lib/power/power_pstate_cpufreq.c
rename to drivers/power/pstate/pstate_cpufreq.c
index 4755909466..f117ff3d17 100644
--- a/lib/power/power_pstate_cpufreq.c
+++ b/drivers/power/pstate/pstate_cpufreq.c
@@ -15,7 +15,7 @@
#include <rte_stdatomic.h>
#include "rte_power_pmd_mgmt.h"
-#include "power_pstate_cpufreq.h"
+#include "pstate_cpufreq.h"
#include "power_common.h"
/* macros used for rounding frequency to nearest 100000 */
@@ -898,3 +898,23 @@ int power_pstate_get_capabilities(unsigned int lcore_id,
return 0;
}
+
+static struct rte_power_cpufreq_ops pstate_ops = {
+ .name = "pstate",
+ .init = power_pstate_cpufreq_init,
+ .exit = power_pstate_cpufreq_exit,
+ .check_env_support = power_pstate_cpufreq_check_supported,
+ .get_avail_freqs = power_pstate_cpufreq_freqs,
+ .get_freq = power_pstate_cpufreq_get_freq,
+ .set_freq = power_pstate_cpufreq_set_freq,
+ .freq_down = power_pstate_cpufreq_freq_down,
+ .freq_up = power_pstate_cpufreq_freq_up,
+ .freq_max = power_pstate_cpufreq_freq_max,
+ .freq_min = power_pstate_cpufreq_freq_min,
+ .turbo_status = power_pstate_turbo_status,
+ .enable_turbo = power_pstate_enable_turbo,
+ .disable_turbo = power_pstate_disable_turbo,
+ .get_caps = power_pstate_get_capabilities
+};
+
+RTE_POWER_REGISTER_CPUFREQ_OPS(pstate_ops);
diff --git a/lib/power/power_pstate_cpufreq.h b/drivers/power/pstate/pstate_cpufreq.h
similarity index 98%
rename from lib/power/power_pstate_cpufreq.h
rename to drivers/power/pstate/pstate_cpufreq.h
index 7bf64a518c..b18a1ac9bc 100644
--- a/lib/power/power_pstate_cpufreq.h
+++ b/drivers/power/pstate/pstate_cpufreq.h
@@ -2,15 +2,15 @@
* Copyright(c) 2018 Intel Corporation
*/
-#ifndef _POWER_PSTATE_CPUFREQ_H
-#define _POWER_PSTATE_CPUFREQ_H
+#ifndef _PSTATE_CPUFREQ_H
+#define _PSTATE_CPUFREQ_H
/**
* @file
* RTE Power Management via Intel Pstate driver
*/
-#include "rte_power.h"
+#include "power_cpufreq.h"
/**
* Check if pstate power management is supported.
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 2f0f3d26e9..dd8e4393ac 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -12,19 +12,14 @@ if not is_linux
reason = 'only supported on Linux'
endif
sources = files(
- 'guest_channel.c',
- 'power_acpi_cpufreq.c',
- 'power_amd_pstate_cpufreq.c',
'power_common.c',
- 'power_cppc_cpufreq.c',
- 'power_kvm_vm.c',
'power_intel_uncore.c',
- 'power_pstate_cpufreq.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
+ 'power_cpufreq.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_common.c b/lib/power/power_common.c
index b47c63a5f1..e482f71c64 100644
--- a/lib/power/power_common.c
+++ b/lib/power/power_common.c
@@ -13,7 +13,7 @@
#include "power_common.h"
-RTE_LOG_REGISTER_DEFAULT(power_logtype, INFO);
+RTE_LOG_REGISTER_DEFAULT(rte_power_logtype, INFO);
#define POWER_SYSFILE_SCALING_DRIVER \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_driver"
diff --git a/lib/power/power_common.h b/lib/power/power_common.h
index 82fb94d0c0..c294f561bb 100644
--- a/lib/power/power_common.h
+++ b/lib/power/power_common.h
@@ -6,12 +6,13 @@
#define _POWER_COMMON_H_
#include <rte_common.h>
+#include <rte_compat.h>
#include <rte_log.h>
#define RTE_POWER_INVALID_FREQ_INDEX (~0)
-extern int power_logtype;
-#define RTE_LOGTYPE_POWER power_logtype
+extern int rte_power_logtype;
+#define RTE_LOGTYPE_POWER rte_power_logtype
#define POWER_LOG(level, ...) \
RTE_LOG_LINE(level, POWER, "" __VA_ARGS__)
@@ -23,14 +24,27 @@ extern int power_logtype;
#endif
/* check if scaling driver matches one we want */
+__rte_internal
int cpufreq_check_scaling_driver(const char *driver);
+
+__rte_internal
int power_set_governor(unsigned int lcore_id, const char *new_governor,
char *orig_governor, size_t orig_governor_len);
+
+__rte_internal
int open_core_sysfs_file(FILE **f, const char *mode, const char *format, ...)
__rte_format_printf(3, 4);
+
+__rte_internal
int read_core_sysfs_u32(FILE *f, uint32_t *val);
+
+__rte_internal
int read_core_sysfs_s(FILE *f, char *buf, unsigned int len);
+
+__rte_internal
int write_core_sysfs_s(FILE *f, const char *str);
+
+__rte_internal
int power_get_lcore_mapped_cpu_id(uint32_t lcore_id, uint32_t *cpu_id);
#endif /* _POWER_COMMON_H_ */
diff --git a/lib/power/power_cpufreq.h b/lib/power/power_cpufreq.h
new file mode 100644
index 0000000000..5e527f4ebd
--- /dev/null
+++ b/lib/power/power_cpufreq.h
@@ -0,0 +1,191 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_CPUFREQ_H
+#define _POWER_CPUFREQ_H
+
+/**
+ * @file
+ * RTE Power Management
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_compat.h>
+
+#define RTE_POWER_DRIVER_NAMESZ 24
+
+/**
+ * Initialize power management for a specific lcore. If rte_power_set_env() has
+ * not been called then an auto-detect of the environment will start and
+ * initialise the corresponding resources.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_init_t)(unsigned int lcore_id);
+
+/**
+ * Exit power management on a specific lcore. This will call the environment
+ * dependent exit function.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_cpufreq_exit_t)(unsigned int lcore_id);
+
+/**
+ * Check if a specific power management environment type is supported on a
+ * currently running system.
+ *
+ * @return
+ * - 1 if supported
+ * - 0 if unsupported
+ * - -1 if error, with rte_errno indicating reason for error.
+ */
+typedef int (*rte_power_check_env_support_t)(void);
+
+/**
+ * Get the available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * The number of available frequencies.
+ */
+typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the current index of available frequencies of a specific lcore.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * The current index of available frequencies.
+ */
+typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
+
+/**
+ * Set the new frequency for a specific lcore by indicating the index of
+ * available frequencies.
+ * Function pointer definition. Review each environment's
+ * specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
+
+/**
+ * Function pointer definition for generic frequency change functions. Review
+ * each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+
+/**
+ * Power capabilities summary.
+ */
+struct rte_power_core_capabilities {
+ union {
+ uint64_t capabilities;
+ struct {
+ uint64_t turbo:1; /**< Turbo can be enabled. */
+ uint64_t priority:1; /**< SST-BF high freq core */
+ };
+ };
+};
+
+typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps);
+
+/** Structure defining core power management operations. */
+struct rte_power_cpufreq_ops {
+ RTE_TAILQ_ENTRY(rte_power_cpufreq_ops) next; /**< Next in list. */
+ char name[RTE_POWER_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_cpufreq_init_t init; /**< Initialize power management. */
+ rte_power_cpufreq_exit_t exit; /**< Exit power management. */
+ rte_power_check_env_support_t check_env_support;/**< verify env is supported. */
+ rte_power_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_freq_t set_freq; /**< Set frequency index. */
+ rte_power_freq_change_t freq_up; /**< Scale up frequency. */
+ rte_power_freq_change_t freq_down; /**< Scale down frequency. */
+ rte_power_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+ rte_power_freq_change_t turbo_status; /**< Get Turbo status. */
+ rte_power_freq_change_t enable_turbo; /**< Enable Turbo. */
+ rte_power_freq_change_t disable_turbo; /**< Disable Turbo. */
+ rte_power_get_capabilities_t get_caps; /**< power capabilities. */
+};
+
+/**
+ * Register power cpu frequency operations.
+ *
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - Negative on error.
+ */
+__rte_internal
+int rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *ops);
+
+/**
+ * Macro to statically register the ops of a cpufreq driver.
+ */
+#define RTE_POWER_REGISTER_CPUFREQ_OPS(ops) \
+RTE_INIT(power_hdlr_init_##ops) \
+{ \
+ rte_power_register_cpufreq_ops(&ops); \
+}
+
+#endif
diff --git a/lib/power/rte_power.c b/lib/power/rte_power.c
index 36c3f3da98..90c9cf2198 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power.c
@@ -2,159 +2,89 @@
* Copyright(c) 2010-2014 Intel Corporation
*/
-#include <errno.h>
-
-#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
#include "rte_power.h"
-#include "power_acpi_cpufreq.h"
-#include "power_cppc_cpufreq.h"
#include "power_common.h"
-#include "power_kvm_vm.h"
-#include "power_pstate_cpufreq.h"
-#include "power_amd_pstate_cpufreq.h"
-enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static enum power_management_env global_default_env = PM_ENV_NOT_SET;
+static struct rte_power_cpufreq_ops *global_cpufreq_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
-
-/* function pointers */
-rte_power_freqs_t rte_power_freqs = NULL;
-rte_power_get_freq_t rte_power_get_freq = NULL;
-rte_power_set_freq_t rte_power_set_freq = NULL;
-rte_power_freq_change_t rte_power_freq_up = NULL;
-rte_power_freq_change_t rte_power_freq_down = NULL;
-rte_power_freq_change_t rte_power_freq_max = NULL;
-rte_power_freq_change_t rte_power_freq_min = NULL;
-rte_power_freq_change_t rte_power_turbo_status;
-rte_power_freq_change_t rte_power_freq_enable_turbo;
-rte_power_freq_change_t rte_power_freq_disable_turbo;
-rte_power_get_capabilities_t rte_power_get_capabilities;
-
-static void
-reset_power_function_ptrs(void)
+static RTE_TAILQ_HEAD(, rte_power_cpufreq_ops) cpufreq_ops_list =
+ TAILQ_HEAD_INITIALIZER(cpufreq_ops_list);
+
+const char *power_env_str[] = {
+ "not set",
+ "acpi",
+ "kvm-vm",
+ "pstate",
+ "cppc",
+ "amd-pstate"
+};
+
+/* register the ops struct in rte_power_cpufreq_ops, return 0 on success. */
+int
+rte_power_register_cpufreq_ops(struct rte_power_cpufreq_ops *driver_ops)
{
- rte_power_freqs = NULL;
- rte_power_get_freq = NULL;
- rte_power_set_freq = NULL;
- rte_power_freq_up = NULL;
- rte_power_freq_down = NULL;
- rte_power_freq_max = NULL;
- rte_power_freq_min = NULL;
- rte_power_turbo_status = NULL;
- rte_power_freq_enable_turbo = NULL;
- rte_power_freq_disable_turbo = NULL;
- rte_power_get_capabilities = NULL;
+ if (!driver_ops->init || !driver_ops->exit ||
+ !driver_ops->check_env_support || !driver_ops->get_avail_freqs ||
+ !driver_ops->get_freq || !driver_ops->set_freq ||
+ !driver_ops->freq_up || !driver_ops->freq_down ||
+ !driver_ops->freq_max || !driver_ops->freq_min ||
+ !driver_ops->turbo_status || !driver_ops->enable_turbo ||
+ !driver_ops->disable_turbo || !driver_ops->get_caps) {
+ POWER_LOG(ERR, "Missing callbacks while registering cpufreq ops");
+ return -1;
+ }
+
+ TAILQ_INSERT_TAIL(&cpufreq_ops_list, driver_ops, next);
+
+ return 0;
}
int
rte_power_check_env_supported(enum power_management_env env)
{
- switch (env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_check_supported();
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_check_supported();
- case PM_ENV_KVM_VM:
- return power_kvm_vm_check_supported();
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_check_supported();
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_check_supported();
- default:
- rte_errno = EINVAL;
- return -1;
- }
+ struct rte_power_cpufreq_ops *ops;
+
+ if (env >= RTE_DIM(power_env_str))
+ return 0;
+
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0)
+ return ops->check_env_support();
+
+ return 0;
}
int
rte_power_set_env(enum power_management_env env)
{
+ struct rte_power_cpufreq_ops *ops;
+ int ret = -1;
+
rte_spinlock_lock(&global_env_cfg_lock);
if (global_default_env != PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Power Management Environment already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
- }
-
- int ret = 0;
-
- if (env == PM_ENV_ACPI_CPUFREQ) {
- rte_power_freqs = power_acpi_cpufreq_freqs;
- rte_power_get_freq = power_acpi_cpufreq_get_freq;
- rte_power_set_freq = power_acpi_cpufreq_set_freq;
- rte_power_freq_up = power_acpi_cpufreq_freq_up;
- rte_power_freq_down = power_acpi_cpufreq_freq_down;
- rte_power_freq_min = power_acpi_cpufreq_freq_min;
- rte_power_freq_max = power_acpi_cpufreq_freq_max;
- rte_power_turbo_status = power_acpi_turbo_status;
- rte_power_freq_enable_turbo = power_acpi_enable_turbo;
- rte_power_freq_disable_turbo = power_acpi_disable_turbo;
- rte_power_get_capabilities = power_acpi_get_capabilities;
- } else if (env == PM_ENV_KVM_VM) {
- rte_power_freqs = power_kvm_vm_freqs;
- rte_power_get_freq = power_kvm_vm_get_freq;
- rte_power_set_freq = power_kvm_vm_set_freq;
- rte_power_freq_up = power_kvm_vm_freq_up;
- rte_power_freq_down = power_kvm_vm_freq_down;
- rte_power_freq_min = power_kvm_vm_freq_min;
- rte_power_freq_max = power_kvm_vm_freq_max;
- rte_power_turbo_status = power_kvm_vm_turbo_status;
- rte_power_freq_enable_turbo = power_kvm_vm_enable_turbo;
- rte_power_freq_disable_turbo = power_kvm_vm_disable_turbo;
- rte_power_get_capabilities = power_kvm_vm_get_capabilities;
- } else if (env == PM_ENV_PSTATE_CPUFREQ) {
- rte_power_freqs = power_pstate_cpufreq_freqs;
- rte_power_get_freq = power_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_pstate_disable_turbo;
- rte_power_get_capabilities = power_pstate_get_capabilities;
-
- } else if (env == PM_ENV_CPPC_CPUFREQ) {
- rte_power_freqs = power_cppc_cpufreq_freqs;
- rte_power_get_freq = power_cppc_cpufreq_get_freq;
- rte_power_set_freq = power_cppc_cpufreq_set_freq;
- rte_power_freq_up = power_cppc_cpufreq_freq_up;
- rte_power_freq_down = power_cppc_cpufreq_freq_down;
- rte_power_freq_min = power_cppc_cpufreq_freq_min;
- rte_power_freq_max = power_cppc_cpufreq_freq_max;
- rte_power_turbo_status = power_cppc_turbo_status;
- rte_power_freq_enable_turbo = power_cppc_enable_turbo;
- rte_power_freq_disable_turbo = power_cppc_disable_turbo;
- rte_power_get_capabilities = power_cppc_get_capabilities;
- } else if (env == PM_ENV_AMD_PSTATE_CPUFREQ) {
- rte_power_freqs = power_amd_pstate_cpufreq_freqs;
- rte_power_get_freq = power_amd_pstate_cpufreq_get_freq;
- rte_power_set_freq = power_amd_pstate_cpufreq_set_freq;
- rte_power_freq_up = power_amd_pstate_cpufreq_freq_up;
- rte_power_freq_down = power_amd_pstate_cpufreq_freq_down;
- rte_power_freq_min = power_amd_pstate_cpufreq_freq_min;
- rte_power_freq_max = power_amd_pstate_cpufreq_freq_max;
- rte_power_turbo_status = power_amd_pstate_turbo_status;
- rte_power_freq_enable_turbo = power_amd_pstate_enable_turbo;
- rte_power_freq_disable_turbo = power_amd_pstate_disable_turbo;
- rte_power_get_capabilities = power_amd_pstate_get_capabilities;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
- env);
- ret = -1;
- }
-
- if (ret == 0)
- global_default_env = env;
- else {
- global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ goto out;
}
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next)
+ if (strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) {
+ global_cpufreq_ops = ops;
+ global_default_env = env;
+ ret = 0;
+ goto out;
+ }
+
+ POWER_LOG(ERR, "Invalid Power Management Environment(%d) set",
+ env);
+out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
}
@@ -164,7 +94,7 @@ rte_power_unset_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
global_default_env = PM_ENV_NOT_SET;
- reset_power_function_ptrs();
+ global_cpufreq_ops = NULL;
rte_spinlock_unlock(&global_env_cfg_lock);
}
@@ -176,82 +106,122 @@ rte_power_get_env(void) {
int
rte_power_init(unsigned int lcore_id)
{
- int ret = -1;
+ struct rte_power_cpufreq_ops *ops;
+ uint8_t env;
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_init(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_init(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_init(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_init(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_init(lcore_id);
- default:
- POWER_LOG(INFO, "Env isn't set yet!");
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise ACPI cpufreq power management...");
- ret = power_acpi_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
- goto out;
- }
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->init(lcore_id);
- POWER_LOG(INFO, "Attempting to initialise PSTAT power management...");
- ret = power_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_PSTATE_CPUFREQ);
- goto out;
- }
+ POWER_LOG(INFO, "Env isn't set yet!");
- POWER_LOG(INFO, "Attempting to initialise AMD PSTATE power management...");
- ret = power_amd_pstate_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_AMD_PSTATE_CPUFREQ);
- goto out;
+ /* Auto detect Environment */
+ RTE_TAILQ_FOREACH(ops, &cpufreq_ops_list, next) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s cpufreq power management...",
+ ops->name);
+ for (env = 0; env < RTE_DIM(power_env_str); env++) {
+ if ((strncmp(ops->name, power_env_str[env],
+ RTE_POWER_DRIVER_NAMESZ) == 0) &&
+ (ops->init(lcore_id) == 0)) {
+ rte_power_set_env(env);
+ return 0;
+ }
+ }
}
- POWER_LOG(INFO, "Attempting to initialise CPPC power management...");
- ret = power_cppc_cpufreq_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_CPPC_CPUFREQ);
- goto out;
- }
+ POWER_LOG(ERR,
+ "Unable to set Power Management Environment for lcore %u",
+ lcore_id);
- POWER_LOG(INFO, "Attempting to initialise VM power management...");
- ret = power_kvm_vm_init(lcore_id);
- if (ret == 0) {
- rte_power_set_env(PM_ENV_KVM_VM);
- goto out;
- }
- POWER_LOG(ERR, "Unable to set Power Management Environment for lcore "
- "%u", lcore_id);
-out:
- return ret;
+ return -1;
}
int
rte_power_exit(unsigned int lcore_id)
{
- switch (global_default_env) {
- case PM_ENV_ACPI_CPUFREQ:
- return power_acpi_cpufreq_exit(lcore_id);
- case PM_ENV_KVM_VM:
- return power_kvm_vm_exit(lcore_id);
- case PM_ENV_PSTATE_CPUFREQ:
- return power_pstate_cpufreq_exit(lcore_id);
- case PM_ENV_CPPC_CPUFREQ:
- return power_cppc_cpufreq_exit(lcore_id);
- case PM_ENV_AMD_PSTATE_CPUFREQ:
- return power_amd_pstate_cpufreq_exit(lcore_id);
- default:
- POWER_LOG(ERR, "Environment has not been set, unable to exit gracefully");
+ if (global_default_env != PM_ENV_NOT_SET)
+ return global_cpufreq_ops->exit(lcore_id);
+
+ POWER_LOG(ERR,
+ "Environment has not been set, unable to exit gracefully");
- }
return -1;
+}
+
+uint32_t
+rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t n)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_avail_freqs(lcore_id, freqs, n);
+}
+
+uint32_t
+rte_power_get_freq(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_freq(lcore_id);
+}
+
+uint32_t
+rte_power_set_freq(unsigned int lcore_id, uint32_t index)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->set_freq(lcore_id, index);
+}
+
+int
+rte_power_freq_up(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_up(lcore_id);
+}
+
+int
+rte_power_freq_down(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_down(lcore_id);
+}
+
+int
+rte_power_freq_max(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_max(lcore_id);
+}
+
+int
+rte_power_freq_min(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->freq_min(lcore_id);
+}
+
+int
+rte_power_turbo_status(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->turbo_status(lcore_id);
+}
+int
+rte_power_freq_enable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->enable_turbo(lcore_id);
+}
+
+int
+rte_power_freq_disable_turbo(unsigned int lcore_id)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->disable_turbo(lcore_id);
+}
+
+int
+rte_power_get_capabilities(unsigned int lcore_id,
+ struct rte_power_core_capabilities *caps)
+{
+ RTE_ASSERT(global_cpufreq_ops != NULL);
+ return global_cpufreq_ops->get_caps(lcore_id, caps);
}
diff --git a/lib/power/rte_power.h b/lib/power/rte_power.h
index 4fa4afe399..7d566551bd 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef _RTE_POWER_H
@@ -14,14 +15,21 @@
#include <rte_log.h>
#include <rte_power_guest_channel.h>
+#include "power_cpufreq.h"
+
#ifdef __cplusplus
extern "C" {
#endif
/* Power Management Environment State */
-enum power_management_env {PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM,
- PM_ENV_PSTATE_CPUFREQ, PM_ENV_CPPC_CPUFREQ,
- PM_ENV_AMD_PSTATE_CPUFREQ};
+enum power_management_env {
+ PM_ENV_NOT_SET = 0,
+ PM_ENV_ACPI_CPUFREQ,
+ PM_ENV_KVM_VM,
+ PM_ENV_PSTATE_CPUFREQ,
+ PM_ENV_CPPC_CPUFREQ,
+ PM_ENV_AMD_PSTATE_CPUFREQ
+};
/**
* Check if a specific power management environment type is supported on a
@@ -108,10 +116,7 @@ int rte_power_exit(unsigned int lcore_id);
* @return
* The number of available frequencies.
*/
-typedef uint32_t (*rte_power_freqs_t)(unsigned int lcore_id, uint32_t *freqs,
- uint32_t num);
-
-extern rte_power_freqs_t rte_power_freqs;
+uint32_t rte_power_freqs(unsigned int lcore_id, uint32_t *freqs, uint32_t num);
/**
* Return the current index of available frequencies of a specific lcore.
@@ -124,9 +129,7 @@ extern rte_power_freqs_t rte_power_freqs;
* @return
* The current index of available frequencies.
*/
-typedef uint32_t (*rte_power_get_freq_t)(unsigned int lcore_id);
-
-extern rte_power_get_freq_t rte_power_get_freq;
+uint32_t rte_power_get_freq(unsigned int lcore_id);
/**
* Set the new frequency for a specific lcore by indicating the index of
@@ -144,13 +147,12 @@ extern rte_power_get_freq_t rte_power_get_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_freq_t)(unsigned int lcore_id, uint32_t index);
-
-extern rte_power_set_freq_t rte_power_set_freq;
+uint32_t rte_power_set_freq(unsigned int lcore_id, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions. Review
- * each environments specific documentation for usage.
+ * Scale up the frequency of a specific lcore according to the available
+ * frequencies.
+ * Review each environment's specific documentation for usage.
*
* @param lcore_id
* lcore id.
@@ -160,66 +162,92 @@ extern rte_power_set_freq_t rte_power_set_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_freq_change_t)(unsigned int lcore_id);
-
-/**
- * Scale up the frequency of a specific lcore according to the available
- * frequencies.
- * Review each environments specific documentation for usage.
- */
-extern rte_power_freq_change_t rte_power_freq_up;
+int rte_power_freq_up(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore according to the available
* frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_down;
+int rte_power_freq_down(unsigned int lcore_id);
/**
* Scale up the frequency of a specific lcore to the highest according to the
* available frequencies.
* Review each environments specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_max;
+int rte_power_freq_max(unsigned int lcore_id);
/**
* Scale down the frequency of a specific lcore to the lowest according to the
* available frequencies.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_min;
+int rte_power_freq_min(unsigned int lcore_id);
/**
* Query the Turbo Boost status of a specific lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 turbo boost enabled.
+ * - 0 turbo boost disabled.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_turbo_status;
+int rte_power_turbo_status(unsigned int lcore_id);
/**
* Enable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+int rte_power_freq_enable_turbo(unsigned int lcore_id);
/**
* Disable Turbo Boost for this lcore.
* Review each environments specific documentation for usage..
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
*/
-extern rte_power_freq_change_t rte_power_freq_disable_turbo;
-
-/**
- * Power capabilities summary.
- */
-struct rte_power_core_capabilities {
- union {
- uint64_t capabilities;
- struct {
- uint64_t turbo:1; /**< Turbo can be enabled. */
- uint64_t priority:1; /**< SST-BF high freq core */
- };
- };
-};
+int rte_power_freq_disable_turbo(unsigned int lcore_id);
/**
* Returns power capabilities for a specific lcore.
@@ -235,11 +263,9 @@ struct rte_power_core_capabilities {
* - 0 on success.
* - Negative on error.
*/
-typedef int (*rte_power_get_capabilities_t)(unsigned int lcore_id,
+int rte_power_get_capabilities(unsigned int lcore_id,
struct rte_power_core_capabilities *caps);
-extern rte_power_get_capabilities_t rte_power_get_capabilities;
-
#ifdef __cplusplus
}
#endif
diff --git a/lib/power/version.map b/lib/power/version.map
index c9a226614e..9c1ed4d9d6 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -52,3 +52,17 @@ EXPERIMENTAL {
rte_power_uncore_freqs;
rte_power_unset_uncore_env;
};
+
+INTERNAL {
+ global:
+
+ rte_power_register_cpufreq_ops;
+ rte_power_logtype;
+ cpufreq_check_scaling_driver;
+ power_get_lcore_mapped_cpu_id;
+ power_set_governor;
+ open_core_sysfs_file;
+ read_core_sysfs_u32;
+ read_core_sysfs_s;
+ write_core_sysfs_s;
+};
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
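
For readers following the series, the registration scheme introduced by this patch can be sketched outside DPDK roughly as follows. This is a minimal standalone illustration, not the patch's code: struct cpufreq_ops, register_ops(), set_env() and the three-entry env_str[] table are hypothetical stand-ins for struct rte_power_cpufreq_ops, rte_power_register_cpufreq_ops(), rte_power_set_env() and power_env_str[], and a GCC/Clang constructor attribute is assumed in place of RTE_INIT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define DRIVER_NAMESZ 24

/* Hypothetical, trimmed-down stand-in for struct rte_power_cpufreq_ops. */
struct cpufreq_ops {
	TAILQ_ENTRY(cpufreq_ops) next;      /* link in the driver list */
	char name[DRIVER_NAMESZ];           /* matched against env_str[] */
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

static TAILQ_HEAD(, cpufreq_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);

/* Environment index to driver name, in the spirit of power_env_str[]. */
static const char *const env_str[] = { "not set", "acpi", "kvm-vm" };
#define ENV_COUNT (sizeof(env_str) / sizeof(env_str[0]))

static struct cpufreq_ops *selected_ops;

/* Reject drivers with missing callbacks, then append to the list. */
static int register_ops(struct cpufreq_ops *ops)
{
	if (ops->init == NULL || ops->exit == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Select the registered driver whose name matches the environment. */
static int set_env(unsigned int env)
{
	struct cpufreq_ops *ops;

	/* Range-check before indexing the name table. */
	if (env == 0 || env >= ENV_COUNT)
		return -1;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (strncmp(ops->name, env_str[env], DRIVER_NAMESZ) == 0) {
			selected_ops = ops;
			return 0;
		}
	}
	return -1; /* no driver with that name was built in */
}

/* A toy "acpi" driver that registers itself before main() runs. */
static int acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int acpi_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static struct cpufreq_ops acpi_ops = {
	.name = "acpi", .init = acpi_init, .exit = acpi_exit,
};

__attribute__((constructor))
static void acpi_register(void)
{
	register_ops(&acpi_ops);
}
```

In this sketch set_env() rejects out-of-range values before indexing env_str[], the same guard rte_power_check_env_supported() applies in the patch. Selecting an environment whose driver was not compiled in simply fails the lookup, which is the behaviour a per-driver build option implies.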
* [PATCH v10 2/6] power: refactor uncore power management library
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 1/6] power: refactor core " Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-16 0:55 ` Stephen Hemminger
2024-10-28 19:55 ` [PATCH v10 3/6] test/power: removed function pointer validations Sivaprasad Tummala
` (4 subsequent siblings)
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch refactors the uncore part of the power management library.
Each uncore driver moves into its own directory under 'drivers/power/'
(for example 'drivers/power/intel_uncore/'), and the meson.build
changes allow individual drivers to be enabled selectively.
This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
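
To make the effect on callers concrete, here is a rough standalone sketch of per-die dispatch through an ops table. It does not use DPDK headers: struct uncore_ops, select_uncore_driver(), uncore_init(), uncore_exit() and the toy driver are hypothetical names chosen for illustration, and only the init/exit callback shape (package and die in, int out) is taken from power_uncore_ops.h in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct rte_power_uncore_ops. */
struct uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
	int (*exit)(unsigned int pkg, unsigned int die);
};

/* Driver chosen at run time; NULL when no uncore driver is available. */
static const struct uncore_ops *active_ops;

/* Accept a driver only if it provides every callback the wrappers use. */
static int select_uncore_driver(const struct uncore_ops *ops)
{
	if (ops == NULL || ops->init == NULL || ops->exit == NULL)
		return -1;
	active_ops = ops;
	return 0;
}

/* Library-side wrappers: fail cleanly when no driver has been selected. */
static int uncore_init(unsigned int pkg, unsigned int die)
{
	return active_ops != NULL ? active_ops->init(pkg, die) : -1;
}

static int uncore_exit(unsigned int pkg, unsigned int die)
{
	return active_ops != NULL ? active_ops->exit(pkg, die) : -1;
}

/* Toy driver: remembers which (pkg, die) pairs are initialised. */
#define MAX_PKG 2
#define MAX_DIE 2
static int initialised[MAX_PKG][MAX_DIE];

static int toy_init(unsigned int pkg, unsigned int die)
{
	if (pkg >= MAX_PKG || die >= MAX_DIE)
		return -1;
	initialised[pkg][die] = 1;
	return 0;
}

static int toy_exit(unsigned int pkg, unsigned int die)
{
	if (pkg >= MAX_PKG || die >= MAX_DIE || !initialised[pkg][die])
		return -1;
	initialised[pkg][die] = 0;
	return 0;
}

static const struct uncore_ops toy_uncore_ops = {
	.name = "toy-uncore", .init = toy_init, .exit = toy_exit,
};
```

Because drivers can be left out of the build, the wrappers in this sketch return an error rather than dereferencing a NULL ops pointer when nothing has been selected.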
v10:
- removed unused header inclusions
- fixed rte_power_register_uncore_ops API description
v9:
- documentation update
v8:
- removed c++ guards for internal header files
- renamed rte_power_uncore_ops.h for naming convention
v7:
- fixed build error with aarch32 gcc cross compilation
v6:
- fixed compilation error with symbol export in API
v5:
- fixed build errors for risc-v/ppc targets
v4:
- fixed build error with RTE_ASSERT
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
---
doc/guides/prog_guide/power_man.rst | 10 +-
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 9 +-
drivers/power/intel_uncore/meson.build | 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/power_uncore_ops.h | 244 +++++++++++++++++
lib/power/rte_power_uncore.c | 259 +++++++++---------
lib/power/rte_power_uncore.h | 61 ++---
lib/power/version.map | 1 +
10 files changed, 440 insertions(+), 173 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/power_uncore_ops.h
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..1810ecf93b 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -191,8 +191,8 @@ API Overview for Ethernet PMD Power Management
* **Set Scaling Max Freq**: Set the maximum frequency (kHz) to be used in Frequency
Scaling mode.
-Intel Uncore API
-----------------
+Uncore API
+----------
Abstract
~~~~~~~~
@@ -211,10 +211,10 @@ which was added in 5.6.
This manipulates the context of MSR 0x620,
which sets min/max of the uncore for the SKU.
-API Overview for Intel Uncore
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Uncore API Overview
+~~~~~~~~~~~~~~~~~~~
-Overview of each function in the Intel Uncore API,
+Overview of each function in the Uncore API,
with explanation of what they do.
Each function should not be called in the fast path.
diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
#include <rte_memcpy.h>
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
#include "power_common.h"
#define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
return count;
}
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+ .name = "intel-uncore",
+ .init = power_intel_uncore_init,
+ .exit = power_intel_uncore_exit,
+ .get_avail_freqs = power_intel_uncore_freqs,
+ .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+ .get_num_dies = power_intel_uncore_get_num_dies,
+ .get_num_freqs = power_intel_uncore_get_num_freqs,
+ .get_freq = power_get_intel_uncore_freq,
+ .set_freq = power_set_intel_uncore_freq,
+ .freq_max = power_intel_uncore_freq_max,
+ .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..b9343bd2ea 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,16 +2,15 @@
* Copyright(c) 2022 Intel Corporation
*/
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef _INTEL_UNCORE_H
+#define _INTEL_UNCORE_H
/**
* @file
* RTE Intel Uncore Frequency Management
*/
-#include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -223,4 +222,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
}
#endif
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* _INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 0000000000..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
'amd_pstate',
'cppc',
'kvm_vm',
- 'pstate'
+ 'pstate',
+ 'intel_uncore'
]
std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index dd8e4393ac..5fa5d062e3 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,13 +13,13 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'power_intel_uncore.c',
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
+ 'power_uncore_ops.h',
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
diff --git a/lib/power/power_uncore_ops.h b/lib/power/power_uncore_ops.h
new file mode 100644
index 0000000000..eded2f7f1c
--- /dev/null
+++ b/lib/power/power_uncore_ops.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef _POWER_UNCORE_OPS_H
+#define _POWER_UNCORE_OPS_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include <rte_compat.h>
+#include <rte_common.h>
+
+#define RTE_POWER_UNCORE_DRIVER_NAMESZ 24
+
+/**
+ * Initialize uncore frequency management for a specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_init_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_exit_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+/**
+ * Function pointers for generic frequency change functions.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
+
+/**
+ * Return the number of dies for the specified package (CPU)
+ * by parsing the uncore sysfs directory.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for package on success.
+ */
+typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
+typedef void (*rte_power_uncore_driver_cb_t)(void);
+
+/** Structure defining uncore power operations */
+struct rte_power_uncore_ops {
+ RTE_TAILQ_ENTRY(rte_power_uncore_ops) next; /**< Next in list. */
+ char name[RTE_POWER_UNCORE_DRIVER_NAMESZ]; /**< power mgmt driver. */
+ rte_power_uncore_driver_cb_t cb; /**< Driver specific callbacks. */
+ rte_power_uncore_init_t init; /**< Initialize power management. */
+ rte_power_uncore_exit_t exit; /**< Exit power management. */
+ rte_power_uncore_get_num_pkgs_t get_num_pkgs;
+ rte_power_uncore_get_num_dies_t get_num_dies;
+ rte_power_uncore_get_num_freqs_t get_num_freqs; /**< Number of available frequencies. */
+ rte_power_uncore_freqs_t get_avail_freqs; /**< Get the available frequencies. */
+ rte_power_get_uncore_freq_t get_freq; /**< Get frequency index. */
+ rte_power_set_uncore_freq_t set_freq; /**< Set frequency index. */
+ rte_power_uncore_freq_change_t freq_max; /**< Scale up frequency to highest. */
+ rte_power_uncore_freq_change_t freq_min; /**< Scale down frequency to lowest. */
+};
+
+/**
+ * Register power uncore frequency operations.
+ * @param ops
+ * Pointer to an ops structure to register.
+ * @return
+ * - 0: Success.
+ * - Negative on error.
+ */
+__rte_internal
+int rte_power_register_uncore_ops(struct rte_power_uncore_ops *ops);
+
+/**
+ * Macro to statically register the ops of an uncore driver.
+ */
+#define RTE_POWER_REGISTER_UNCORE_OPS(ops) \
+RTE_INIT(power_hdlr_init_uncore_##ops) \
+{ \
+ rte_power_register_uncore_ops(&ops); \
+}
+
+#endif /* _POWER_UNCORE_OPS_H */
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48c75a5da0..741e067932 100644
--- a/lib/power/rte_power_uncore.c
+++ b/lib/power/rte_power_uncore.c
@@ -3,107 +3,58 @@
* Copyright(c) 2023 AMD Corporation
*/
-#include <errno.h>
-
-#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_debug.h>
-#include "power_common.h"
#include "rte_power_uncore.h"
-#include "power_intel_uncore.h"
+#include "power_common.h"
-enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static enum rte_uncore_power_mgmt_env global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
+static struct rte_power_uncore_ops *global_uncore_ops;
static rte_spinlock_t global_env_cfg_lock = RTE_SPINLOCK_INITIALIZER;
+static RTE_TAILQ_HEAD(, rte_power_uncore_ops) uncore_ops_list =
+ TAILQ_HEAD_INITIALIZER(uncore_ops_list);
-static uint32_t
-power_get_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_set_dummy_uncore_freq(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused, uint32_t index __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_max(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
-
-static int
-power_dummy_uncore_freq_min(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+const char *uncore_env_str[] = {
+ "not set",
+ "auto-detect",
+ "intel-uncore",
+ "amd-hsmp"
+};
-static int
-power_dummy_uncore_freqs(unsigned int pkg __rte_unused, unsigned int die __rte_unused,
- uint32_t *freqs __rte_unused, uint32_t num __rte_unused)
+/* register the ops struct in rte_power_uncore_ops, return 0 on success. */
+int
+rte_power_register_uncore_ops(struct rte_power_uncore_ops *driver_ops)
{
- return 0;
-}
+ if (!driver_ops->init || !driver_ops->exit || !driver_ops->get_num_pkgs ||
+ !driver_ops->get_num_dies || !driver_ops->get_num_freqs ||
+ !driver_ops->get_avail_freqs || !driver_ops->get_freq ||
+ !driver_ops->set_freq || !driver_ops->freq_max ||
+ !driver_ops->freq_min) {
+ POWER_LOG(ERR, "Missing callbacks while registering power ops");
+ return -1;
+ }
-static int
-power_dummy_uncore_get_num_freqs(unsigned int pkg __rte_unused,
- unsigned int die __rte_unused)
-{
- return 0;
-}
+ if (driver_ops->cb)
+ driver_ops->cb();
-static unsigned int
-power_dummy_uncore_get_num_pkgs(void)
-{
- return 0;
-}
+ TAILQ_INSERT_TAIL(&uncore_ops_list, driver_ops, next);
-static unsigned int
-power_dummy_uncore_get_num_dies(unsigned int pkg __rte_unused)
-{
return 0;
}
-/* function pointers */
-rte_power_get_uncore_freq_t rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
-rte_power_set_uncore_freq_t rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
-rte_power_uncore_freq_change_t rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
-rte_power_uncore_freqs_t rte_power_uncore_freqs = power_dummy_uncore_freqs;
-rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
-rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
-rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-
-static void
-reset_power_uncore_function_ptrs(void)
-{
- rte_power_get_uncore_freq = power_get_dummy_uncore_freq;
- rte_power_set_uncore_freq = power_set_dummy_uncore_freq;
- rte_power_uncore_freq_max = power_dummy_uncore_freq_max;
- rte_power_uncore_freq_min = power_dummy_uncore_freq_min;
- rte_power_uncore_freqs = power_dummy_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_dummy_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_dummy_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_dummy_uncore_get_num_dies;
-}
-
int
rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
{
- int ret;
+ int ret = -1;
+ struct rte_power_uncore_ops *ops;
rte_spinlock_lock(&global_env_cfg_lock);
- if (default_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
+ if (global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) {
POWER_LOG(ERR, "Uncore Power Management Env already set.");
- rte_spinlock_unlock(&global_env_cfg_lock);
- return -1;
+ goto out;
}
if (env == RTE_UNCORE_PM_ENV_AUTO_DETECT)
@@ -113,23 +64,20 @@ rte_power_set_uncore_env(enum rte_uncore_power_mgmt_env env)
*/
env = RTE_UNCORE_PM_ENV_INTEL_UNCORE;
- ret = 0;
- if (env == RTE_UNCORE_PM_ENV_INTEL_UNCORE) {
- rte_power_get_uncore_freq = power_get_intel_uncore_freq;
- rte_power_set_uncore_freq = power_set_intel_uncore_freq;
- rte_power_uncore_freq_min = power_intel_uncore_freq_min;
- rte_power_uncore_freq_max = power_intel_uncore_freq_max;
- rte_power_uncore_freqs = power_intel_uncore_freqs;
- rte_power_uncore_get_num_freqs = power_intel_uncore_get_num_freqs;
- rte_power_uncore_get_num_pkgs = power_intel_uncore_get_num_pkgs;
- rte_power_uncore_get_num_dies = power_intel_uncore_get_num_dies;
- } else {
- POWER_LOG(ERR, "Invalid Power Management Environment(%d) set", env);
- ret = -1;
- goto out;
- }
+ if (env < RTE_DIM(uncore_env_str)) {
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ global_uncore_env = env;
+ global_uncore_ops = ops;
+ ret = 0;
+ goto out;
+ }
+ POWER_LOG(ERR, "Power Management (%s) not supported",
+ uncore_env_str[env]);
+ } else
+ POWER_LOG(ERR, "Invalid Power Management Environment");
- default_uncore_env = env;
out:
rte_spinlock_unlock(&global_env_cfg_lock);
return ret;
@@ -139,43 +87,43 @@ void
rte_power_unset_uncore_env(void)
{
rte_spinlock_lock(&global_env_cfg_lock);
- default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
- reset_power_uncore_function_ptrs();
+ global_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET;
rte_spinlock_unlock(&global_env_cfg_lock);
}
enum rte_uncore_power_mgmt_env
rte_power_get_uncore_env(void)
{
- return default_uncore_env;
+ return global_uncore_env;
}
int
rte_power_uncore_init(unsigned int pkg, unsigned int die)
{
int ret = -1;
-
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_init(pkg, die);
- default:
- POWER_LOG(INFO, "Uncore Env isn't set yet!");
- break;
- }
-
- /* Auto detect Environment */
- POWER_LOG(INFO, "Attempting to initialise Intel Uncore power mgmt...");
- ret = power_intel_uncore_init(pkg, die);
- if (ret == 0) {
- rte_power_set_uncore_env(RTE_UNCORE_PM_ENV_INTEL_UNCORE);
- goto out;
- }
-
- if (default_uncore_env == RTE_UNCORE_PM_ENV_NOT_SET) {
- POWER_LOG(ERR, "Unable to set Power Management Environment "
- "for package %u Die %u", pkg, die);
- ret = 0;
- }
+ struct rte_power_uncore_ops *ops;
+ uint8_t env;
+
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ (global_uncore_env != RTE_UNCORE_PM_ENV_AUTO_DETECT))
+ return global_uncore_ops->init(pkg, die);
+
+ /* Auto Detect Environment */
+ RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
+ if (ops) {
+ POWER_LOG(INFO,
+ "Attempting to initialise %s power management...",
+ ops->name);
+ ret = ops->init(pkg, die);
+ if (ret == 0) {
+ for (env = 0; env < RTE_DIM(uncore_env_str); env++)
+ if (strncmp(ops->name, uncore_env_str[env],
+ RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
+ rte_power_set_uncore_env(env);
+ goto out;
+ }
+ }
+ }
out:
return ret;
}
@@ -183,12 +131,69 @@ rte_power_uncore_init(unsigned int pkg, unsigned int die)
int
rte_power_uncore_exit(unsigned int pkg, unsigned int die)
{
- switch (default_uncore_env) {
- case RTE_UNCORE_PM_ENV_INTEL_UNCORE:
- return power_intel_uncore_exit(pkg, die);
- default:
- POWER_LOG(ERR, "Uncore Env has not been set, unable to exit gracefully");
- break;
- }
+ if ((global_uncore_env != RTE_UNCORE_PM_ENV_NOT_SET) &&
+ global_uncore_ops)
+ return global_uncore_ops->exit(pkg, die);
+
+ POWER_LOG(ERR,
+ "Uncore Env has not been set, unable to exit gracefully");
+
return -1;
}
+
+uint32_t
+rte_power_get_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_freq(pkg, die);
+}
+
+int
+rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->set_freq(pkg, die, index);
+}
+
+int
+rte_power_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_max(pkg, die);
+}
+
+int
+rte_power_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->freq_min(pkg, die);
+}
+
+int
+rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_avail_freqs(pkg, die, freqs, num);
+}
+
+int
+rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_freqs(pkg, die);
+}
+
+unsigned int
+rte_power_uncore_get_num_pkgs(void)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_pkgs();
+}
+
+unsigned int
+rte_power_uncore_get_num_dies(unsigned int pkg)
+{
+ RTE_ASSERT(global_uncore_ops != NULL);
+ return global_uncore_ops->get_num_dies(pkg);
+}
diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
index 99859042dd..67d55cbf96 100644
--- a/lib/power/rte_power_uncore.h
+++ b/lib/power/rte_power_uncore.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2022 Intel Corporation
- * Copyright(c) 2023 AMD Corporation
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
#ifndef RTE_POWER_UNCORE_H
@@ -11,8 +11,7 @@
* RTE Uncore Frequency Management
*/
-#include <rte_compat.h>
-#include "rte_power.h"
+#include "power_uncore_ops.h"
#ifdef __cplusplus
extern "C" {
@@ -116,9 +115,7 @@ rte_power_uncore_exit(unsigned int pkg, unsigned int die);
* The current index of available frequencies.
* If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
*/
-typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+uint32_t rte_power_get_uncore_freq(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -141,12 +138,14 @@ extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, uint32_t index);
-
-extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+int rte_power_set_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
/**
- * Function pointer definition for generic frequency change functions.
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
*
* @param pkg
* Package number.
@@ -160,16 +159,7 @@ extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
* - 0 on success without frequency changed.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int die);
-
-/**
- * Set minimum and maximum uncore frequency for specified die on a package
- * to maximum value according to the available frequencies.
- * It should be protected outside of this function for threadsafe.
- *
- * This function should NOT be called in the fast path.
- */
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
+int rte_power_uncore_freq_max(unsigned int pkg, unsigned int die);
/**
* Set minimum and maximum uncore frequency for specified die on a package
@@ -177,8 +167,20 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_max;
* It should be protected outside of this function for threadsafe.
*
* This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 1 on success with frequency changed.
+ * - 0 on success without frequency changed.
+ * - Negative on error.
*/
-extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
+int rte_power_uncore_freq_min(unsigned int pkg, unsigned int die);
/**
* Return the list of available frequencies in the index array.
@@ -200,11 +202,10 @@ extern rte_power_uncore_freq_change_t rte_power_uncore_freq_min;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_freqs_t)(unsigned int pkg, unsigned int die,
+__rte_experimental
+int rte_power_uncore_freqs(unsigned int pkg, unsigned int die,
uint32_t *freqs, uint32_t num);
-extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
-
/**
* Return the list length of available frequencies in the index array.
*
@@ -221,9 +222,7 @@ extern rte_power_uncore_freqs_t rte_power_uncore_freqs;
* - The number of available index's in frequency array.
* - Negative on error.
*/
-typedef int (*rte_power_uncore_get_num_freqs_t)(unsigned int pkg, unsigned int die);
-
-extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
+int rte_power_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
/**
* Return the number of packages (CPUs) on a system
@@ -235,9 +234,7 @@ extern rte_power_uncore_get_num_freqs_t rte_power_uncore_get_num_freqs;
* - Zero on error.
* - Number of package on system on success.
*/
-typedef unsigned int (*rte_power_uncore_get_num_pkgs_t)(void);
-
-extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
+unsigned int rte_power_uncore_get_num_pkgs(void);
/**
* Return the number of dies for pakckages (CPUs) specified
@@ -253,9 +250,7 @@ extern rte_power_uncore_get_num_pkgs_t rte_power_uncore_get_num_pkgs;
* - Zero on error.
* - Number of dies for package on sucecss.
*/
-typedef unsigned int (*rte_power_uncore_get_num_dies_t)(unsigned int pkg);
-
-extern rte_power_uncore_get_num_dies_t rte_power_uncore_get_num_dies;
+unsigned int rte_power_uncore_get_num_dies(unsigned int pkg);
#ifdef __cplusplus
}
diff --git a/lib/power/version.map b/lib/power/version.map
index 9c1ed4d9d6..f442329bbc 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -57,6 +57,7 @@ INTERNAL {
global:
rte_power_register_cpufreq_ops;
+ rte_power_register_uncore_ops;
rte_power_logtype;
cpufreq_check_scaling_driver;
power_get_lcore_mapped_cpu_id;
--
2.34.1
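
For readers following the registration scheme in the patch above (each driver registers an ops table at constructor time, the library keeps the tables in a tail queue, and the environment is selected by matching the driver name against a string table), here is a minimal standalone sketch. It is not the DPDK implementation: `uncore_ops`, `register_ops`, `set_env`, `uncore_init` and the `demo` driver are hypothetical names, the ops table is trimmed to a single callback, and plain `sys/queue.h` plus the GCC/Clang `constructor` attribute stand in for `RTE_TAILQ_*` and `RTE_INIT`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define NAMESZ 24
#define DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down ops table; the real one carries more callbacks. */
struct uncore_ops {
	TAILQ_ENTRY(uncore_ops) next;
	char name[NAMESZ];
	int (*init)(unsigned int pkg, unsigned int die);
};

static TAILQ_HEAD(, uncore_ops) ops_list = TAILQ_HEAD_INITIALIZER(ops_list);
static struct uncore_ops *active_ops;

/* Environment names, indexed by the environment enum value. */
static const char *const env_str[] = {
	"not set", "auto-detect", "intel-uncore", "amd-hsmp"
};

/* Reject incomplete tables, then append to the list of known drivers. */
static int register_ops(struct uncore_ops *ops)
{
	if (ops->init == NULL)
		return -1;
	TAILQ_INSERT_TAIL(&ops_list, ops, next);
	return 0;
}

/* Select the driver whose name matches env_str[env]. */
static int set_env(unsigned int env)
{
	struct uncore_ops *ops;

	/* Strictly less than: env == DIM(env_str) would read past the array. */
	if (env >= DIM(env_str))
		return -1;

	TAILQ_FOREACH(ops, &ops_list, next) {
		if (strncmp(ops->name, env_str[env], NAMESZ) == 0) {
			active_ops = ops;
			return 0;
		}
	}
	return -1; /* no registered driver for this environment */
}

/* Public wrapper: a real function that dispatches through the active table. */
static int uncore_init(unsigned int pkg, unsigned int die)
{
	assert(active_ops != NULL);
	return active_ops->init(pkg, die);
}

/* Hypothetical driver, registered before main() runs. */
static int demo_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct uncore_ops demo_ops = { .name = "amd-hsmp", .init = demo_init };

__attribute__((constructor))
static void demo_register(void)
{
	register_ops(&demo_ops);
}
```

With only the demo driver linked in, selecting index 3 ("amd-hsmp") succeeds, while index 2 ("intel-uncore") fails because no driver with that name was registered, and index 4 is rejected by the bounds check before the string table is touched.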
* [PATCH v10 3/6] test/power: removed function pointer validations
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 1/6] power: refactor core " Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 2/6] power: refactor uncore " Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-10 10:11 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
` (3 subsequent siblings)
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
After refactoring the power library, power management operations are
exposed as regular functions that are available regardless of the operating
environment. The function pointer checks are therefore unnecessary and are
removed from the tests and applications.
v2:
- removed function pointer validation in l3fwd-power app.
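
To illustrate why the caller-side checks can go away, here is a rough standalone sketch contrasting the two styles. It assumes the refactored API is a real function that dispatches through a registered ops table. All names (`old_freq_down`, `new_freq_down`, `cpufreq_ops`, `demo_ops`) are hypothetical stand-ins, and returning -1 when no ops are active is an assumption of this sketch, not necessarily what the library does.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*freq_change_t)(unsigned int lcore_id);

/* Old style: an exported function pointer that stays NULL until an
 * environment is selected, so every caller has to test it first. */
static freq_change_t old_freq_down;

/* New style: callbacks live in an ops table behind a real function. */
struct cpufreq_ops {
	freq_change_t freq_down;
};

static const struct cpufreq_ops *active_ops;

static int new_freq_down(unsigned int lcore_id)
{
	if (active_ops == NULL)
		return -1; /* assumption: report an error, never jump through NULL */
	/* Registration is assumed to have rejected incomplete tables. */
	assert(active_ops->freq_down != NULL);
	return active_ops->freq_down(lcore_id);
}

/* Caller before the change: guarded call through the pointer. */
static int scale_down_old(unsigned int lcore_id)
{
	if (old_freq_down)
		return old_freq_down(lcore_id);
	return 0;
}

/* Caller after the change: the symbol always exists, so no guard. */
static int scale_down_new(unsigned int lcore_id)
{
	return new_freq_down(lcore_id);
}

/* Hypothetical driver callback and table used to exercise the wrapper. */
static int demo_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static const struct cpufreq_ops demo_ops = { .freq_down = demo_freq_down };
```

The point is only that the NULL test moves from every call site into one place inside the library, which is what makes the `if (rte_power_freq_down)` style guards in l3fwd-power redundant.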
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 95 -----------------------------------
app/test/test_power_cpufreq.c | 52 -------------------
app/test/test_power_kvm_vm.c | 36 -------------
examples/l3fwd-power/main.c | 12 ++---
4 files changed, 4 insertions(+), 191 deletions(-)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
#include <rte_power.h>
-static int
-check_function_ptrs(void)
-{
- enum power_management_env env = rte_power_get_env();
-
- const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
- const char *inject_not_string1 = not_null_expected ? " not" : "";
- const char *inject_not_string2 = not_null_expected ? "" : " not";
-
- if ((rte_power_freqs == NULL) == not_null_expected) {
- printf("rte_power_freqs should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_freq == NULL) == not_null_expected) {
- printf("rte_power_get_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_set_freq == NULL) == not_null_expected) {
- printf("rte_power_set_freq should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_up == NULL) == not_null_expected) {
- printf("rte_power_freq_up should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_down == NULL) == not_null_expected) {
- printf("rte_power_freq_down should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_max == NULL) == not_null_expected) {
- printf("rte_power_freq_max should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_min == NULL) == not_null_expected) {
- printf("rte_power_freq_min should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_turbo_status == NULL) == not_null_expected) {
- printf("rte_power_turbo_status should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
- printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
- if ((rte_power_get_capabilities == NULL) == not_null_expected) {
- printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
- "initialised\n", inject_not_string1,
- inject_not_string2);
- return -1;
- }
-
- return 0;
-}
-
static int
test_power(void)
{
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)
return -1;
}
- /* Verify that function pointers are NOT NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
rte_power_unset_env();
- /* Verify that function pointers are NULL */
- if (check_function_ptrs() < 0)
- goto fail_all;
-
}
return 0;
-fail_all:
- rte_power_unset_env();
- return -1;
}
#endif
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index edbd34424e..f4522747d5 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -534,58 +534,6 @@ test_power_cpufreq(void)
goto fail_all;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- goto fail_all;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_turbo_status == NULL) {
- printf("rte_power_turbo_status should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_enable_turbo == NULL) {
- printf("rte_power_freq_enable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
- if (rte_power_freq_disable_turbo == NULL) {
- printf("rte_power_freq_disable_turbo should not be NULL, environment has not "
- "been initialised\n");
- goto fail_all;
- }
-
ret = rte_power_exit(TEST_POWER_LCORE_ID);
if (ret < 0) {
printf("Cannot exit power management for lcore %u\n",
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index 464e06002e..a7d104e973 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -47,42 +47,6 @@ test_power_kvm_vm(void)
return -1;
}
- /* verify that function pointers are not NULL */
- if (rte_power_freqs == NULL) {
- printf("rte_power_freqs should not be NULL, environment has not been "
- "initialised\n");
- return -1;
- }
- if (rte_power_get_freq == NULL) {
- printf("rte_power_get_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_set_freq == NULL) {
- printf("rte_power_set_freq should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_up == NULL) {
- printf("rte_power_freq_up should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_down == NULL) {
- printf("rte_power_freq_down should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_max == NULL) {
- printf("rte_power_freq_max should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
- if (rte_power_freq_min == NULL) {
- printf("rte_power_freq_min should not be NULL, environment has not "
- "been initialised\n");
- return -1;
- }
/* Test initialisation of an out of bounds lcore */
ret = rte_power_init(TEST_POWER_VM_LCORE_OUT_OF_BOUNDS);
if (ret != -1) {
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 2bb6b092c3..6bd76515e6 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -440,8 +440,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* check whether need to scale down frequency a step if it sleep a lot.
*/
if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
@@ -449,8 +448,7 @@ power_timer_cb(__rte_unused struct rte_timer *tim,
* scale down a step if average packet per iteration less
* than expectation.
*/
- if (rte_power_freq_down)
- rte_power_freq_down(lcore_id);
+ rte_power_freq_down(lcore_id);
}
/**
@@ -1344,11 +1342,9 @@ main_legacy_loop(__rte_unused void *dummy)
}
if (lcore_scaleup_hint == FREQ_HIGHEST) {
- if (rte_power_freq_max)
- rte_power_freq_max(lcore_id);
+ rte_power_freq_max(lcore_id);
} else if (lcore_scaleup_hint == FREQ_HIGHER) {
- if (rte_power_freq_up)
- rte_power_freq_up(lcore_id);
+ rte_power_freq_up(lcore_id);
}
} else {
/**
--
2.34.1
* [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
` (2 preceding siblings ...)
2024-10-28 19:55 ` [PATCH v10 3/6] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-10 10:52 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 5/6] maintainers: update for drivers/power Sivaprasad Tummala
` (2 subsequent siblings)
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.
v9:
- documentation update
v2:
- fixed typo in comments section.
- added fabric frequency get support for legacy platforms.
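As a rough illustration of the index convention this driver relies on, the sketch below keeps a descending frequency table in which index 0 is the highest entry and nb_freqs - 1 the lowest. All demo_* names and DEMO_MAX_FREQS are hypothetical stand-ins invented for this sketch; the real driver calls into the E-SMI library where the sketch only records the index, and the table values are copied from the HSMP_PROTO_VER5 branch of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_FREQS 8 /* hypothetical stand-in for RTE_MAX_UNCORE_FREQS */

struct demo_uncore_info {
	uint32_t freqs[DEMO_MAX_FREQS]; /* descending: index 0 is the highest */
	uint32_t nb_freqs;              /* number of valid entries in freqs[] */
	uint32_t curr_idx;              /* index of the last applied entry */
};

/* Fill the table with the three values used for HSMP_PROTO_VER5 in the patch. */
static void
demo_fill_freqs(struct demo_uncore_info *ui)
{
	ui->freqs[0] = 1800000;
	ui->freqs[1] = 1440000;
	ui->freqs[2] = 1200000;
	ui->nb_freqs = 3;
	ui->curr_idx = 0;
	assert(ui->nb_freqs <= DEMO_MAX_FREQS);
}

/* Reject out-of-range indexes, otherwise remember the requested one. */
static int
demo_set_freq(struct demo_uncore_info *ui, uint32_t idx)
{
	if (idx >= DEMO_MAX_FREQS || idx >= ui->nb_freqs)
		return -1;
	/* The real driver would program the hardware here (assumption). */
	ui->curr_idx = idx;
	return 0;
}

/* Highest frequency is always the first table entry. */
static int
demo_freq_max(struct demo_uncore_info *ui)
{
	return demo_set_freq(ui, 0);
}

/* Lowest frequency is the last valid entry; guard against an empty table. */
static int
demo_freq_min(struct demo_uncore_info *ui)
{
	if (ui->nb_freqs == 0)
		return -1;
	return demo_set_freq(ui, ui->nb_freqs - 1);
}
```

The explicit empty-table check in demo_freq_min() is a choice made for the sketch: without it, nb_freqs - 1 wraps to UINT32_MAX and the call fails only because the bounds check in demo_set_freq() rejects it.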
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
doc/guides/prog_guide/power_man.rst | 14 ++
drivers/power/amd_uncore/amd_uncore.c | 329 ++++++++++++++++++++++++++
drivers/power/amd_uncore/amd_uncore.h | 225 ++++++++++++++++++
drivers/power/amd_uncore/meson.build | 20 ++
drivers/power/meson.build | 1 +
5 files changed, 589 insertions(+)
create mode 100644 drivers/power/amd_uncore/amd_uncore.c
create mode 100644 drivers/power/amd_uncore/amd_uncore.h
create mode 100644 drivers/power/amd_uncore/meson.build
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index 1810ecf93b..b06cc36438 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -203,6 +203,8 @@ to achieve high performance: L3 cache, on-die memory controller, etc.
Significant power savings can be achieved by reducing the uncore frequency
to its lowest value.
+Intel Uncore
+~~~~~~~~~~~~
The Linux kernel provides the driver "intel-uncore-frequency"
to control the uncore frequency limits for x86 platform.
The driver is available from kernel version 5.6 and above.
@@ -211,6 +213,18 @@ which was added in 5.6.
This manipulates the context of MSR 0x620,
which sets min/max of the uncore for the SKU.
+AMD EPYC Uncore
+~~~~~~~~~~~~~~~
+On AMD EPYC platforms, the Host System Management Port (HSMP) kernel module
+facilitates user-level access to HSMP mailboxes, which are implemented by
+the firmware in the System Management Unit (SMU).
+The AMD HSMP driver is available starting from kernel version 5.18.
+Please ensure that CONFIG_AMD_HSMP is enabled in your kernel configuration.
+
+Additionally, the EPYC System Management Interface In-band Library for Linux
+offers essential APIs, enabling user-space software to effectively manage
+system functions.
+
Uncore API Overview
~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 0000000000..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <errno.h>
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include <rte_memcpy.h>
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct __rte_cache_aligned uncore_power_info {
+ unsigned int die; /* Core die id */
+ unsigned int pkg; /* Package id */
+ uint32_t freqs[RTE_MAX_UNCORE_FREQS]; /* Frequency array */
+ uint32_t nb_freqs; /* Number of available freqs */
+ uint32_t curr_idx; /* Freq index in freqs array */
+ uint32_t max_freq; /* System max uncore freq */
+ uint32_t min_freq; /* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+ int ret;
+
+ if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+ POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+ "should be less than %u", idx, ui->nb_freqs);
+ return -1;
+ }
+
+ ret = esmi_apb_disable(ui->pkg, idx);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+ idx, ui->pkg);
+ return -1;
+ }
+
+ POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+ idx, ui->pkg, ui->die);
+
+ /* remember the DF P-state index that was just applied */
+ ui->curr_idx = idx;
+
+ return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->max_freq = 1800000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->max_freq = 1600000; /* kHz */
+ ui->min_freq = 1200000; /* kHz */
+ }
+
+ return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+ ui->nb_freqs = 3;
+ if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+ POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+ ui->nb_freqs);
+ return -1;
+ }
+
+ /* Generate the uncore freq bucket array. */
+ switch (hsmp_proto_ver) {
+ case HSMP_PROTO_VER5:
+ ui->freqs[0] = 1800000;
+ ui->freqs[1] = 1440000;
+ ui->freqs[2] = 1200000;
+ break;
+ case HSMP_PROTO_VER2:
+ default:
+ ui->freqs[0] = 1600000;
+ ui->freqs[1] = 1333000;
+ ui->freqs[2] = 1200000;
+ }
+
+ POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+ ui->nb_freqs, ui->pkg, ui->die);
+
+ return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+ unsigned int max_pkgs, max_dies;
+ max_pkgs = power_amd_uncore_get_num_pkgs();
+ if (max_pkgs == 0)
+ return -1;
+ if (pkg >= max_pkgs) {
+ POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+ pkg, max_pkgs);
+ return -1;
+ }
+
+ max_dies = power_amd_uncore_get_num_dies(pkg);
+ if (max_dies == 0)
+ return -1;
+ if (die >= max_dies) {
+ POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+ die, max_dies);
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+ if (esmi_init() == ESMI_SUCCESS) {
+ if (esmi_hsmp_proto_ver_get(&hsmp_proto_ver) ==
+ ESMI_SUCCESS)
+ esmi_initialized = 1;
+ }
+}
+
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+ int ret;
+
+ if (!esmi_initialized) {
+ ret = esmi_init();
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "ESMI Not initialized, drivers not found");
+ return -1;
+ }
+ ret = esmi_hsmp_proto_ver_get(&hsmp_proto_ver);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(DEBUG, "HSMP Proto Version Get failed with "
+ "error %s", esmi_get_err_msg(ret));
+ esmi_exit();
+ return -1;
+ }
+ esmi_initialized = 1;
+ }
+
+ ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->die = die;
+ ui->pkg = pkg;
+
+ /* Init for setting uncore die frequency */
+ if (power_init_for_setting_uncore_freq(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot init for setting uncore frequency for "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ /* Get the available frequencies */
+ if (power_get_available_uncore_freqs(ui) < 0) {
+ POWER_LOG(DEBUG, "Cannot get available uncore frequencies of "
+ "pkg %02u die %02u", pkg, die);
+ return -1;
+ }
+
+ return 0;
+}
+
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ ui = &uncore_info[pkg][die];
+ ui->nb_freqs = 0;
+
+ if (esmi_initialized) {
+ esmi_exit();
+ esmi_initialized = 0;
+ }
+
+ return 0;
+}
+
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].curr_idx;
+}
+
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), index);
+}
+
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), 0);
+}
+
+
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ struct uncore_power_info *ui = &uncore_info[pkg][die];
+
+ return set_uncore_freq_internal(&(uncore_info[pkg][die]), ui->nb_freqs - 1);
+}
+
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die, uint32_t *freqs, uint32_t num)
+{
+ struct uncore_power_info *ui;
+
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ if (freqs == NULL) {
+ POWER_LOG(ERR, "NULL buffer supplied");
+ return 0;
+ }
+
+ ui = &uncore_info[pkg][die];
+ if (num < ui->nb_freqs) {
+ POWER_LOG(ERR, "Buffer size is not enough");
+ return 0;
+ }
+ rte_memcpy(freqs, ui->freqs, ui->nb_freqs * sizeof(uint32_t));
+
+ return ui->nb_freqs;
+}
+
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die)
+{
+ int ret = check_pkg_die_values(pkg, die);
+ if (ret < 0)
+ return -1;
+
+ return uncore_info[pkg][die].nb_freqs;
+}
+
+unsigned int
+power_amd_uncore_get_num_pkgs(void)
+{
+ uint32_t num_pkgs = 0;
+ int ret;
+
+ if (esmi_initialized) {
+ ret = esmi_number_of_sockets_get(&num_pkgs);
+ if (ret != ESMI_SUCCESS) {
+ POWER_LOG(ERR, "Failed to get number of sockets");
+ num_pkgs = 0;
+ }
+ }
+ return num_pkgs;
+}
+
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg)
+{
+ if (pkg >= power_amd_uncore_get_num_pkgs()) {
+ POWER_LOG(ERR, "Invalid package ID");
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct rte_power_uncore_ops amd_uncore_ops = {
+ .name = "amd-hsmp",
+ .cb = power_amd_uncore_esmi_init,
+ .init = power_amd_uncore_init,
+ .exit = power_amd_uncore_exit,
+ .get_avail_freqs = power_amd_uncore_freqs,
+ .get_num_pkgs = power_amd_uncore_get_num_pkgs,
+ .get_num_dies = power_amd_uncore_get_num_dies,
+ .get_num_freqs = power_amd_uncore_get_num_freqs,
+ .get_freq = power_get_amd_uncore_freq,
+ .set_freq = power_set_amd_uncore_freq,
+ .freq_max = power_amd_uncore_freq_max,
+ .freq_min = power_amd_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(amd_uncore_ops);
diff --git a/drivers/power/amd_uncore/amd_uncore.h b/drivers/power/amd_uncore/amd_uncore.h
new file mode 100644
index 0000000000..a142034479
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.h
@@ -0,0 +1,225 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#ifndef POWER_AMD_UNCORE_H
+#define POWER_AMD_UNCORE_H
+
+/**
+ * @file
+ * RTE AMD Uncore Frequency Management
+ */
+
+#include "power_uncore_ops.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Initialize uncore frequency management for a specific die on a package.
+ * It will get the available frequencies and prepare to set new die frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore the uncore min and max values to the values they had
+ * before initialization of the API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a package.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * The current index of available frequencies.
+ * If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+uint32_t
+power_get_amd_uncore_freq(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for the specified die on a package
+ * to the specified index value.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ * The index of available frequencies.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_set_amd_uncore_freq(unsigned int pkg, unsigned int die, uint32_t index);
+
+/**
+ * Set minimum and maximum uncore frequency for the specified die on a package
+ * to the maximum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_max(unsigned int pkg, unsigned int die);
+
+/**
+ * Set minimum and maximum uncore frequency for the specified die on a package
+ * to the minimum value according to the available frequencies.
+ * It should be protected outside of this function for thread safety.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freq_min(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the list of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ * @param freqs
+ * The buffer array to save the frequencies.
+ * @param num
+ * The number of frequencies to get.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_freqs(unsigned int pkg, unsigned int die,
+ uint32_t *freqs, uint32_t num);
+
+/**
+ * Return the list length of available frequencies in the index array.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ * @param die
+ * Die number.
+ * Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ * - The number of available indexes in the frequency array.
+ * - Negative on error.
+ */
+int
+power_amd_uncore_get_num_freqs(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the number of packages (CPUs) on a system,
+ * as reported by the E-SMI library.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of packages on the system on success.
+ */
+unsigned int
+power_amd_uncore_get_num_pkgs(void);
+
+/**
+ * Return the number of dies for the specified package (CPU).
+ * This driver reports one die per valid package.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ * Package number.
+ * Each physical CPU in a system is referred to as a package.
+ *
+ * @return
+ * - Zero on error.
+ * - Number of dies for the package on success.
+ */
+unsigned int
+power_amd_uncore_get_num_dies(unsigned int pkg);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* POWER_INTEL_UNCORE_H */
diff --git a/drivers/power/amd_uncore/meson.build b/drivers/power/amd_uncore/meson.build
new file mode 100644
index 0000000000..8cbab47b01
--- /dev/null
+++ b/drivers/power/amd_uncore/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+if not is_linux
+ build = false
+ reason = 'only supported on Linux'
+ subdir_done()
+endif
+
+ESMI_header = '#include<e_smi/e_smi.h>'
+lib = cc.find_library('e_smi64', required: false)
+if not lib.found()
+ build = false
+ reason = 'missing dependency, "libe_smi"'
+else
+ ext_deps += lib
+endif
+
+sources = files('amd_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index c83047af94..4ba5954e13 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -7,6 +7,7 @@ drivers = [
'cppc',
'kvm_vm',
'pstate',
+ 'amd_uncore',
'intel_uncore'
]
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v10 5/6] maintainers: update for drivers/power
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
` (3 preceding siblings ...)
2024-10-28 19:55 ` [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-10 10:54 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
2024-11-10 18:35 ` [PATCH v10 0/6] power: refactor power management library Thomas Monjalon
6 siblings, 1 reply; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
Update maintainers for drivers/power/*.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index e25e9465b5..df7a756612 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1753,6 +1753,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
M: David Hunt <david.hunt@intel.com>
M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
F: lib/power/
+F: drivers/power/*
F: doc/guides/prog_guide/power_man.rst
F: app/test/test_power*
F: examples/l3fwd-power/
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* [PATCH v10 6/6] power: rename library sources for cpu frequency management
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
` (4 preceding siblings ...)
2024-10-28 19:55 ` [PATCH v10 5/6] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-10-28 19:55 ` Sivaprasad Tummala
2024-11-10 18:35 ` [PATCH v10 0/6] power: refactor power management library Thomas Monjalon
6 siblings, 0 replies; 139+ messages in thread
From: Sivaprasad Tummala @ 2024-10-28 19:55 UTC (permalink / raw)
To: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev
Cc: dev
This patch renames the existing core power library source files
from rte_power.* to rte_power_cpufreq.* for better clarity.
v9:
- documentation update
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
app/test/test_power.c | 2 +-
app/test/test_power_cpufreq.c | 2 +-
app/test/test_power_kvm_vm.c | 2 +-
doc/api/doxy-api-index.md | 2 +-
examples/distributor/main.c | 2 +-
examples/l3fwd-power/main.c | 2 +-
examples/l3fwd-power/perf_core.c | 2 +-
examples/vm_power_manager/channel_monitor.c | 2 +-
examples/vm_power_manager/channel_monitor.h | 2 +-
examples/vm_power_manager/guest_cli/main.c | 2 +-
examples/vm_power_manager/guest_cli/vm_power_cli_guest.c | 2 +-
examples/vm_power_manager/power_manager.c | 2 +-
lib/power/meson.build | 4 ++--
lib/power/{rte_power.c => rte_power_cpufreq.c} | 2 +-
lib/power/{rte_power.h => rte_power_cpufreq.h} | 4 ++--
lib/power/rte_power_pmd_mgmt.h | 2 +-
16 files changed, 18 insertions(+), 18 deletions(-)
rename lib/power/{rte_power.c => rte_power_cpufreq.c} (99%)
rename lib/power/{rte_power.h => rte_power_cpufreq.h} (99%)
diff --git a/app/test/test_power.c b/app/test/test_power.c
index 5df5848c70..38507411bd 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -22,7 +22,7 @@ test_power(void)
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
static int
test_power(void)
diff --git a/app/test/test_power_cpufreq.c b/app/test/test_power_cpufreq.c
index f4522747d5..0331b37fe0 100644
--- a/app/test/test_power_cpufreq.c
+++ b/app/test/test_power_cpufreq.c
@@ -30,7 +30,7 @@ test_power_caps(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_LCORE_ID 2U
#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
diff --git a/app/test/test_power_kvm_vm.c b/app/test/test_power_kvm_vm.c
index a7d104e973..1c72ba5a4e 100644
--- a/app/test/test_power_kvm_vm.c
+++ b/app/test/test_power_kvm_vm.c
@@ -20,7 +20,7 @@ test_power_kvm_vm(void)
}
#else
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define TEST_POWER_VM_LCORE_ID 0U
#define TEST_POWER_VM_LCORE_OUT_OF_BOUNDS (RTE_MAX_LCORE+1)
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 266c8b90dc..f82570b093 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -102,7 +102,7 @@ The public API headers are grouped by topics:
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
- [power/freq](@ref rte_power.h),
+ [power/freq](@ref rte_power_cpufreq.h),
[power/uncore](@ref rte_power_uncore.h),
[PMD power](@ref rte_power_pmd_mgmt.h)
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index ddbc387c20..ea44939fba 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -17,7 +17,7 @@
#include <rte_prefetch.h>
#include <rte_distributor.h>
#include <rte_pause.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#define RX_RING_SIZE 1024
#define TX_RING_SIZE 1024
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 6bd76515e6..272e069207 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -41,7 +41,7 @@
#include <rte_udp.h>
#include <rte_string_fns.h>
#include <rte_timer.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include <rte_metrics.h>
#include <rte_telemetry.h>
diff --git a/examples/l3fwd-power/perf_core.c b/examples/l3fwd-power/perf_core.c
index 6c0f7ea213..1b5419119a 100644
--- a/examples/l3fwd-power/perf_core.c
+++ b/examples/l3fwd-power/perf_core.c
@@ -10,7 +10,7 @@
#include <rte_common.h>
#include <rte_memory.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_string_fns.h>
#include "perf_core.h"
diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index f21556e27d..d4e0d685c1 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -31,7 +31,7 @@
#ifdef RTE_NET_I40E
#include <rte_pmd_i40e.h>
#endif
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <libvirt/libvirt.h>
#include "channel_monitor.h"
diff --git a/examples/vm_power_manager/channel_monitor.h b/examples/vm_power_manager/channel_monitor.h
index ab69524af5..a9a257abd3 100644
--- a/examples/vm_power_manager/channel_monitor.h
+++ b/examples/vm_power_manager/channel_monitor.h
@@ -5,7 +5,7 @@
#ifndef CHANNEL_MONITOR_H_
#define CHANNEL_MONITOR_H_
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "channel_manager.h"
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
index 9da50020ac..6246cbd6b4 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -9,7 +9,7 @@
#include <string.h>
#include <rte_lcore.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_log.h>
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 5eddb47847..803b6d1f82 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -18,7 +18,7 @@
#include <rte_lcore.h>
#include <rte_ethdev.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include "vm_power_cli_guest.h"
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 0355a7f4bc..522c713ff4 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -15,7 +15,7 @@
#include <sys/types.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#include <rte_spinlock.h>
#include "channel_manager.h"
diff --git a/lib/power/meson.build b/lib/power/meson.build
index 5fa5d062e3..4f4dc19687 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,14 +13,14 @@ if not is_linux
endif
sources = files(
'power_common.c',
- 'rte_power.c',
+ 'rte_power_cpufreq.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
)
headers = files(
'power_cpufreq.h',
'power_uncore_ops.h',
- 'rte_power.h',
+ 'rte_power_cpufreq.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
diff --git a/lib/power/rte_power.c b/lib/power/rte_power_cpufreq.c
similarity index 99%
rename from lib/power/rte_power.c
rename to lib/power/rte_power_cpufreq.c
index 90c9cf2198..86a7180c49 100644
--- a/lib/power/rte_power.c
+++ b/lib/power/rte_power_cpufreq.c
@@ -5,7 +5,7 @@
#include <rte_spinlock.h>
#include <rte_debug.h>
-#include "rte_power.h"
+#include "rte_power_cpufreq.h"
#include "power_common.h"
static enum power_management_env global_default_env = PM_ENV_NOT_SET;
diff --git a/lib/power/rte_power.h b/lib/power/rte_power_cpufreq.h
similarity index 99%
rename from lib/power/rte_power.h
rename to lib/power/rte_power_cpufreq.h
index 7d566551bd..b68d8c0bbc 100644
--- a/lib/power/rte_power.h
+++ b/lib/power/rte_power_cpufreq.h
@@ -3,8 +3,8 @@
* Copyright(c) 2024 Advanced Micro Devices, Inc.
*/
-#ifndef _RTE_POWER_H
-#define _RTE_POWER_H
+#ifndef _RTE_POWER_CPUFREQ_H
+#define _RTE_POWER_CPUFREQ_H
/**
* @file
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 807e454096..58c25bc3ff 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -13,7 +13,7 @@
#include <stdint.h>
#include <rte_log.h>
-#include <rte_power.h>
+#include <rte_power_cpufreq.h>
#ifdef __cplusplus
extern "C" {
--
2.34.1
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v10 3/6] test/power: removed function pointer validations
2024-10-28 19:55 ` [PATCH v10 3/6] test/power: removed function pointer validations Sivaprasad Tummala
@ 2024-11-10 10:11 ` Thomas Monjalon
0 siblings, 0 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 10:11 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
28/10/2024 20:55, Sivaprasad Tummala:
> After refactoring the power library, power management operations are now
> consistently supported regardless of the operating environment, making
> function pointer checks unnecessary and thus removed from applications.
>
> v2:
> - removed function pointer validation in l3fwd-power app.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
This patch must be first, because the rework makes the compiler stop on this test.
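To make the ordering concern concrete, here is a minimal sketch, assuming the reworked API exposes ordinary functions that dispatch through an ops table rather than global function pointers. Every demo_* name is hypothetical and invented for this illustration; none of them is a DPDK symbol. Once the entry point is an ordinary function, a test such as "if (demo_freq_down == NULL)" compares an address that can never be null, which GCC and Clang typically report under -Waddress, and a build that treats warnings as errors stops there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, standing in for a registered power driver. */
struct demo_ops {
	int (*freq_down)(unsigned int lcore_id);
};

/* Hypothetical driver callback: pretend the frequency dropped one step. */
static int
demo_driver_freq_down(unsigned int lcore_id)
{
	(void)lcore_id;
	return 1;
}

static const struct demo_ops demo_driver_ops = {
	.freq_down = demo_driver_freq_down,
};

/* NULL until demo_init() selects a driver. */
static const struct demo_ops *demo_active_ops;

static void
demo_init(void)
{
	assert(demo_driver_ops.freq_down != NULL);
	demo_active_ops = &demo_driver_ops;
}

/*
 * Reworked style: an ordinary function that is always callable.
 * A missing driver is reported through the return value, so callers
 * no longer need (or are able) to test the symbol against NULL.
 */
static int
demo_freq_down(unsigned int lcore_id)
{
	if (demo_active_ops == NULL || demo_active_ops->freq_down == NULL)
		return -1;
	return demo_active_ops->freq_down(lcore_id);
}
```

In this sketch a caller simply checks the return value of demo_freq_down(), which is negative until demo_init() has run.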
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v10 1/6] power: refactor core power management library
2024-10-28 19:55 ` [PATCH v10 1/6] power: refactor core " Sivaprasad Tummala
@ 2024-11-10 10:40 ` Thomas Monjalon
0 siblings, 0 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 10:40 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
28/10/2024 20:55, Sivaprasad Tummala:
> drivers/meson.build | 1 +
> .../power/acpi/acpi_cpufreq.c | 22 +-
> .../power/acpi/acpi_cpufreq.h | 6 +-
> drivers/power/acpi/meson.build | 10 +
> .../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
> .../power/amd_pstate/amd_pstate_cpufreq.h | 10 +-
> drivers/power/amd_pstate/meson.build | 10 +
> .../power/cppc/cppc_cpufreq.c | 22 +-
> .../power/cppc/cppc_cpufreq.h | 8 +-
> drivers/power/cppc/meson.build | 10 +
> .../power/kvm_vm}/guest_channel.c | 2 +-
> .../power/kvm_vm}/guest_channel.h | 0
> .../power/kvm_vm/kvm_vm.c | 22 +-
> .../power/kvm_vm/kvm_vm.h | 6 +-
> drivers/power/kvm_vm/meson.build | 14 +
> drivers/power/meson.build | 12 +
> drivers/power/pstate/meson.build | 10 +
> .../power/pstate/pstate_cpufreq.c | 22 +-
> .../power/pstate/pstate_cpufreq.h | 6 +-
For consistency, we must name it intel_pstate, not just "pstate".
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors
2024-10-28 19:55 ` [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
@ 2024-11-10 10:52 ` Thomas Monjalon
0 siblings, 0 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 10:52 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
28/10/2024 20:55, Sivaprasad Tummala:
> --- /dev/null
> +++ b/drivers/power/amd_uncore/amd_uncore.h
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
As said in previous versions, you don't need this in an internal header.
[...]
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* POWER_INTEL_UNCORE_H */
ah ah :)
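For reference, a minimal sketch of the layout both remarks point at: an internal header with no extern "C" wrapper and a closing guard comment that matches the guard macro. The file name, guard macro and helper are hypothetical, chosen only for illustration; the helper mirrors the one-die-per-valid-package behaviour of the driver in this patch.

```c
/* demo_uncore.h: hypothetical internal (non-installed) driver header */
#ifndef DEMO_UNCORE_H
#define DEMO_UNCORE_H

#include <stdint.h>

/*
 * No extern "C" block: this header is assumed to be included only by
 * C sources inside the driver, never by C++ applications.
 */

/* Return 1 die for a valid package ID, 0 for an invalid one. */
static inline unsigned int
demo_uncore_num_dies(unsigned int pkg, unsigned int num_pkgs)
{
	return pkg < num_pkgs ? 1 : 0;
}

#endif /* DEMO_UNCORE_H */
```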
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v10 5/6] maintainers: update for drivers/power
2024-10-28 19:55 ` [PATCH v10 5/6] maintainers: update for drivers/power Sivaprasad Tummala
@ 2024-11-10 10:54 ` Thomas Monjalon
0 siblings, 0 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 10:54 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
28/10/2024 20:55, Sivaprasad Tummala:
> Update maintainers for drivers/power/*.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
> MAINTAINERS | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e25e9465b5..df7a756612 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1753,6 +1753,7 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
> M: David Hunt <david.hunt@intel.com>
> M: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> F: lib/power/
> +F: drivers/power/*
You must not add this wildcard, it does not work.
And this patch can be squashed in the rework.
^ permalink raw reply [flat|nested] 139+ messages in thread
* Re: [PATCH v10 0/6] power: refactor power management library
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
` (5 preceding siblings ...)
2024-10-28 19:55 ` [PATCH v10 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
@ 2024-11-10 18:35 ` Thomas Monjalon
2024-11-10 19:29 ` Stephen Hemminger
2024-11-12 8:20 ` David Marchand
6 siblings, 2 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 18:35 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
28/10/2024 20:55, Sivaprasad Tummala:
> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
>
> Sivaprasad Tummala (6):
> power: refactor core power management library
> power: refactor uncore power management library
> test/power: removed function pointer validations
> drivers/power: uncore support for AMD EPYC processors
> maintainers: update for drivers/power
> power: rename library sources for cpu frequency management
I'm a bit sad there are not more reviews.
I've moved the pointer check removal first,
renamed the intel_pstate files (not the functions),
fixed a few things like __cplusplus, include guards,
sorting and the maintainers file.
Applied
* Re: [PATCH v10 0/6] power: refactor power management library
2024-11-10 18:35 ` [PATCH v10 0/6] power: refactor power management library Thomas Monjalon
@ 2024-11-10 19:29 ` Stephen Hemminger
2024-11-10 23:40 ` Thomas Monjalon
2024-11-12 8:20 ` David Marchand
1 sibling, 1 reply; 139+ messages in thread
From: Stephen Hemminger @ 2024-11-10 19:29 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Sivaprasad Tummala, david.hunt, anatoly.burakov, jerinj,
radu.nicolau, gakhil, cristian.dumitrescu, lihuisong,
ferruh.yigit, konstantin.ananyev, dev
On Sun, 10 Nov 2024 19:35:55 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:
> 28/10/2024 20:55, Sivaprasad Tummala:
> > This patchset refactors the power management library, addressing both
> > core and uncore power management. The primary changes involve the
> > creation of dedicated directories for each driver within
> > 'drivers/power/core/*' and 'drivers/power/uncore/*'.
> >
> > This refactor significantly improves code organization, enhances
> > clarity, and boosts maintainability. It lays the foundation for more
> > focused development on individual drivers and facilitates seamless
> > integration of future enhancements, particularly the AMD uncore driver.
> >
> > Furthermore, this effort aims to streamline code maintenance by
> > consolidating common functions for cpufreq and cppc across various
> > core drivers, thus reducing code duplication.
> >
> > Sivaprasad Tummala (6):
> > power: refactor core power management library
> > power: refactor uncore power management library
> > test/power: removed function pointer validations
> > drivers/power: uncore support for AMD EPYC processors
> > maintainers: update for drivers/power
> > power: rename library sources for cpu frequency management
>
> I'm a bit sad there are not more reviews.
>
> I've moved the pointer check removal first,
> renamed the intel_pstate files (not the functions),
> fixed a few things like __cplusplus, include guards,
> sorting and the maintainers file.
>
> Applied
>
>
I think this broke the build. The next, unrelated change is failing in CI:
the patch fixing the initialization in mlx5 failed.
-------------------------------BEGIN LOGS----------------------------
####################################################################################
#### [Begin job log] "ubuntu-22.04-gcc-stdatomic" at step Build and test
####################################################################################
Message: drivers/event/skeleton: Defining dependency "event_skeleton"
Message: drivers/event/sw: Defining dependency "event_sw"
Message: drivers/event/octeontx: Defining dependency "event_octeontx"
Run-time dependency flexran_sdk_ldpc_decoder_5gnr found: NO (tried pkgconfig and cmake)
Message: drivers/baseband/acc: Defining dependency "baseband_acc"
Message: drivers/baseband/fpga_5gnr_fec: Defining dependency "baseband_fpga_5gnr_fec"
Message: drivers/baseband/fpga_lte_fec: Defining dependency "baseband_fpga_lte_fec"
Message: drivers/baseband/la12xx: Defining dependency "baseband_la12xx"
Message: drivers/baseband/null: Defining dependency "baseband_null"
Run-time dependency flexran_sdk_turbo found: NO (tried pkgconfig and cmake)
Run-time dependency flexran_sdk_ldpc_decoder_5gnr found: NO (tried pkgconfig and cmake)
Message: drivers/baseband/turbo_sw: Defining dependency "baseband_turbo_sw"
Has header "cuda.h" : NO
Message: drivers/power/acpi: Defining dependency "power_acpi"
Message: drivers/power/amd_pstate: Defining dependency "power_amd_pstate"
Library e_smi64 found: NO
Message: drivers/power/cppc: Defining dependency "power_cppc"
Message: drivers/power/intel_pstate: Defining dependency "power_intel_pstate"
Message: drivers/power/intel_uncore: Defining dependency "power_intel_uncore"
Message: drivers/power/kvm_vm: Defining dependency "power_kvm_vm"
drivers/meson.build:132:8: ERROR: Include dir power/pstate does not exist.
A full log can be found at /home/runner/work/dpdk/dpdk/build/meson-logs/meson-log.txt
##[error]Process completed with exit code 1.
####################################################################################
#### [End job log] "ubuntu-22.04-gcc-stdatomic" at step Build and test
####################################################################################
* Re: [PATCH v10 0/6] power: refactor power management library
2024-11-10 19:29 ` Stephen Hemminger
@ 2024-11-10 23:40 ` Thomas Monjalon
0 siblings, 0 replies; 139+ messages in thread
From: Thomas Monjalon @ 2024-11-10 23:40 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Sivaprasad Tummala, david.hunt, anatoly.burakov, jerinj,
radu.nicolau, gakhil, cristian.dumitrescu, lihuisong,
ferruh.yigit, konstantin.ananyev, dev
10/11/2024 20:29, Stephen Hemminger:
> On Sun, 10 Nov 2024 19:35:55 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
>
> > 28/10/2024 20:55, Sivaprasad Tummala:
> > > This patchset refactors the power management library, addressing both
> > > core and uncore power management. The primary changes involve the
> > > creation of dedicated directories for each driver within
> > > 'drivers/power/core/*' and 'drivers/power/uncore/*'.
> > >
> > > This refactor significantly improves code organization, enhances
> > > clarity, and boosts maintainability. It lays the foundation for more
> > > focused development on individual drivers and facilitates seamless
> > > integration of future enhancements, particularly the AMD uncore driver.
> > >
> > > Furthermore, this effort aims to streamline code maintenance by
> > > consolidating common functions for cpufreq and cppc across various
> > > core drivers, thus reducing code duplication.
> > >
> > > Sivaprasad Tummala (6):
> > > power: refactor core power management library
> > > power: refactor uncore power management library
> > > test/power: removed function pointer validations
> > > drivers/power: uncore support for AMD EPYC processors
> > > maintainers: update for drivers/power
> > > power: rename library sources for cpu frequency management
> >
> > I'm a bit sad there are not more reviews.
> >
> > I've moved the pointer check removal first,
> > renamed the intel_pstate files (not the functions),
> > fixed a few things like __cplusplus, include guards,
> > sorting and the maintainers file.
> >
> > Applied
> >
> >
>
> I think this broke the build. The next, unrelated change is failing in CI:
> the patch fixing the initialization in mlx5 failed.
Indeed.
Thanks for the notice.
It should be fixed now.
* Re: [PATCH v10 0/6] power: refactor power management library
2024-11-10 18:35 ` [PATCH v10 0/6] power: refactor power management library Thomas Monjalon
2024-11-10 19:29 ` Stephen Hemminger
@ 2024-11-12 8:20 ` David Marchand
2024-11-12 10:37 ` David Marchand
1 sibling, 1 reply; 139+ messages in thread
From: David Marchand @ 2024-11-12 8:20 UTC (permalink / raw)
To: Sivaprasad Tummala, Thomas Monjalon
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
Hello Siva, Thomas,
On Sun, Nov 10, 2024 at 7:36 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 28/10/2024 20:55, Sivaprasad Tummala:
> > This patchset refactors the power management library, addressing both
> > core and uncore power management. The primary changes involve the
> > creation of dedicated directories for each driver within
> > 'drivers/power/core/*' and 'drivers/power/uncore/*'.
> >
> > This refactor significantly improves code organization, enhances
> > clarity, and boosts maintainability. It lays the foundation for more
> > focused development on individual drivers and facilitates seamless
> > integration of future enhancements, particularly the AMD uncore driver.
> >
> > Furthermore, this effort aims to streamline code maintenance by
> > consolidating common functions for cpufreq and cppc across various
> > core drivers, thus reducing code duplication.
> >
> > Sivaprasad Tummala (6):
> > power: refactor core power management library
> > power: refactor uncore power management library
> > test/power: removed function pointer validations
> > drivers/power: uncore support for AMD EPYC processors
> > maintainers: update for drivers/power
> > power: rename library sources for cpu frequency management
>
> I'm a bit sad there are not more reviews.
>
> I've moved the pointer check removal first,
> renamed the intel_pstate files (not the functions),
> fixed a few things like __cplusplus, include guards,
> sorting and the maintainers file.
>
> Applied
This series breaks compilation of the vm_power_manager example as the
"guest channel" API symbols are not provided by the power library
(itself) anymore.
ninja: Entering directory `/home/dmarchan/builds/main/build-gcc-shared'
[3355/3373] Linking target examples/dpdk-guest_cli
FAILED: examples/dpdk-guest_cli
gcc -o examples/dpdk-guest_cli
examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_main.c.o
examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_parse.c.o
examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o
-Wl,--as-needed -Wl,--no-undefined -Wl,--no-as-needed
-Wl,--undefined-version -pthread -Wl,--start-group -lm -ldl -lnuma
-lfdt '-Wl,-rpath,$ORIGIN/../lib'
-Wl,-rpath-link,/home/dmarchan/builds/main/build-gcc-shared/lib
lib/librte_eal.so.25.0 lib/librte_kvargs.so.25.0
lib/librte_log.so.25.0 lib/librte_telemetry.so.25.0
lib/librte_mempool.so.25.0 lib/librte_ring.so.25.0
lib/librte_net.so.25.0 lib/librte_mbuf.so.25.0
lib/librte_ethdev.so.25.0 lib/librte_meter.so.25.0
lib/librte_cmdline.so.25.0 lib/librte_power.so.25.0
lib/librte_timer.so.25.0 -lpcap -lvirt /usr/lib64/libbsd.so
/usr/lib64/libarchive.so -Wl,--end-group
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `check_response_cmd':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:382:
undefined reference to `rte_power_guest_channel_receive_msg'
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `query_data':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
undefined reference to `rte_power_guest_channel_send_msg'
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `receive_capabilities':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:271:
undefined reference to `rte_power_guest_channel_receive_msg'
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `send_policy':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:476:
undefined reference to `rte_power_guest_channel_send_msg'
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `query_data':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
undefined reference to `rte_power_guest_channel_send_msg'
/usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
in function `receive_freq_list':
/home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:161:
undefined reference to `rte_power_guest_channel_receive_msg'
collect2: error: ld returned 1 exit status
[3357/3373] Generating drivers/rte_common_cnxk.sym_chk with a custom
command (wrapped by meson to capture output)
ninja: build stopped: subcommand failed.
Siva, please have a look quickly.
Here is a quick fix written before first coffee of the day:
$ git diff --cached
diff --git a/drivers/power/kvm_vm/meson.build b/drivers/power/kvm_vm/meson.build
index fe11179ab3..e921c012e9 100644
--- a/drivers/power/kvm_vm/meson.build
+++ b/drivers/power/kvm_vm/meson.build
@@ -10,5 +10,6 @@ sources = files(
'guest_channel.c',
'kvm_vm.c',
)
+headers = files('rte_power_guest_channel.h')
deps += ['power']
diff --git a/lib/power/rte_power_guest_channel.h
b/drivers/power/kvm_vm/rte_power_guest_channel.h
similarity index 100%
rename from lib/power/rte_power_guest_channel.h
rename to drivers/power/kvm_vm/rte_power_guest_channel.h
diff --git a/drivers/power/kvm_vm/version.map b/drivers/power/kvm_vm/version.map
new file mode 100644
index 0000000000..ffa676624b
--- /dev/null
+++ b/drivers/power/kvm_vm/version.map
@@ -0,0 +1,8 @@
+DPDK_25 {
+ global:
+
+ rte_power_guest_channel_receive_msg;
+ rte_power_guest_channel_send_msg;
+
+ local: *;
+};
diff --git a/examples/vm_power_manager/guest_cli/meson.build
b/examples/vm_power_manager/guest_cli/meson.build
index a69f809e3b..bc3916a170 100644
--- a/examples/vm_power_manager/guest_cli/meson.build
+++ b/examples/vm_power_manager/guest_cli/meson.build
@@ -6,7 +6,7 @@
# To build this example as a standalone application with an already-installed
# DPDK instance, use 'make'
-deps += ['power']
+deps += ['power', 'power/kvm_vm']
sources = files(
'main.c',
diff --git a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
index 803b6d1f82..14d1f3dd95 100644
--- a/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
+++ b/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
@@ -19,6 +19,7 @@
#include <rte_ethdev.h>
#include <rte_power_cpufreq.h>
+#include <rte_power_guest_channel.h>
#include "vm_power_cli_guest.h"
diff --git a/examples/vm_power_manager/meson.build
b/examples/vm_power_manager/meson.build
index b866d8fd54..1903b68ed9 100644
--- a/examples/vm_power_manager/meson.build
+++ b/examples/vm_power_manager/meson.build
@@ -6,7 +6,7 @@
# To build this example as a standalone application with an already-installed
# DPDK instance, use 'make'
-deps += ['power']
+deps += ['power', 'power/kvm_vm']
if dpdk_conf.has('RTE_NET_BNXT')
deps += ['net_bnxt']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index cd7c83b6e9..b3a7bc7b2e 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -22,7 +22,6 @@ headers = files(
'power_cpufreq.h',
'power_uncore_ops.h',
'rte_power_cpufreq.h',
- 'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_qos.h',
'rte_power_uncore.h',
diff --git a/lib/power/rte_power_cpufreq.h b/lib/power/rte_power_cpufreq.h
index 73f9820bdf..82d274214b 100644
--- a/lib/power/rte_power_cpufreq.h
+++ b/lib/power/rte_power_cpufreq.h
@@ -13,7 +13,6 @@
#include <rte_common.h>
#include <rte_log.h>
-#include <rte_power_guest_channel.h>
#include "power_cpufreq.h"
diff --git a/lib/power/version.map b/lib/power/version.map
index 920c8e79b3..9a36046a64 100644
--- a/lib/power/version.map
+++ b/lib/power/version.map
@@ -16,8 +16,6 @@ DPDK_25 {
rte_power_get_env;
rte_power_get_freq;
rte_power_get_uncore_freq;
- rte_power_guest_channel_receive_msg;
- rte_power_guest_channel_send_msg;
rte_power_init;
rte_power_pmd_mgmt_get_emptypoll_max;
rte_power_pmd_mgmt_get_pause_duration;
--
David Marchand
* Re: [PATCH v10 0/6] power: refactor power management library
2024-11-12 8:20 ` David Marchand
@ 2024-11-12 10:37 ` David Marchand
2024-11-12 14:50 ` Tummala, Sivaprasad
0 siblings, 1 reply; 139+ messages in thread
From: David Marchand @ 2024-11-12 10:37 UTC (permalink / raw)
To: Sivaprasad Tummala, Thomas Monjalon
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
On Tue, Nov 12, 2024 at 9:20 AM David Marchand
<david.marchand@redhat.com> wrote:
>
> Hello Siva, Thomas,
>
> On Sun, Nov 10, 2024 at 7:36 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 28/10/2024 20:55, Sivaprasad Tummala:
> > > This patchset refactors the power management library, addressing both
> > > core and uncore power management. The primary changes involve the
> > > creation of dedicated directories for each driver within
> > > 'drivers/power/core/*' and 'drivers/power/uncore/*'.
> > >
> > > This refactor significantly improves code organization, enhances
> > > clarity, and boosts maintainability. It lays the foundation for more
> > > focused development on individual drivers and facilitates seamless
> > > integration of future enhancements, particularly the AMD uncore driver.
> > >
> > > Furthermore, this effort aims to streamline code maintenance by
> > > consolidating common functions for cpufreq and cppc across various
> > > core drivers, thus reducing code duplication.
> > >
> > > Sivaprasad Tummala (6):
> > > power: refactor core power management library
> > > power: refactor uncore power management library
> > > test/power: removed function pointer validations
> > > drivers/power: uncore support for AMD EPYC processors
> > > maintainers: update for drivers/power
> > > power: rename library sources for cpu frequency management
> >
> > I'm a bit sad there are not more reviews.
> >
> > I've moved the pointer check removal first,
> > renamed the intel_pstate files (not the functions),
> > fixed a few things like __cplusplus, include guards,
> > sorting and the maintainers file.
> >
> > Applied
>
> This series breaks compilation of the vm_power_manager example as the
> "guest channel" API symbols are not provided by the power library
> (itself) anymore.
>
> ninja: Entering directory `/home/dmarchan/builds/main/build-gcc-shared'
> [3355/3373] Linking target examples/dpdk-guest_cli
> FAILED: examples/dpdk-guest_cli
> gcc -o examples/dpdk-guest_cli
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_main.c.o
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_parse.c.o
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o
> -Wl,--as-needed -Wl,--no-undefined -Wl,--no-as-needed
> -Wl,--undefined-version -pthread -Wl,--start-group -lm -ldl -lnuma
> -lfdt '-Wl,-rpath,$ORIGIN/../lib'
> -Wl,-rpath-link,/home/dmarchan/builds/main/build-gcc-shared/lib
> lib/librte_eal.so.25.0 lib/librte_kvargs.so.25.0
> lib/librte_log.so.25.0 lib/librte_telemetry.so.25.0
> lib/librte_mempool.so.25.0 lib/librte_ring.so.25.0
> lib/librte_net.so.25.0 lib/librte_mbuf.so.25.0
> lib/librte_ethdev.so.25.0 lib/librte_meter.so.25.0
> lib/librte_cmdline.so.25.0 lib/librte_power.so.25.0
> lib/librte_timer.so.25.0 -lpcap -lvirt /usr/lib64/libbsd.so
> /usr/lib64/libarchive.so -Wl,--end-group
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `check_response_cmd':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:382:
> undefined reference to `rte_power_guest_channel_receive_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `query_data':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `receive_capabilities':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:271:
> undefined reference to `rte_power_guest_channel_receive_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `send_policy':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:476:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `query_data':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `receive_freq_list':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:161:
> undefined reference to `rte_power_guest_channel_receive_msg'
> collect2: error: ld returned 1 exit status
> [3357/3373] Generating drivers/rte_common_cnxk.sym_chk with a custom
> command (wrapped by meson to capture output)
> ninja: build stopped: subcommand failed.
>
> Siva, please have a look quickly.
>
> Here is a quick fix written before first coffee of the day.
I ended up sending this as a patch:
https://patchwork.dpdk.org/project/dpdk/patch/20241112103454.1543861-1-david.marchand@redhat.com/
--
David Marchand
* RE: [PATCH v10 0/6] power: refactor power management library
2024-11-12 10:37 ` David Marchand
@ 2024-11-12 14:50 ` Tummala, Sivaprasad
0 siblings, 0 replies; 139+ messages in thread
From: Tummala, Sivaprasad @ 2024-11-12 14:50 UTC (permalink / raw)
To: David Marchand, Thomas Monjalon
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, Yigit, Ferruh,
konstantin.ananyev, dev
Hi David,
-----Original Message-----
From: David Marchand <david.marchand@redhat.com>
Sent: Tuesday, November 12, 2024 4:08 PM
To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>; Thomas Monjalon <thomas@monjalon.net>
Cc: david.hunt@intel.com; anatoly.burakov@intel.com; jerinj@marvell.com; radu.nicolau@intel.com; gakhil@marvell.com; cristian.dumitrescu@intel.com; lihuisong@huawei.com; Yigit, Ferruh <Ferruh.Yigit@amd.com>; konstantin.ananyev@huawei.com; dev@dpdk.org
Subject: Re: [PATCH v10 0/6] power: refactor power management library
On Tue, Nov 12, 2024 at 9:20 AM David Marchand <david.marchand@redhat.com> wrote:
>
> Hello Siva, Thomas,
>
> On Sun, Nov 10, 2024 at 7:36 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 28/10/2024 20:55, Sivaprasad Tummala:
> > > This patchset refactors the power management library, addressing
> > > both core and uncore power management. The primary changes involve
> > > the creation of dedicated directories for each driver within
> > > 'drivers/power/core/*' and 'drivers/power/uncore/*'.
> > >
> > > This refactor significantly improves code organization, enhances
> > > clarity, and boosts maintainability. It lays the foundation for
> > > more focused development on individual drivers and facilitates
> > > seamless integration of future enhancements, particularly the AMD uncore driver.
> > >
> > > Furthermore, this effort aims to streamline code maintenance by
> > > consolidating common functions for cpufreq and cppc across various
> > > core drivers, thus reducing code duplication.
> > >
> > > Sivaprasad Tummala (6):
> > > power: refactor core power management library
> > > power: refactor uncore power management library
> > > test/power: removed function pointer validations
> > > drivers/power: uncore support for AMD EPYC processors
> > > maintainers: update for drivers/power
> > > power: rename library sources for cpu frequency management
> >
> > I'm a bit sad there are not more reviews.
> >
> > I've moved the pointer check removal first, renamed the intel_pstate
> > files (not the functions), fixed a few things like __cplusplus,
> > include guards, sorting and the maintainers file.
> >
> > Applied
>
> This series breaks compilation of the vm_power_manager example as the
> "guest channel" API symbols are not provided by the power library
> (itself) anymore.
>
> ninja: Entering directory `/home/dmarchan/builds/main/build-gcc-shared'
> [3355/3373] Linking target examples/dpdk-guest_cli
> FAILED: examples/dpdk-guest_cli
> gcc -o examples/dpdk-guest_cli
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_main.c.o
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_parse.c.o
> examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o
> -Wl,--as-needed -Wl,--no-undefined -Wl,--no-as-needed
> -Wl,--undefined-version -pthread -Wl,--start-group -lm -ldl -lnuma
> -lfdt '-Wl,-rpath,$ORIGIN/../lib'
> -Wl,-rpath-link,/home/dmarchan/builds/main/build-gcc-shared/lib
> lib/librte_eal.so.25.0 lib/librte_kvargs.so.25.0
> lib/librte_log.so.25.0 lib/librte_telemetry.so.25.0
> lib/librte_mempool.so.25.0 lib/librte_ring.so.25.0
> lib/librte_net.so.25.0 lib/librte_mbuf.so.25.0
> lib/librte_ethdev.so.25.0 lib/librte_meter.so.25.0
> lib/librte_cmdline.so.25.0 lib/librte_power.so.25.0
> lib/librte_timer.so.25.0 -lpcap -lvirt /usr/lib64/libbsd.so
> /usr/lib64/libarchive.so -Wl,--end-group
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `check_response_cmd':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:382:
> undefined reference to `rte_power_guest_channel_receive_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `query_data':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `receive_capabilities':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:271:
> undefined reference to `rte_power_guest_channel_receive_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `send_policy':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:476:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `query_data':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:147:
> undefined reference to `rte_power_guest_channel_send_msg'
> /usr/bin/ld: examples/dpdk-guest_cli.p/vm_power_manager_guest_cli_vm_power_cli_guest.c.o:
> in function `receive_freq_list':
> /home/dmarchan/builds/main/build-gcc-shared/../../../git/pub/dpdk.org/main/examples/vm_power_manager/guest_cli/vm_power_cli_guest.c:161:
> undefined reference to `rte_power_guest_channel_receive_msg'
> collect2: error: ld returned 1 exit status
> [3357/3373] Generating drivers/rte_common_cnxk.sym_chk with a custom
> command (wrapped by meson to capture output)
> ninja: build stopped: subcommand failed.
>
> Siva, please have a look quickly.
>
> Here is a quick fix written before first coffee of the day.
LGTM!
--
David Marchand
* Re: [PATCH v10 2/6] power: refactor uncore power management library
2024-10-28 19:55 ` [PATCH v10 2/6] power: refactor uncore " Sivaprasad Tummala
@ 2024-11-16 0:55 ` Stephen Hemminger
0 siblings, 0 replies; 139+ messages in thread
From: Stephen Hemminger @ 2024-11-16 0:55 UTC (permalink / raw)
To: Sivaprasad Tummala
Cc: david.hunt, anatoly.burakov, jerinj, radu.nicolau, gakhil,
cristian.dumitrescu, lihuisong, ferruh.yigit, konstantin.ananyev,
dev
On Mon, 28 Oct 2024 19:55:52 +0000
Sivaprasad Tummala <sivaprasad.tummala@amd.com> wrote:
> +	/* Auto Detect Environment */
> +	RTE_TAILQ_FOREACH(ops, &uncore_ops_list, next)
> +		if (ops) {
> +			POWER_LOG(INFO,
> +				"Attempting to initialise %s power management...",
> +				ops->name);
> +			ret = ops->init(pkg, die);
> +			if (ret == 0) {
> +				for (env = 0; env < RTE_DIM(uncore_env_str); env++)
> +					if (strncmp(ops->name, uncore_env_str[env],
> +						RTE_POWER_UNCORE_DRIVER_NAMESZ) == 0) {
> +						rte_power_set_uncore_env(env);
> +						goto out;
> +					}
> +			}
> +		}
> out:
Static analyzer complains:
lib/power/rte_power_uncore.c:113:1: warning: V547 Expression 'ops' is always true.
Since the macro RTE_TAILQ_FOREACH() iterates until ops is NULL, that whole if() part
can be removed.
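
A minimal sketch of the simplification suggested above, for illustration
only. It uses TAILQ_FOREACH from <sys/queue.h> as a stand-in for
RTE_TAILQ_FOREACH, and every type, driver name and helper in it
(struct uncore_ops, env_str, init_fail, init_ok, detect_env, demo_detect)
is a hypothetical placeholder rather than code from the patch. The point
it shows is that the loop body only runs while ops is non-NULL, so the
inner "if (ops)" test adds nothing.

```c
#include <stdio.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for RTE_POWER_UNCORE_DRIVER_NAMESZ. */
#define DRIVER_NAMESZ 24

/* Hypothetical ops structure: one list entry per registered driver. */
struct uncore_ops {
	TAILQ_ENTRY(uncore_ops) next;
	char name[DRIVER_NAMESZ];
	int (*init)(unsigned int pkg, unsigned int die);
};
TAILQ_HEAD(uncore_ops_head, uncore_ops);

/* Hypothetical environment names, indexed by environment id. */
static const char *const env_str[] = { "not-set", "driver-a", "driver-b" };
#define ENV_COUNT (sizeof(env_str) / sizeof(env_str[0]))

/* Two dummy drivers: one whose init fails, one whose init succeeds. */
static int init_fail(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return -1;
}

static int init_ok(unsigned int pkg, unsigned int die)
{
	(void)pkg; (void)die;
	return 0;
}

/* Return the id of the first environment whose driver initialises, or -1. */
int detect_env(struct uncore_ops_head *list, unsigned int pkg, unsigned int die)
{
	struct uncore_ops *ops;
	size_t env;

	TAILQ_FOREACH(ops, list, next) {
		/* No "if (ops)" here: the macro ends the loop once ops is NULL. */
		printf("Attempting to initialise %s power management...\n",
			ops->name);
		if (ops->init(pkg, die) != 0)
			continue;
		for (env = 0; env < ENV_COUNT; env++)
			if (strncmp(ops->name, env_str[env], DRIVER_NAMESZ) == 0)
				return (int)env;
	}
	return -1;
}

/* Build a two-entry list and run the detection over it. */
int demo_detect(void)
{
	struct uncore_ops_head list = TAILQ_HEAD_INITIALIZER(list);
	struct uncore_ops a = { .name = "driver-a", .init = init_fail };
	struct uncore_ops b = { .name = "driver-b", .init = init_ok };

	TAILQ_INSERT_TAIL(&list, &a, next);
	TAILQ_INSERT_TAIL(&list, &b, next);
	return detect_env(&list, 0, 0);
}
```

With the two dummy drivers registered in that order, demo_detect() skips
"driver-a" (its init fails) and returns the index of "driver-b". Returning
from the loop instead of using "goto out" is a choice made for this sketch,
not something the review asked for.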
Thread overview: 139+ messages
2024-02-20 15:33 [RFC PATCH 0/2] power: refactor power management library Sivaprasad Tummala
2024-02-20 15:33 ` Sivaprasad Tummala
2024-02-20 15:33 ` [RFC PATCH 1/2] power: refactor core " Sivaprasad Tummala
2024-02-27 16:18 ` Ferruh Yigit
2024-02-29 7:10 ` Tummala, Sivaprasad
2024-02-28 12:51 ` Ferruh Yigit
2024-03-01 2:56 ` lihuisong (C)
2024-03-01 10:39 ` Hunt, David
2024-03-05 4:35 ` Tummala, Sivaprasad
2024-02-20 15:33 ` [RFC PATCH 2/2] power: refactor uncore " Sivaprasad Tummala
2024-03-01 3:33 ` lihuisong (C)
2024-03-01 6:06 ` Tummala, Sivaprasad
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor " Sivaprasad Tummala
2024-07-20 16:50 ` [PATCH v1 1/4] power: refactor core " Sivaprasad Tummala
2024-07-23 10:03 ` Hunt, David
2024-07-27 18:44 ` Tummala, Sivaprasad
2024-07-20 16:50 ` [PATCH v1 2/4] power: refactor uncore " Sivaprasad Tummala
2024-07-23 10:26 ` Hunt, David
2024-07-20 16:50 ` [PATCH v1 3/4] test/power: removed function pointer validations Sivaprasad Tummala
2024-07-22 10:49 ` Hunt, David
2024-07-27 18:45 ` Tummala, Sivaprasad
2024-07-20 16:50 ` [PATCH v1 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
2024-07-23 10:33 ` Hunt, David
2024-07-27 18:46 ` Tummala, Sivaprasad
2024-07-20 16:50 ` [PATCH v1 0/4] power: refactor power management library Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 " Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 1/4] power: refactor core " Sivaprasad Tummala
2024-08-26 15:26 ` Stephen Hemminger
2024-10-07 19:25 ` Tummala, Sivaprasad
2024-08-27 8:21 ` lihuisong (C)
2024-09-12 11:17 ` Tummala, Sivaprasad
2024-09-13 7:34 ` lihuisong (C)
2024-09-18 8:37 ` Tummala, Sivaprasad
2024-09-19 3:37 ` lihuisong (C)
2024-08-26 13:06 ` [PATCH v2 2/4] power: refactor uncore " Sivaprasad Tummala
2024-08-27 13:02 ` lihuisong (C)
2024-10-08 6:19 ` Tummala, Sivaprasad
2024-10-22 2:05 ` lihuisong (C)
2024-08-26 13:06 ` [PATCH v2 3/4] test/power: removed function pointer validations Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 4/4] power/amd_uncore: uncore power management support for AMD EPYC processors Sivaprasad Tummala
2024-08-26 13:06 ` [PATCH v2 0/4] power: refactor power management library Sivaprasad Tummala
2024-10-07 18:01 ` Stephen Hemminger
2024-10-08 17:27 ` [PATCH v3 0/5] " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-08 17:27 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-08 17:43 ` Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 1/5] power: refactor core " Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-08 17:43 ` [PATCH v3 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-12 17:44 ` Stephen Hemminger
2024-10-15 2:49 ` [PATCH v4 " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 1/5] power: refactor core " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 4/5] power/amd_uncore: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-15 2:49 ` [PATCH v4 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-15 3:15 ` Stephen Hemminger
2024-10-17 10:26 ` [PATCH v5 " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 1/5] power: refactor core " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-17 10:26 ` [PATCH v5 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-17 16:17 ` Stephen Hemminger
2024-10-20 9:22 ` [PATCH v6 " Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 1/5] power: refactor core " Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-20 23:25 ` Stephen Hemminger
2024-10-20 23:28 ` Stephen Hemminger
2024-10-20 9:22 ` [PATCH v6 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-20 9:22 ` [PATCH v6 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 " Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 1/5] power: refactor core " Sivaprasad Tummala
2024-10-22 1:20 ` Stephen Hemminger
2024-10-22 6:45 ` Tummala, Sivaprasad
2024-10-22 3:03 ` lihuisong (C)
2024-10-22 7:13 ` Tummala, Sivaprasad
2024-10-22 8:36 ` lihuisong (C)
2024-10-21 4:07 ` [PATCH v7 2/5] power: refactor uncore " Sivaprasad Tummala
2024-10-22 1:18 ` Stephen Hemminger
2024-10-22 6:45 ` Tummala, Sivaprasad
2024-10-22 3:17 ` lihuisong (C)
2024-10-22 6:46 ` Tummala, Sivaprasad
2024-10-21 4:07 ` [PATCH v7 3/5] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 4/5] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 5/5] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-21 4:07 ` [PATCH v7 0/5] power: refactor power management library Sivaprasad Tummala
2024-10-22 1:34 ` Stephen Hemminger
2024-10-22 18:41 ` [PATCH v8 0/6] " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 1/6] power: refactor core " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 2/6] power: refactor uncore " Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 3/6] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 5/6] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
2024-10-22 18:41 ` [PATCH v8 0/6] power: refactor power management library Sivaprasad Tummala
2024-10-23 1:40 ` Stephen Hemminger
2024-10-23 5:11 ` [PATCH v9 " Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 1/6] power: refactor core " Sivaprasad Tummala
2024-10-26 3:06 ` lihuisong (C)
2024-10-26 5:22 ` Tummala, Sivaprasad
2024-10-26 7:03 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 2/6] power: refactor uncore " Sivaprasad Tummala
2024-10-26 3:12 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 3/6] test/power: removed function pointer validations Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 5/6] maintainers: update for drivers/power Sivaprasad Tummala
2024-10-23 5:11 ` [PATCH v9 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
2024-10-26 4:09 ` lihuisong (C)
2024-10-23 5:11 ` [PATCH v9 0/6] power: refactor power management library Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 " Sivaprasad Tummala
2024-10-28 19:55 ` [PATCH v10 1/6] power: refactor core " Sivaprasad Tummala
2024-11-10 10:40 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 2/6] power: refactor uncore " Sivaprasad Tummala
2024-11-16 0:55 ` Stephen Hemminger
2024-10-28 19:55 ` [PATCH v10 3/6] test/power: removed function pointer validations Sivaprasad Tummala
2024-11-10 10:11 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 4/6] drivers/power: uncore support for AMD EPYC processors Sivaprasad Tummala
2024-11-10 10:52 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 5/6] maintainers: update for drivers/power Sivaprasad Tummala
2024-11-10 10:54 ` Thomas Monjalon
2024-10-28 19:55 ` [PATCH v10 6/6] power: rename library sources for cpu frequency management Sivaprasad Tummala
2024-11-10 18:35 ` [PATCH v10 0/6] power: refactor power management library Thomas Monjalon
2024-11-10 19:29 ` Stephen Hemminger
2024-11-10 23:40 ` Thomas Monjalon
2024-11-12 8:20 ` David Marchand
2024-11-12 10:37 ` David Marchand
2024-11-12 14:50 ` Tummala, Sivaprasad