DPDK patches and discussions
* [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
@ 2023-04-13 11:53 Sivaprasad Tummala
  2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
                   ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-13 11:53 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

Add a new CPUID flag to indicate support for the monitorx instruction
on AMD EPYC processors.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/include/generic/rte_cpuflags.h | 2 ++
 lib/eal/x86/include/rte_cpuflags.h     | 1 +
 lib/eal/x86/rte_cpuflags.c             | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/lib/eal/include/generic/rte_cpuflags.h b/lib/eal/include/generic/rte_cpuflags.h
index d35551e931..db653a8dd7 100644
--- a/lib/eal/include/generic/rte_cpuflags.h
+++ b/lib/eal/include/generic/rte_cpuflags.h
@@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
 	/**< indicates support for rte_power_pause function */
 	uint32_t power_monitor_multi : 1;
 	/**< indicates support for rte_power_monitor_multi function */
+	uint32_t amd_power_monitorx : 1;
+	/**< indicates amd support for rte_power_monitor function */
 };
 
 /**
diff --git a/lib/eal/x86/include/rte_cpuflags.h b/lib/eal/x86/include/rte_cpuflags.h
index 92e90fb6e0..d3e608533a 100644
--- a/lib/eal/x86/include/rte_cpuflags.h
+++ b/lib/eal/x86/include/rte_cpuflags.h
@@ -133,6 +133,7 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_AVX512VP2INTERSECT,     /**< AVX512 Two Register Intersection */
 
 	RTE_CPUFLAG_WAITPKG,                /**< UMONITOR/UMWAIT/TPAUSE */
+	RTE_CPUFLAG_MONITORX,               /**< MONITORX */
 
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
index d6b518251b..ae2e0a8470 100644
--- a/lib/eal/x86/rte_cpuflags.c
+++ b/lib/eal/x86/rte_cpuflags.c
@@ -133,6 +133,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
 
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX,  0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX,  4)
+	FEAT_DEF(MONITORX, 0x80000001, 0, RTE_REG_ECX,  29)
 
 	FEAT_DEF(SYSCALL, 0x80000001, 0, RTE_REG_EDX, 11)
 	FEAT_DEF(XD, 0x80000001, 0, RTE_REG_EDX, 20)
@@ -191,5 +192,7 @@ rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
 		intrinsics->power_pause = 1;
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RTM))
 			intrinsics->power_monitor_multi = 1;
+	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) {
+		intrinsics->amd_power_monitorx = 1;
 	}
 }
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
@ 2023-04-13 11:53 ` Sivaprasad Tummala
  2023-04-17  4:31   ` [PATCH v3 1/4] " Sivaprasad Tummala
  2023-04-13 11:53 ` [PATCH v2 3/3] " Sivaprasad Tummala
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-13 11:53 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in the
DPDK 23.07 release to support the monitorx instruction on EPYC processors.
This results in ABI breakage for legacy apps.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..65e849616d 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -163,3 +163,6 @@ Deprecation Notices
   The new port library API (functions rte_swx_port_*)
   will gradually transition from experimental to stable status
   starting with DPDK 23.07 release.
+
+* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
+  ``RTE_CPUFLAG_MONITORX`` to support the monitorx instruction on EPYC processors.
-- 
2.34.1



* [PATCH v2 3/3] power: amd power monitor support
  2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
@ 2023-04-13 11:53 ` Sivaprasad Tummala
  2023-04-17  4:34   ` [PATCH v3 4/4] " Sivaprasad Tummala
  2023-04-13 11:59 ` [PATCH v2 1/3] eal: add x86 cpuid support for monitorx David Marchand
  2023-04-17  4:32 ` [PATCH v3 2/4] " Sivaprasad Tummala
  3 siblings, 1 reply; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-13 11:53 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance-optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 83 ++++++++++++++++++++++++++++--
 lib/power/rte_power_pmd_mgmt.c     |  3 +-
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index f749da9b85..d688066b3a 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -30,6 +30,7 @@ __umwait_wakeup(volatile void *addr)
 
 static bool wait_supported;
 static bool wait_multi_supported;
+static bool amd_mwaitx_supported;
 
 static inline uint64_t
 __get_umwait_val(const volatile void *p, const uint8_t sz)
@@ -65,6 +66,76 @@ __check_val_size(const uint8_t sz)
 	}
 }
 
+/**
+ * This function uses MONITORX/MWAITX instructions and will enter C1 state.
+ * For more information about usage of these instructions, please refer to
+ * AMD64 Architecture Programmer’s Manual.
+ */
+static inline int
+amd_power_monitorx(const struct rte_power_monitor_cond *pmc,
+		const uint64_t tsc_timestamp)
+{
+	const unsigned int lcore_id = rte_lcore_id();
+	struct power_wait_status *s;
+	uint64_t cur_value;
+
+	RTE_SET_USED(tsc_timestamp);
+
+	/* prevent non-EAL thread from using this API */
+	if (lcore_id >= RTE_MAX_LCORE)
+		return -EINVAL;
+
+	if (pmc == NULL)
+		return -EINVAL;
+
+	if (__check_val_size(pmc->size) < 0)
+		return -EINVAL;
+
+	if (pmc->fn == NULL)
+		return -EINVAL;
+
+	s = &wait_status[lcore_id];
+
+	/* update sleep address */
+	rte_spinlock_lock(&s->lock);
+	s->monitor_addr = pmc->addr;
+
+	/*
+	 * we're using raw byte codes for now as only the newest compiler
+	 * versions support this instruction natively.
+	 */
+	/* set address for MONITORX */
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+			:
+			: "a"(pmc->addr),
+			"c"(0),  /* no extensions */
+			"d"(0)); /* no hints */
+
+	/* now that we've put this address into monitor, we can unlock */
+	rte_spinlock_unlock(&s->lock);
+
+	cur_value = __get_umwait_val(pmc->addr, pmc->size);
+
+	/* check if callback indicates we should abort */
+	if (pmc->fn(cur_value, pmc->opaque) != 0)
+		goto end;
+
+	/* execute MWAITX */
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+			: /* ignore rflags */
+			: "a"(0), /* enter C1 */
+			"c"(0), /* no time-out */
+			"b"(0));
+
+end:
+	/* erase sleep address */
+	rte_spinlock_lock(&s->lock);
+	s->monitor_addr = NULL;
+	rte_spinlock_unlock(&s->lock);
+
+	return 0;
+}
+
 /**
  * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
  * For more information about usage of these instructions, please refer to
@@ -81,8 +152,12 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
-		return -ENOTSUP;
+	if (!wait_supported) {
+		if (amd_mwaitx_supported)
+			return amd_power_monitorx(pmc, tsc_timestamp);
+		else
+			return -ENOTSUP;
+	}
 
 	/* prevent non-EAL thread from using this API */
 	if (lcore_id >= RTE_MAX_LCORE)
@@ -170,6 +245,8 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_supported = 1;
 	if (i.power_monitor_multi)
 		wait_multi_supported = 1;
+	if (i.amd_power_monitorx)
+		amd_mwaitx_supported = 1;
 }
 
 int
@@ -178,7 +255,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
 	struct power_wait_status *s;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!wait_supported && !amd_mwaitx_supported)
 		return -ENOTSUP;
 
 	/* prevent buffer overrun */
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index ca1840387c..048a41dc29 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -447,7 +447,8 @@ check_monitor(struct pmd_core_cfg *cfg, const union queue *qdata)
 	bool multimonitor_supported;
 
 	/* check if rte_power_monitor is supported */
-	if (!global_data.intrinsics_support.power_monitor) {
+	if ((!global_data.intrinsics_support.power_monitor) &&
+		(!global_data.intrinsics_support.amd_power_monitorx)) {
 		RTE_LOG(DEBUG, POWER, "Monitoring intrinsics are not supported\n");
 		return -ENOTSUP;
 	}
-- 
2.34.1



* Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
  2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
  2023-04-13 11:53 ` [PATCH v2 3/3] " Sivaprasad Tummala
@ 2023-04-13 11:59 ` David Marchand
  2023-04-13 17:50   ` Tummala, Sivaprasad
  2023-04-17  4:32 ` [PATCH v3 2/4] " Sivaprasad Tummala
  3 siblings, 1 reply; 51+ messages in thread
From: David Marchand @ 2023-04-13 11:59 UTC (permalink / raw)
  To: Sivaprasad Tummala; +Cc: david.hunt, dev, Thomas Monjalon, Burakov, Anatoly

On Thu, Apr 13, 2023 at 1:54 PM Sivaprasad Tummala
<sivaprasad.tummala@amd.com> wrote:
>
> Add a new CPUID flag to indicate support for monitorx instruction
> on AMD Epyc processors.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
>  lib/eal/include/generic/rte_cpuflags.h | 2 ++
>  lib/eal/x86/include/rte_cpuflags.h     | 1 +
>  lib/eal/x86/rte_cpuflags.c             | 3 +++
>  3 files changed, 6 insertions(+)
>
> diff --git a/lib/eal/include/generic/rte_cpuflags.h b/lib/eal/include/generic/rte_cpuflags.h
> index d35551e931..db653a8dd7 100644
> --- a/lib/eal/include/generic/rte_cpuflags.h
> +++ b/lib/eal/include/generic/rte_cpuflags.h
> @@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
>         /**< indicates support for rte_power_pause function */
>         uint32_t power_monitor_multi : 1;
>         /**< indicates support for rte_power_monitor_multi function */
> +       uint32_t amd_power_monitorx : 1;
> +       /**< indicates amd support for rte_power_monitor function */

I did not look at the patch in detail; I just stopped at this part.
What makes the AMD monitorx stuff so special that it needs to be exposed
in the generic API?


-- 
David Marchand



* RE: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
  2023-04-13 11:59 ` [PATCH v2 1/3] eal: add x86 cpuid support for monitorx David Marchand
@ 2023-04-13 17:50   ` Tummala, Sivaprasad
  2023-04-14  7:05     ` David Marchand
  0 siblings, 1 reply; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-04-13 17:50 UTC (permalink / raw)
  To: David Marchand; +Cc: david.hunt, dev, Thomas Monjalon, Burakov, Anatoly

[AMD Official Use Only - General]

Hi David, 

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, April 13, 2023 5:30 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; dev@dpdk.org; Thomas Monjalon
> <thomas@monjalon.net>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> On Thu, Apr 13, 2023 at 1:54 PM Sivaprasad Tummala
> <sivaprasad.tummala@amd.com> wrote:
> >
> > Add a new CPUID flag to indicate support for monitorx instruction on
> > AMD Epyc processors.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> >  lib/eal/include/generic/rte_cpuflags.h | 2 ++
> >  lib/eal/x86/include/rte_cpuflags.h     | 1 +
> >  lib/eal/x86/rte_cpuflags.c             | 3 +++
> >  3 files changed, 6 insertions(+)
> >
> > diff --git a/lib/eal/include/generic/rte_cpuflags.h
> > b/lib/eal/include/generic/rte_cpuflags.h
> > index d35551e931..db653a8dd7 100644
> > --- a/lib/eal/include/generic/rte_cpuflags.h
> > +++ b/lib/eal/include/generic/rte_cpuflags.h
> > @@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
> >         /**< indicates support for rte_power_pause function */
> >         uint32_t power_monitor_multi : 1;
> >         /**< indicates support for rte_power_monitor_multi function */
> > +       uint32_t amd_power_monitorx : 1;
> > +       /**< indicates amd support for rte_power_monitor function */
> 
> I did not look at the patch detail, I just stopped at this part.
> What makes the AMD monitorx stuff special that it needs to be exposed in the
> generic API?

Monitorx is a different ISA/opcode (0F 01 FA) compared to UMonitor (0F 01 C8). This needs to be distinguished
on specific x86 platforms. Hence, in the current power intrinsics for x86, we require a new flag to
distinguish MonitorX and UMonitor and invoke the appropriate x86 ISA in the datapath.

Thanks & Regards,
Sivaprasad


* Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
  2023-04-13 17:50   ` Tummala, Sivaprasad
@ 2023-04-14  7:05     ` David Marchand
  2023-04-14  8:51       ` Tummala, Sivaprasad
  2023-04-14 11:48       ` Ferruh Yigit
  0 siblings, 2 replies; 51+ messages in thread
From: David Marchand @ 2023-04-14  7:05 UTC (permalink / raw)
  To: Tummala, Sivaprasad; +Cc: david.hunt, dev, Thomas Monjalon, Burakov, Anatoly

On Thu, Apr 13, 2023 at 7:51 PM Tummala, Sivaprasad
<Sivaprasad.Tummala@amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> Hi David,
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Thursday, April 13, 2023 5:30 PM
> > To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> > Cc: david.hunt@intel.com; dev@dpdk.org; Thomas Monjalon
> > <thomas@monjalon.net>; Burakov, Anatoly <anatoly.burakov@intel.com>
> > Subject: Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
> >
> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > On Thu, Apr 13, 2023 at 1:54 PM Sivaprasad Tummala
> > <sivaprasad.tummala@amd.com> wrote:
> > >
> > > Add a new CPUID flag to indicate support for monitorx instruction on
> > > AMD Epyc processors.
> > >
> > > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > > ---
> > >  lib/eal/include/generic/rte_cpuflags.h | 2 ++
> > >  lib/eal/x86/include/rte_cpuflags.h     | 1 +
> > >  lib/eal/x86/rte_cpuflags.c             | 3 +++
> > >  3 files changed, 6 insertions(+)
> > >
> > > diff --git a/lib/eal/include/generic/rte_cpuflags.h
> > > b/lib/eal/include/generic/rte_cpuflags.h
> > > index d35551e931..db653a8dd7 100644
> > > --- a/lib/eal/include/generic/rte_cpuflags.h
> > > +++ b/lib/eal/include/generic/rte_cpuflags.h
> > > @@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
> > >         /**< indicates support for rte_power_pause function */
> > >         uint32_t power_monitor_multi : 1;
> > >         /**< indicates support for rte_power_monitor_multi function */
> > > +       uint32_t amd_power_monitorx : 1;
> > > +       /**< indicates amd support for rte_power_monitor function */
> >
> > I did not look at the patch detail, I just stopped at this part.
> > What makes the AMD monitorx stuff special that it needs to be exposed in the
> > generic API?
>
> Monitorx is different ISA /opcode (0F 01 FA) as compared to UMonitor (0F 01 C8). This need to be distinguished
> on specific x86 platforms. Hence in the current power intrinsics, for x86 we require a new flag to
> distinguish MonitorX and UMonitor and invoke the appropriate x86 ISA in the datapath.

Requiring a new x86 cpuflag to identify this ISA presence is ok.


But here, I am talking about the generic power intrinsic API.
Let me phrase my comment differently...

As described in the API:
        uint32_t power_monitor : 1;
        /**< indicates support for rte_power_monitor function */

Does the AMD thing behave completely differently from the x86 one?
Looking at patch 3, I understand this is not the case.

So we don't need a "amd" flag in the generic flags.
The indirection for calling the right ISA should be hidden in
rte_power_* helpers implemented for x86.


-- 
David Marchand



* RE: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
  2023-04-14  7:05     ` David Marchand
@ 2023-04-14  8:51       ` Tummala, Sivaprasad
  2023-04-14 11:48       ` Ferruh Yigit
  1 sibling, 0 replies; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-04-14  8:51 UTC (permalink / raw)
  To: David Marchand; +Cc: david.hunt, dev, Thomas Monjalon, Burakov, Anatoly

[AMD Official Use Only - General]



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, April 14, 2023 12:36 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; dev@dpdk.org; Thomas Monjalon
> <thomas@monjalon.net>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> On Thu, Apr 13, 2023 at 7:51 PM Tummala, Sivaprasad
> <Sivaprasad.Tummala@amd.com> wrote:
> >
> > [AMD Official Use Only - General]
> >
> > Hi David,
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Thursday, April 13, 2023 5:30 PM
> > > To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> > > Cc: david.hunt@intel.com; dev@dpdk.org; Thomas Monjalon
> > > <thomas@monjalon.net>; Burakov, Anatoly <anatoly.burakov@intel.com>
> > > Subject: Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
> > >
> > > Caution: This message originated from an External Source. Use proper
> > > caution when opening attachments, clicking links, or responding.
> > >
> > >
> > > On Thu, Apr 13, 2023 at 1:54 PM Sivaprasad Tummala
> > > <sivaprasad.tummala@amd.com> wrote:
> > > >
> > > > Add a new CPUID flag to indicate support for monitorx instruction
> > > > on AMD Epyc processors.
> > > >
> > > > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > > > ---
> > > >  lib/eal/include/generic/rte_cpuflags.h | 2 ++
> > > >  lib/eal/x86/include/rte_cpuflags.h     | 1 +
> > > >  lib/eal/x86/rte_cpuflags.c             | 3 +++
> > > >  3 files changed, 6 insertions(+)
> > > >
> > > > diff --git a/lib/eal/include/generic/rte_cpuflags.h
> > > > b/lib/eal/include/generic/rte_cpuflags.h
> > > > index d35551e931..db653a8dd7 100644
> > > > --- a/lib/eal/include/generic/rte_cpuflags.h
> > > > +++ b/lib/eal/include/generic/rte_cpuflags.h
> > > > @@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
> > > >         /**< indicates support for rte_power_pause function */
> > > >         uint32_t power_monitor_multi : 1;
> > > >         /**< indicates support for rte_power_monitor_multi
> > > > function */
> > > > +       uint32_t amd_power_monitorx : 1;
> > > > +       /**< indicates amd support for rte_power_monitor function
> > > > + */
> > >
> > > I did not look at the patch detail, I just stopped at this part.
> > > What makes the AMD monitorx stuff special that it needs to be
> > > exposed in the generic API?
> >
> > Monitorx is different ISA /opcode (0F 01 FA) as compared to UMonitor
> > (0F 01 C8). This need to be distinguished on specific x86 platforms.
> > Hence in the current power intrinsics, for x86 we require a new flag to distinguish
> MonitorX and UMonitor and invoke the appropriate x86 ISA in the datapath.
> 
> Requiring a new x86 cpuflag to identify this ISA presence is ok.
> 
> 
> But here, I am talking about the generic power instrinsic API.
> Let me phrase my comment differently...
> 
> As described in the API:
>         uint32_t power_monitor : 1;
>         /**< indicates support for rte_power_monitor function */
> 
> Does AMD thing behave completely different from the x86?
> Looking at patch 3, I understand this is not the case.
> 
> So we don't need a "amd" flag in the generic flags.
> The indirection for calling the right ISA should be hidden in
> rte_power_* helpers implemented for x86.
> 

Thanks for your feedback. I will fix this and send an updated patch. 


* Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
  2023-04-14  7:05     ` David Marchand
  2023-04-14  8:51       ` Tummala, Sivaprasad
@ 2023-04-14 11:48       ` Ferruh Yigit
  1 sibling, 0 replies; 51+ messages in thread
From: Ferruh Yigit @ 2023-04-14 11:48 UTC (permalink / raw)
  To: David Marchand, Tummala, Sivaprasad, david.hunt
  Cc: dev, Thomas Monjalon, Burakov, Anatoly

On 4/14/2023 8:05 AM, David Marchand wrote:
> On Thu, Apr 13, 2023 at 7:51 PM Tummala, Sivaprasad
> <Sivaprasad.Tummala@amd.com> wrote:
>>
>> [AMD Official Use Only - General]
>>
>> Hi David,
>>
>>> -----Original Message-----
>>> From: David Marchand <david.marchand@redhat.com>
>>> Sent: Thursday, April 13, 2023 5:30 PM
>>> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
>>> Cc: david.hunt@intel.com; dev@dpdk.org; Thomas Monjalon
>>> <thomas@monjalon.net>; Burakov, Anatoly <anatoly.burakov@intel.com>
>>> Subject: Re: [PATCH v2 1/3] eal: add x86 cpuid support for monitorx
>>>
>>> Caution: This message originated from an External Source. Use proper caution
>>> when opening attachments, clicking links, or responding.
>>>
>>>
>>> On Thu, Apr 13, 2023 at 1:54 PM Sivaprasad Tummala
>>> <sivaprasad.tummala@amd.com> wrote:
>>>>
>>>> Add a new CPUID flag to indicate support for monitorx instruction on
>>>> AMD Epyc processors.
>>>>
>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>> ---
>>>>  lib/eal/include/generic/rte_cpuflags.h | 2 ++
>>>>  lib/eal/x86/include/rte_cpuflags.h     | 1 +
>>>>  lib/eal/x86/rte_cpuflags.c             | 3 +++
>>>>  3 files changed, 6 insertions(+)
>>>>
>>>> diff --git a/lib/eal/include/generic/rte_cpuflags.h
>>>> b/lib/eal/include/generic/rte_cpuflags.h
>>>> index d35551e931..db653a8dd7 100644
>>>> --- a/lib/eal/include/generic/rte_cpuflags.h
>>>> +++ b/lib/eal/include/generic/rte_cpuflags.h
>>>> @@ -26,6 +26,8 @@ struct rte_cpu_intrinsics {
>>>>         /**< indicates support for rte_power_pause function */
>>>>         uint32_t power_monitor_multi : 1;
>>>>         /**< indicates support for rte_power_monitor_multi function */
>>>> +       uint32_t amd_power_monitorx : 1;
>>>> +       /**< indicates amd support for rte_power_monitor function */
>>>
>>> I did not look at the patch detail, I just stopped at this part.
>>> What makes the AMD monitorx stuff special that it needs to be exposed in the
>>> generic API?
>>
>> Monitorx is different ISA /opcode (0F 01 FA) as compared to UMonitor (0F 01 C8). This need to be distinguished
>> on specific x86 platforms. Hence in the current power intrinsics, for x86 we require a new flag to
>> distinguish MonitorX and UMonitor and invoke the appropriate x86 ISA in the datapath.
> 
> Requiring a new x86 cpuflag to identify this ISA presence is ok.
> 
> 
> But here, I am talking about the generic power instrinsic API.
> Let me phrase my comment differently...
> 
> As described in the API:
>         uint32_t power_monitor : 1;
>         /**< indicates support for rte_power_monitor function */
> 
> Does AMD thing behave completely different from the x86?
> Looking at patch 3, I understand this is not the case.
> 
> So we don't need a "amd" flag in the generic flags.
> The indirection for calling the right ISA should be hidden in
> rte_power_* helpers implemented for x86.
> 
> 


The 'rte_cpu_get_intrinsics_support()' API and the "struct
rte_cpu_intrinsics" struct seem intended to report power features in a
generic way, so I agree to keep it generic.

But there is also a need to run architecture-specific instructions, so
the power library needs to know the architecture. For this, what do you
think about checking 'MONITORX' support again in the
'rte_power_intrinsics_init()' function?


Also, most of the 'amd_power_monitorx()' function is a duplicate of the
'rte_power_monitor()' API; the only difference is the asm calls. What do
you think about extracting these calls into function pointers for AMD and
Intel, so that 'rte_power_monitor()' can become an x86-generic function?

As the architecture will be known in 'rte_power_intrinsics_init()', we
can set the function pointers appropriately for the architecture at this
init stage.

The only concern is the possible performance impact of an indirect call
through a pointer instead of a direct call; I hope @David Hunt can help
us test the performance impact on Intel platforms if this approach is
reasonable.



* [PATCH v3 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
@ 2023-04-17  4:31   ` Sivaprasad Tummala
  2023-04-18  8:14     ` [PATCH v4 0/4] power: monitor support for AMD EPYC processors Sivaprasad Tummala
  2023-04-18  8:25     ` Sivaprasad Tummala
  0 siblings, 2 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-17  4:31 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in
DPDK 23.07 release to support monitorx instruction on EPYC processors.
This results in ABI breakage for legacy apps.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..831713983f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -163,3 +163,6 @@ Deprecation Notices
   The new port library API (functions rte_swx_port_*)
   will gradually transition from experimental to stable status
   starting with DPDK 23.07 release.
+
+* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
+  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.
-- 
2.34.1



* [PATCH v3 2/4] eal: add x86 cpuid support for monitorx
  2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
                   ` (2 preceding siblings ...)
  2023-04-13 11:59 ` [PATCH v2 1/3] eal: add x86 cpuid support for monitorx David Marchand
@ 2023-04-17  4:32 ` Sivaprasad Tummala
  3 siblings, 0 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-17  4:32 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

Add a new CPUID flag to indicate support for monitorx instruction
on AMD EPYC processors.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/include/rte_cpuflags.h | 1 +
 lib/eal/x86/rte_cpuflags.c         | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/lib/eal/x86/include/rte_cpuflags.h b/lib/eal/x86/include/rte_cpuflags.h
index 92e90fb6e0..d3e608533a 100644
--- a/lib/eal/x86/include/rte_cpuflags.h
+++ b/lib/eal/x86/include/rte_cpuflags.h
@@ -133,6 +133,7 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_AVX512VP2INTERSECT,     /**< AVX512 Two Register Intersection */
 
 	RTE_CPUFLAG_WAITPKG,                /**< UMONITOR/UMWAIT/TPAUSE */
+	RTE_CPUFLAG_MONITORX,               /**< MONITORX */
 
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
index d6b518251b..22f3bed412 100644
--- a/lib/eal/x86/rte_cpuflags.c
+++ b/lib/eal/x86/rte_cpuflags.c
@@ -133,6 +133,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
 
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX,  0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX,  4)
+	FEAT_DEF(MONITORX, 0x80000001, 0, RTE_REG_ECX,  29)
 
 	FEAT_DEF(SYSCALL, 0x80000001, 0, RTE_REG_EDX, 11)
 	FEAT_DEF(XD, 0x80000001, 0, RTE_REG_EDX, 20)
@@ -191,5 +192,7 @@ rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
 		intrinsics->power_pause = 1;
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RTM))
 			intrinsics->power_monitor_multi = 1;
+	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) {
+		intrinsics->power_monitor = 1;
 	}
 }
-- 
2.34.1



* [PATCH v3 4/4] power: amd power monitor support
  2023-04-13 11:53 ` [PATCH v2 3/3] " Sivaprasad Tummala
@ 2023-04-17  4:34   ` Sivaprasad Tummala
  0 siblings, 0 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-17  4:34 UTC (permalink / raw)
  To: david.hunt; +Cc: dev

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance-optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
 1 file changed, 66 insertions(+), 11 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 6eb9e50807..27055bab52 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -17,6 +17,60 @@ static struct power_wait_status {
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
 } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
 
+/**
+ * These functions use UMONITOR/UMWAIT instructions and will enter C0.2 state.
+ * For more information about usage of these instructions, please refer to
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
+ */
+static void intel_umonitor(volatile void *addr)
+{
+	/* UMONITOR */
+	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
+			:
+			: "D"(addr));
+}
+
+static void intel_umwait(const uint64_t timeout)
+{
+	const uint32_t tsc_l = (uint32_t)timeout;
+	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
+	/* UMWAIT */
+	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
+			: /* ignore rflags */
+			: "D"(0), /* enter C0.2 */
+			"a"(tsc_l), "d"(tsc_h));
+}
+
+/**
+ * These functions use MONITORX/MWAITX instructions and will enter C1 state.
+ * For more information about usage of these instructions, please refer to
+ * AMD64 Architecture Programmer’s Manual.
+ */
+static void amd_monitorx(volatile void *addr)
+{
+	/* MONITORX */
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+			:
+			: "a"(addr),
+			"c"(0),  /* no extensions */
+			"d"(0)); /* no hints */
+}
+
+static void amd_mwaitx(const uint64_t timeout)
+{
+	/* MWAITX */
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+			: /* ignore rflags */
+			: "a"(0), /* enter C1 */
+			"c"(2), /* enable timer */
+			"b"(timeout));
+}
+
+static struct {
+	void (*mmonitor)(volatile void *addr);
+	void (*mwait)(const uint64_t timeout);
+} __rte_cache_aligned power_monitor_ops;
+
 static inline void
 __umwait_wakeup(volatile void *addr)
 {
@@ -75,8 +129,6 @@ int
 rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 		const uint64_t tsc_timestamp)
 {
-	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
-	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
 	uint64_t cur_value;
@@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	 * versions support this instruction natively.
 	 */
 
-	/* set address for UMONITOR */
-	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
-			:
-			: "D"(pmc->addr));
+	/* set address for mmonitor */
+	power_monitor_ops.mmonitor(pmc->addr);
 
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
@@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn(cur_value, pmc->opaque) != 0)
 		goto end;
 
-	/* execute UMWAIT */
-	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-			: /* ignore rflags */
-			: "D"(0), /* enter C0.2 */
-			  "a"(tsc_l), "d"(tsc_h));
+	/* execute mwait */
+	power_monitor_ops.mwait(tsc_timestamp);
 
 end:
 	/* erase sleep address */
@@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_multi_supported = 1;
 	if (i.power_monitor)
 		monitor_supported = 1;
+
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
+		power_monitor_ops.mmonitor = &amd_monitorx;
+		power_monitor_ops.mwait = &amd_mwaitx;
+	} else { /* Intel */
+		power_monitor_ops.mmonitor = &intel_umonitor;
+		power_monitor_ops.mwait = &intel_umwait;
+	}
 }
 
 int
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 0/4] power: monitor support for AMD EPYC processors
  2023-04-17  4:31   ` [PATCH v3 1/4] " Sivaprasad Tummala
@ 2023-04-18  8:14     ` Sivaprasad Tummala
  2023-04-18  8:25     ` Sivaprasad Tummala
  1 sibling, 0 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:14 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Sivaprasad Tummala (4):
  doc: announce new cpu flag added to rte_cpu_flag_t
  eal: add x86 cpuid support for monitorx
  eal: removed unnecessary checks in x86 power monitor APIs
  power: amd power monitor support

 doc/guides/rel_notes/deprecation.rst |  3 +
 lib/eal/x86/include/rte_cpuflags.h   |  1 +
 lib/eal/x86/rte_cpuflags.c           |  3 +
 lib/eal/x86/rte_power_intrinsics.c   | 84 +++++++++++++++++++++++-----
 4 files changed, 78 insertions(+), 13 deletions(-)

-- 
2.34.1
Cc: dev@dpdk.org
Cc: david.marchand@redhat.com
Cc: ferruh.yigit@amd.com

v3:
 * patch rework based on review comments

v4: 
 * no code changes

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 0/4] power: monitor support for AMD EPYC processors
  2023-04-17  4:31   ` [PATCH v3 1/4] " Sivaprasad Tummala
  2023-04-18  8:14     ` [PATCH v4 0/4] power: monitor support for AMD EPYC processors Sivaprasad Tummala
@ 2023-04-18  8:25     ` Sivaprasad Tummala
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
                         ` (3 more replies)
  1 sibling, 4 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:25 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Sivaprasad Tummala (4):
  doc: announce new cpu flag added to rte_cpu_flag_t
  eal: add x86 cpuid support for monitorx
  eal: removed unnecessary checks in x86 power monitor APIs
  power: amd power monitor support

 doc/guides/rel_notes/deprecation.rst |  3 +
 lib/eal/x86/include/rte_cpuflags.h   |  1 +
 lib/eal/x86/rte_cpuflags.c           |  3 +
 lib/eal/x86/rte_power_intrinsics.c   | 84 +++++++++++++++++++++++-----
 4 files changed, 78 insertions(+), 13 deletions(-)

-- 
2.34.1
Cc: dev@dpdk.org
Cc: david.marchand@redhat.com
Cc: ferruh.yigit@amd.com

v3:
 * patch rework based on review comments

v4: 
 * no code changes; resent only for patchwork

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-18  8:25     ` Sivaprasad Tummala
@ 2023-04-18  8:25       ` Sivaprasad Tummala
  2023-04-18  8:52         ` Ferruh Yigit
                           ` (2 more replies)
  2023-04-18  8:25       ` [PATCH v4 2/4] " Sivaprasad Tummala
                         ` (2 subsequent siblings)
  3 siblings, 3 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:25 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in the
DPDK 23.07 release to support the monitorx instruction on EPYC processors.
This results in ABI breakage for legacy apps.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..831713983f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -163,3 +163,6 @@ Deprecation Notices
   The new port library API (functions rte_swx_port_*)
   will gradually transition from experimental to stable status
   starting with DPDK 23.07 release.
+
+* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
+  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 2/4] eal: add x86 cpuid support for monitorx
  2023-04-18  8:25     ` Sivaprasad Tummala
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
@ 2023-04-18  8:25       ` Sivaprasad Tummala
  2023-06-14 13:15         ` Burakov, Anatoly
  2023-04-18  8:25       ` [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
  2023-04-18  8:25       ` [PATCH v4 4/4] power: amd power monitor support Sivaprasad Tummala
  3 siblings, 1 reply; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:25 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

Add a new CPUID flag to indicate support for monitorx instruction
on AMD EPYC processors.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/include/rte_cpuflags.h | 1 +
 lib/eal/x86/rte_cpuflags.c         | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/lib/eal/x86/include/rte_cpuflags.h b/lib/eal/x86/include/rte_cpuflags.h
index 92e90fb6e0..d3e608533a 100644
--- a/lib/eal/x86/include/rte_cpuflags.h
+++ b/lib/eal/x86/include/rte_cpuflags.h
@@ -133,6 +133,7 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_AVX512VP2INTERSECT,     /**< AVX512 Two Register Intersection */
 
 	RTE_CPUFLAG_WAITPKG,                /**< UMONITOR/UMWAIT/TPAUSE */
+	RTE_CPUFLAG_MONITORX,               /**< MONITORX */
 
 	/* The last item */
 	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
index d6b518251b..22f3bed412 100644
--- a/lib/eal/x86/rte_cpuflags.c
+++ b/lib/eal/x86/rte_cpuflags.c
@@ -133,6 +133,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
 
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX,  0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX,  4)
+	FEAT_DEF(MONITORX, 0x80000001, 0, RTE_REG_ECX,  29)
 
 	FEAT_DEF(SYSCALL, 0x80000001, 0, RTE_REG_EDX, 11)
 	FEAT_DEF(XD, 0x80000001, 0, RTE_REG_EDX, 20)
@@ -191,5 +192,7 @@ rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
 		intrinsics->power_pause = 1;
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RTM))
 			intrinsics->power_monitor_multi = 1;
+	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) {
+		intrinsics->power_monitor = 1;
 	}
 }
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs
  2023-04-18  8:25     ` Sivaprasad Tummala
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
  2023-04-18  8:25       ` [PATCH v4 2/4] " Sivaprasad Tummala
@ 2023-04-18  8:25       ` Sivaprasad Tummala
  2023-06-14 13:14         ` Burakov, Anatoly
  2023-04-18  8:25       ` [PATCH v4 4/4] power: amd power monitor support Sivaprasad Tummala
  3 siblings, 1 reply; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:25 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

The current x86 power monitor implementation fails on platforms
where only monitor is supported, but not power_pause.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index f749da9b85..6eb9e50807 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -30,6 +30,7 @@ __umwait_wakeup(volatile void *addr)
 
 static bool wait_supported;
 static bool wait_multi_supported;
+static bool monitor_supported;
 
 static inline uint64_t
 __get_umwait_val(const volatile void *p, const uint8_t sz)
@@ -81,7 +82,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent non-EAL thread from using this API */
@@ -170,6 +171,8 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_supported = 1;
 	if (i.power_monitor_multi)
 		wait_multi_supported = 1;
+	if (i.power_monitor)
+		monitor_supported = 1;
 }
 
 int
@@ -178,7 +181,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
 	struct power_wait_status *s;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent buffer overrun */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 4/4] power: amd power monitor support
  2023-04-18  8:25     ` Sivaprasad Tummala
                         ` (2 preceding siblings ...)
  2023-04-18  8:25       ` [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
@ 2023-04-18  8:25       ` Sivaprasad Tummala
  2023-06-14 13:14         ` Burakov, Anatoly
  3 siblings, 1 reply; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-04-18  8:25 UTC (permalink / raw)
  To: david.hunt; +Cc: dev, david.marchand, ferruh.yigit

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
 1 file changed, 66 insertions(+), 11 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 6eb9e50807..27055bab52 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -17,6 +17,60 @@ static struct power_wait_status {
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
 } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
 
+/**
+ * These functions use UMONITOR/UMWAIT instructions and will enter the C0.2 state.
+ * For more information about usage of these instructions, please refer to
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
+ */
+static void intel_umonitor(volatile void *addr)
+{
+	/* UMONITOR */
+	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
+			:
+			: "D"(addr));
+}
+
+static void intel_umwait(const uint64_t timeout)
+{
+	const uint32_t tsc_l = (uint32_t)timeout;
+	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
+	/* UMWAIT */
+	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
+			: /* ignore rflags */
+			: "D"(0), /* enter C0.2 */
+			"a"(tsc_l), "d"(tsc_h));
+}
+
+/**
+ * These functions use MONITORX/MWAITX instructions and will enter the C1 state.
+ * For more information about usage of these instructions, please refer to
+ * AMD64 Architecture Programmer’s Manual.
+ */
+static void amd_monitorx(volatile void *addr)
+{
+	/* MONITORX */
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+			:
+			: "a"(addr),
+			"c"(0),  /* no extensions */
+			"d"(0)); /* no hints */
+}
+
+static void amd_mwaitx(const uint64_t timeout)
+{
+	/* MWAITX */
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+			: /* ignore rflags */
+			: "a"(0), /* enter C1 */
+			"c"(2), /* enable timer */
+			"b"(timeout));
+}
+
+static struct {
+	void (*mmonitor)(volatile void *addr);
+	void (*mwait)(const uint64_t timeout);
+} __rte_cache_aligned power_monitor_ops;
+
 static inline void
 __umwait_wakeup(volatile void *addr)
 {
@@ -75,8 +129,6 @@ int
 rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 		const uint64_t tsc_timestamp)
 {
-	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
-	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
 	uint64_t cur_value;
@@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	 * versions support this instruction natively.
 	 */
 
-	/* set address for UMONITOR */
-	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
-			:
-			: "D"(pmc->addr));
+	/* set address for mmonitor */
+	power_monitor_ops.mmonitor(pmc->addr);
 
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
@@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn(cur_value, pmc->opaque) != 0)
 		goto end;
 
-	/* execute UMWAIT */
-	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-			: /* ignore rflags */
-			: "D"(0), /* enter C0.2 */
-			  "a"(tsc_l), "d"(tsc_h));
+	/* execute mwait */
+	power_monitor_ops.mwait(tsc_timestamp);
 
 end:
 	/* erase sleep address */
@@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_multi_supported = 1;
 	if (i.power_monitor)
 		monitor_supported = 1;
+
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
+		power_monitor_ops.mmonitor = &amd_monitorx;
+		power_monitor_ops.mwait = &amd_mwaitx;
+	} else { /* Intel */
+		power_monitor_ops.mmonitor = &intel_umonitor;
+		power_monitor_ops.mwait = &intel_umwait;
+	}
 }
 
 int
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
@ 2023-04-18  8:52         ` Ferruh Yigit
  2023-04-18  9:22           ` Bruce Richardson
  2023-07-05 11:32         ` Konstantin Ananyev
  2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2 siblings, 1 reply; 51+ messages in thread
From: Ferruh Yigit @ 2023-04-18  8:52 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt
  Cc: dev, david.marchand, Bruce Richardson, Thomas Monjalon

On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in
> DPDK 23.07 release to support monitorx instruction on EPYC processors.
> This results in ABI breakage for legacy apps.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index dcc1ca1696..831713983f 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -163,3 +163,6 @@ Deprecation Notices
>    The new port library API (functions rte_swx_port_*)
>    will gradually transition from experimental to stable status
>    starting with DPDK 23.07 release.
> +
> +* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
> +  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.


OK to add new CPU flag,
Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>


But @David, @Bruce, is it OK to break ABI whenever a new CPU flag is
added, should we hide CPU flags better?

Or other option can be drop the 'RTE_CPUFLAG_NUMFLAGS' and allow
appending new flags to the end although this may lead enum become more
messy by time.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-18  8:52         ` Ferruh Yigit
@ 2023-04-18  9:22           ` Bruce Richardson
  2023-06-01  9:23             ` David Marchand
  0 siblings, 1 reply; 51+ messages in thread
From: Bruce Richardson @ 2023-04-18  9:22 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Sivaprasad Tummala, david.hunt, dev, david.marchand, Thomas Monjalon

On Tue, Apr 18, 2023 at 09:52:49AM +0100, Ferruh Yigit wrote:
> On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> > A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in
> > DPDK 23.07 release to support monitorx instruction on EPYC processors.
> > This results in ABI breakage for legacy apps.
> > 
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..831713983f 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -163,3 +163,6 @@ Deprecation Notices
> >    The new port library API (functions rte_swx_port_*)
> >    will gradually transition from experimental to stable status
> >    starting with DPDK 23.07 release.
> > +
> > +* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
> > +  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.
> 
> 
> OK to add new CPU flag,
> Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
> 
> 
> But @David, @Bruce, is it OK to break ABI whenever a new CPU flag is
> added, should we hide CPU flags better?
> 
> Or other option can be drop the 'RTE_CPUFLAG_NUMFLAGS' and allow
> appending new flags to the end although this may lead enum become more
> messy by time.

+1 to drop the NUMFLAGS value. We should not break ABI each time we need a
new flag.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-18  9:22           ` Bruce Richardson
@ 2023-06-01  9:23             ` David Marchand
  0 siblings, 0 replies; 51+ messages in thread
From: David Marchand @ 2023-06-01  9:23 UTC (permalink / raw)
  To: Bruce Richardson, Ferruh Yigit, Sivaprasad Tummala
  Cc: david.hunt, dev, Thomas Monjalon

On Tue, Apr 18, 2023 at 11:22 AM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Tue, Apr 18, 2023 at 09:52:49AM +0100, Ferruh Yigit wrote:
> > On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> > > A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in
> > > DPDK 23.07 release to support monitorx instruction on EPYC processors.
> > > This results in ABI breakage for legacy apps.
> > >
> > > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > > index dcc1ca1696..831713983f 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -163,3 +163,6 @@ Deprecation Notices
> > >    The new port library API (functions rte_swx_port_*)
> > >    will gradually transition from experimental to stable status
> > >    starting with DPDK 23.07 release.
> > > +
> > > +* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
> > > +  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.

There is no need for announcing an addition.
The problem is moving/removing other elements of an enum.


> >
> >
> > OK to add new CPU flag,
> > Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
> >
> >
> > But @David, @Bruce, is it OK to break ABI whenever a new CPU flag is
> > added, should we hide CPU flags better?
> >
> > Or other option can be drop the 'RTE_CPUFLAG_NUMFLAGS' and allow
> > appending new flags to the end although this may lead enum become more
> > messy by time.
>
> +1 top drop the NUMFLAGS value. We should not break ABI each time we need a
> new flag.

+1.
So in 23.07 we need an announce for this removal to happen in 23.11.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs
  2023-04-18  8:25       ` [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
@ 2023-06-14 13:14         ` Burakov, Anatoly
  0 siblings, 0 replies; 51+ messages in thread
From: Burakov, Anatoly @ 2023-06-14 13:14 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt; +Cc: dev, david.marchand, ferruh.yigit

On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> current x86 power monitor implementation fails on platforms
> with only monitor supported and not power_pause.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 4/4] power: amd power monitor support
  2023-04-18  8:25       ` [PATCH v4 4/4] power: amd power monitor support Sivaprasad Tummala
@ 2023-06-14 13:14         ` Burakov, Anatoly
  0 siblings, 0 replies; 51+ messages in thread
From: Burakov, Anatoly @ 2023-06-14 13:14 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt; +Cc: dev, david.marchand, ferruh.yigit

On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> mwaitx allows EPYC processors to enter a implementation dependent
> power/performance optimized state (C1 state) for a specific period
> or until a store to the monitored address range.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---

Bar one fix below,

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

>   lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
>   1 file changed, 66 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> index 6eb9e50807..27055bab52 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -17,6 +17,60 @@ static struct power_wait_status {
>   	volatile void *monitor_addr; /**< NULL if not currently sleeping */
>   } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>   
> +/**
> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static void intel_umonitor(volatile void *addr)
> +{
> +	/* UMONITOR */
> +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +			:
> +			: "D"(addr));
> +}
> +
> +static void intel_umwait(const uint64_t timeout)
> +{
> +	const uint32_t tsc_l = (uint32_t)timeout;
> +	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> +	/* UMWAIT */
> +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			"a"(tsc_l), "d"(tsc_h));
> +}
> +
> +/**
> + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
> + * For more information about usage of these instructions, please refer to
> + * AMD64 Architecture Programmer’s Manual.

The quote sign is wrong :)

-- 
Thanks,
Anatoly


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 2/4] eal: add x86 cpuid support for monitorx
  2023-04-18  8:25       ` [PATCH v4 2/4] " Sivaprasad Tummala
@ 2023-06-14 13:15         ` Burakov, Anatoly
  0 siblings, 0 replies; 51+ messages in thread
From: Burakov, Anatoly @ 2023-06-14 13:15 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt; +Cc: dev, david.marchand, ferruh.yigit

On 4/18/2023 9:25 AM, Sivaprasad Tummala wrote:
> Add a new CPUID flag to indicate support for monitorx instruction
> on AMD EPYC processors.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

-- 
Thanks,
Anatoly


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
  2023-04-18  8:52         ` Ferruh Yigit
@ 2023-07-05 11:32         ` Konstantin Ananyev
  2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2 siblings, 0 replies; 51+ messages in thread
From: Konstantin Ananyev @ 2023-07-05 11:32 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.hunt; +Cc: dev, david.marchand, ferruh.yigit

18/04/2023 09:25, Sivaprasad Tummala wrote:
> A new flag RTE_CPUFLAG_MONITORX is added to rte_cpu_flag_t in
> DPDK 23.07 release to support monitorx instruction on EPYC processors.
> This results in ABI breakage for legacy apps.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
>   doc/guides/rel_notes/deprecation.rst | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index dcc1ca1696..831713983f 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -163,3 +163,6 @@ Deprecation Notices
>     The new port library API (functions rte_swx_port_*)
>     will gradually transition from experimental to stable status
>     starting with DPDK 23.07 release.
> +
> +* eal/x86: The enum ``rte_cpu_flag_t`` will be extended with a new cpu flag
> +  ``RTE_CPUFLAG_MONITORX`` to support monitorx instruction on EPYC processors.

Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v5 1/3] eal: add x86 cpuid support for monitorx
  2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
  2023-04-18  8:52         ` Ferruh Yigit
  2023-07-05 11:32         ` Konstantin Ananyev
@ 2023-08-16 18:59         ` Sivaprasad Tummala
  2023-08-16 18:59           ` [PATCH v5 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
  2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
  2 siblings, 2 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-08-16 18:59 UTC (permalink / raw)
  To: david.hunt, anatoly.burakov, ferruh.yigit, david.marchand, thomas; +Cc: dev

Add a new CPUID flag to indicate support for monitorx instruction
on AMD EPYC processors.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
Depends-on: series-29173 ("test/cpuflags: removed test for NUMFLAGS")
---
 lib/eal/x86/include/rte_cpuflags.h | 2 +-
 lib/eal/x86/rte_cpuflags.c         | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/eal/x86/include/rte_cpuflags.h b/lib/eal/x86/include/rte_cpuflags.h
index 7fc6117243..402281f2b1 100644
--- a/lib/eal/x86/include/rte_cpuflags.h
+++ b/lib/eal/x86/include/rte_cpuflags.h
@@ -133,7 +133,7 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_AVX512VP2INTERSECT,     /**< AVX512 Two Register Intersection */
 
 	RTE_CPUFLAG_WAITPKG,                /**< UMONITOR/UMWAIT/TPAUSE */
-
+	RTE_CPUFLAG_MONITORX,               /**< MONITORX */
 	/* The last item */
 };
 
diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
index 22061cb6d3..063dc7a624 100644
--- a/lib/eal/x86/rte_cpuflags.c
+++ b/lib/eal/x86/rte_cpuflags.c
@@ -133,6 +133,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
 
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX,  0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX,  4)
+	FEAT_DEF(MONITORX, 0x80000001, 0, RTE_REG_ECX,  29)
 
 	FEAT_DEF(SYSCALL, 0x80000001, 0, RTE_REG_EDX, 11)
 	FEAT_DEF(XD, 0x80000001, 0, RTE_REG_EDX, 20)
@@ -194,5 +195,7 @@ rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
 		intrinsics->power_pause = 1;
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RTM))
 			intrinsics->power_monitor_multi = 1;
+	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) {
+		intrinsics->power_monitor = 1;
 	}
 }
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v5 2/3] eal: removed unnecessary checks in x86 power monitor APIs
  2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
@ 2023-08-16 18:59           ` Sivaprasad Tummala
  2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
  1 sibling, 0 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-08-16 18:59 UTC (permalink / raw)
  To: david.hunt, anatoly.burakov, ferruh.yigit, david.marchand, thomas; +Cc: dev

The current x86 power monitor implementation fails on platforms
where only monitor is supported, but not power_pause.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index f749da9b85..6eb9e50807 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -30,6 +30,7 @@ __umwait_wakeup(volatile void *addr)
 
 static bool wait_supported;
 static bool wait_multi_supported;
+static bool monitor_supported;
 
 static inline uint64_t
 __get_umwait_val(const volatile void *p, const uint8_t sz)
@@ -81,7 +82,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent non-EAL thread from using this API */
@@ -170,6 +171,8 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_supported = 1;
 	if (i.power_monitor_multi)
 		wait_multi_supported = 1;
+	if (i.power_monitor)
+		monitor_supported = 1;
 }
 
 int
@@ -178,7 +181,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
 	struct power_wait_status *s;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent buffer overrun */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v5 3/3] power: amd power monitor support
  2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2023-08-16 18:59           ` [PATCH v5 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
@ 2023-08-16 18:59           ` Sivaprasad Tummala
  2023-08-16 19:27             ` Tyler Retzlaff
                               ` (2 more replies)
  1 sibling, 3 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-08-16 18:59 UTC (permalink / raw)
  To: david.hunt, anatoly.burakov, ferruh.yigit, david.marchand, thomas; +Cc: dev

mwaitx allows EPYC processors to enter an implementation-dependent
power/performance optimized state (C1 state) for a specific period
or until a store to the monitored address range occurs.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
 1 file changed, 66 insertions(+), 11 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 6eb9e50807..b4754e17da 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -17,6 +17,60 @@ static struct power_wait_status {
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
 } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
 
+/**
+ * These functions use UMONITOR/UMWAIT instructions and will enter C0.2 state.
+ * For more information about usage of these instructions, please refer to
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
+ */
+static void intel_umonitor(volatile void *addr)
+{
+	/* UMONITOR */
+	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
+			:
+			: "D"(addr));
+}
+
+static void intel_umwait(const uint64_t timeout)
+{
+	const uint32_t tsc_l = (uint32_t)timeout;
+	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
+	/* UMWAIT */
+	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
+			: /* ignore rflags */
+			: "D"(0), /* enter C0.2 */
+			"a"(tsc_l), "d"(tsc_h));
+}
+
+/**
+ * These functions use MONITORX/MWAITX instructions and will enter C1 state.
+ * For more information about usage of these instructions, please refer to
+ * AMD64 Architecture Programmer's Manual.
+ */
+static void amd_monitorx(volatile void *addr)
+{
+	/* MONITORX */
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+			:
+			: "a"(addr),
+			"c"(0),  /* no extensions */
+			"d"(0)); /* no hints */
+}
+
+static void amd_mwaitx(const uint64_t timeout)
+{
+	/* MWAITX */
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+			: /* ignore rflags */
+			: "a"(0), /* enter C1 */
+			"c"(2), /* enable timer */
+			"b"(timeout));
+}
+
+static struct {
+	void (*mmonitor)(volatile void *addr);
+	void (*mwait)(const uint64_t timeout);
+} __rte_cache_aligned power_monitor_ops;
+
 static inline void
 __umwait_wakeup(volatile void *addr)
 {
@@ -75,8 +129,6 @@ int
 rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 		const uint64_t tsc_timestamp)
 {
-	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
-	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
 	uint64_t cur_value;
@@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	 * versions support this instruction natively.
 	 */
 
-	/* set address for UMONITOR */
-	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
-			:
-			: "D"(pmc->addr));
+	/* set address for mmonitor */
+	power_monitor_ops.mmonitor(pmc->addr);
 
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
@@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn(cur_value, pmc->opaque) != 0)
 		goto end;
 
-	/* execute UMWAIT */
-	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-			: /* ignore rflags */
-			: "D"(0), /* enter C0.2 */
-			  "a"(tsc_l), "d"(tsc_h));
+	/* execute mwait */
+	power_monitor_ops.mwait(tsc_timestamp);
 
 end:
 	/* erase sleep address */
@@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_multi_supported = 1;
 	if (i.power_monitor)
 		monitor_supported = 1;
+
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
+		power_monitor_ops.mmonitor = &amd_monitorx;
+		power_monitor_ops.mwait = &amd_mwaitx;
+	} else { /* Intel */
+		power_monitor_ops.mmonitor = &intel_umonitor;
+		power_monitor_ops.mwait = &intel_umwait;
+	}
 }
 
 int
-- 
2.34.1



* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
@ 2023-08-16 19:27             ` Tyler Retzlaff
  2023-08-17 11:34               ` Tummala, Sivaprasad
  2023-10-06  8:26             ` David Marchand
  2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2 siblings, 1 reply; 51+ messages in thread
From: Tyler Retzlaff @ 2023-08-16 19:27 UTC (permalink / raw)
  To: Sivaprasad Tummala
  Cc: david.hunt, anatoly.burakov, ferruh.yigit, david.marchand, thomas, dev

On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> mwaitx allows EPYC processors to enter a implementation dependent
> power/performance optimized state (C1 state) for a specific period
> or until a store to the monitored address range.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
>  1 file changed, 66 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> index 6eb9e50807..b4754e17da 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -17,6 +17,60 @@ static struct power_wait_status {
>  	volatile void *monitor_addr; /**< NULL if not currently sleeping */
>  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>  
> +/**
> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static void intel_umonitor(volatile void *addr)
> +{
> +	/* UMONITOR */
> +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +			:
> +			: "D"(addr));
> +}
> +
> +static void intel_umwait(const uint64_t timeout)
> +{
> +	const uint32_t tsc_l = (uint32_t)timeout;
> +	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> +	/* UMWAIT */
> +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			"a"(tsc_l), "d"(tsc_h));
> +}

question and perhaps Anatoly Burakov can chime in with expertise.

gcc/clang have built-in intrinsics for umonitor and umwait, I believe. As
per our other thread of discussion, is there a benefit to providing
inline assembly over just using the intrinsics? I understand that the
intrinsics may not exist for the monitorx and mwaitx below, so inline
assembly is probably necessary for amd.

so the suggestion here is: when they are available, just use the
intrinsics.

thanks

> +
> +/**
> + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
> + * For more information about usage of these instructions, please refer to
> + * AMD64 Architecture Programmer's Manual.
> + */
> +static void amd_monitorx(volatile void *addr)
> +{
> +	/* MONITORX */
> +	asm volatile(".byte 0x0f, 0x01, 0xfa;"
> +			:
> +			: "a"(addr),
> +			"c"(0),  /* no extensions */
> +			"d"(0)); /* no hints */
> +}
> +
> +static void amd_mwaitx(const uint64_t timeout)
> +{
> +	/* MWAITX */
> +	asm volatile(".byte 0x0f, 0x01, 0xfb;"
> +			: /* ignore rflags */
> +			: "a"(0), /* enter C1 */
> +			"c"(2), /* enable timer */
> +			"b"(timeout));
> +}
> +
> +static struct {
> +	void (*mmonitor)(volatile void *addr);
> +	void (*mwait)(const uint64_t timeout);
> +} __rte_cache_aligned power_monitor_ops;
> +
>  static inline void
>  __umwait_wakeup(volatile void *addr)
>  {
> @@ -75,8 +129,6 @@ int
>  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>  		const uint64_t tsc_timestamp)
>  {
> -	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> -	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
>  	const unsigned int lcore_id = rte_lcore_id();
>  	struct power_wait_status *s;
>  	uint64_t cur_value;
> @@ -109,10 +161,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>  	 * versions support this instruction natively.
>  	 */
>  
> -	/* set address for UMONITOR */
> -	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> -			:
> -			: "D"(pmc->addr));
> +	/* set address for mmonitor */
> +	power_monitor_ops.mmonitor(pmc->addr);
>  
>  	/* now that we've put this address into monitor, we can unlock */
>  	rte_spinlock_unlock(&s->lock);
> @@ -123,11 +173,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>  	if (pmc->fn(cur_value, pmc->opaque) != 0)
>  		goto end;
>  
> -	/* execute UMWAIT */
> -	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> -			: /* ignore rflags */
> -			: "D"(0), /* enter C0.2 */
> -			  "a"(tsc_l), "d"(tsc_h));
> +	/* execute mwait */
> +	power_monitor_ops.mwait(tsc_timestamp);
>  
>  end:
>  	/* erase sleep address */
> @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
>  		wait_multi_supported = 1;
>  	if (i.power_monitor)
>  		monitor_supported = 1;
> +
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> +		power_monitor_ops.mmonitor = &amd_monitorx;
> +		power_monitor_ops.mwait = &amd_mwaitx;
> +	} else { /* Intel */
> +		power_monitor_ops.mmonitor = &intel_umonitor;
> +		power_monitor_ops.mwait = &intel_umwait;
> +	}
>  }
>  
>  int
> -- 
> 2.34.1


* RE: [PATCH v5 3/3] power: amd power monitor support
  2023-08-16 19:27             ` Tyler Retzlaff
@ 2023-08-17 11:34               ` Tummala, Sivaprasad
  2023-08-17 14:18                 ` Konstantin Ananyev
  0 siblings, 1 reply; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-08-17 11:34 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: david.hunt, anatoly.burakov, Yigit, Ferruh, david.marchand, thomas, dev


Hi Tyler,

> -----Original Message-----
> From: Tyler Retzlaff <roretzla@linux.microsoft.com>
> Sent: Thursday, August 17, 2023 12:58 AM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; david.marchand@redhat.com; thomas@monjalon.net;
> dev@dpdk.org
> Subject: Re: [PATCH v5 3/3] power: amd power monitor support
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> > mwaitx allows EPYC processors to enter a implementation dependent
> > power/performance optimized state (C1 state) for a specific period or
> > until a store to the monitored address range.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
> >  lib/eal/x86/rte_power_intrinsics.c | 77
> > +++++++++++++++++++++++++-----
> >  1 file changed, 66 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > b/lib/eal/x86/rte_power_intrinsics.c
> > index 6eb9e50807..b4754e17da 100644
> > --- a/lib/eal/x86/rte_power_intrinsics.c
> > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > @@ -17,6 +17,60 @@ static struct power_wait_status {
> >       volatile void *monitor_addr; /**< NULL if not currently sleeping
> > */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >
> > +/**
> > + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
> state.
> > + * For more information about usage of these instructions, please
> > +refer to
> > + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > + */
> > +static void intel_umonitor(volatile void *addr) {
> > +     /* UMONITOR */
> > +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > +                     :
> > +                     : "D"(addr));
> > +}
> > +
> > +static void intel_umwait(const uint64_t timeout) {
> > +     const uint32_t tsc_l = (uint32_t)timeout;
> > +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > +     /* UMWAIT */
> > +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > +                     : /* ignore rflags */
> > +                     : "D"(0), /* enter C0.2 */
> > +                     "a"(tsc_l), "d"(tsc_h)); }
>
> question and perhaps Anatoly Burakov can chime in with expertise.
>
> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
> thread of discussion is there a benefit to also providing inline assembly over just
> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> and mwaitx below so it is probably necessary for amd.
>
> so the suggestion here is when they are available just use the intrinsics.
>
> thanks
>
The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx() are
available only when -mmwaitx is used, which is specific to AMD platforms. On
generic builds these built-ins are not available, hence inline assembly is
required here.

> > +
> > +/**
> > + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
> > + * For more information about usage of these instructions, please
> > +refer to
> > + * AMD64 Architecture Programmer's Manual.
> > + */
> > +static void amd_monitorx(volatile void *addr) {
> > +     /* MONITORX */
> > +     asm volatile(".byte 0x0f, 0x01, 0xfa;"
> > +                     :
> > +                     : "a"(addr),
> > +                     "c"(0),  /* no extensions */
> > +                     "d"(0)); /* no hints */ }
> > +
> > +static void amd_mwaitx(const uint64_t timeout) {
> > +     /* MWAITX */
> > +     asm volatile(".byte 0x0f, 0x01, 0xfb;"
> > +                     : /* ignore rflags */
> > +                     : "a"(0), /* enter C1 */
> > +                     "c"(2), /* enable timer */
> > +                     "b"(timeout));
> > +}
> > +
> > +static struct {
> > +     void (*mmonitor)(volatile void *addr);
> > +     void (*mwait)(const uint64_t timeout); } __rte_cache_aligned
> > +power_monitor_ops;
> > +
> >  static inline void
> >  __umwait_wakeup(volatile void *addr)
> >  {
> > @@ -75,8 +129,6 @@ int
> >  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> >               const uint64_t tsc_timestamp)  {
> > -     const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> > -     const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> >       const unsigned int lcore_id = rte_lcore_id();
> >       struct power_wait_status *s;
> >       uint64_t cur_value;
> > @@ -109,10 +161,8 @@ rte_power_monitor(const struct
> rte_power_monitor_cond *pmc,
> >        * versions support this instruction natively.
> >        */
> >
> > -     /* set address for UMONITOR */
> > -     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > -                     :
> > -                     : "D"(pmc->addr));
> > +     /* set address for mmonitor */
> > +     power_monitor_ops.mmonitor(pmc->addr);
> >
> >       /* now that we've put this address into monitor, we can unlock */
> >       rte_spinlock_unlock(&s->lock);
> > @@ -123,11 +173,8 @@ rte_power_monitor(const struct
> rte_power_monitor_cond *pmc,
> >       if (pmc->fn(cur_value, pmc->opaque) != 0)
> >               goto end;
> >
> > -     /* execute UMWAIT */
> > -     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > -                     : /* ignore rflags */
> > -                     : "D"(0), /* enter C0.2 */
> > -                       "a"(tsc_l), "d"(tsc_h));
> > +     /* execute mwait */
> > +     power_monitor_ops.mwait(tsc_timestamp);
> >
> >  end:
> >       /* erase sleep address */
> > @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
> >               wait_multi_supported = 1;
> >       if (i.power_monitor)
> >               monitor_supported = 1;
> > +
> > +     if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> > +             power_monitor_ops.mmonitor = &amd_monitorx;
> > +             power_monitor_ops.mwait = &amd_mwaitx;
> > +     } else { /* Intel */
> > +             power_monitor_ops.mmonitor = &intel_umonitor;
> > +             power_monitor_ops.mwait = &intel_umwait;
> > +     }
> >  }
> >
> >  int
> > --
> > 2.34.1


* RE: [PATCH v5 3/3] power: amd power monitor support
  2023-08-17 11:34               ` Tummala, Sivaprasad
@ 2023-08-17 14:18                 ` Konstantin Ananyev
  2023-08-18 13:25                   ` Ferruh Yigit
  0 siblings, 1 reply; 51+ messages in thread
From: Konstantin Ananyev @ 2023-08-17 14:18 UTC (permalink / raw)
  To: Tummala, Sivaprasad, Tyler Retzlaff
  Cc: david.hunt, anatoly.burakov, Yigit, Ferruh, david.marchand, thomas, dev


> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> > > mwaitx allows EPYC processors to enter a implementation dependent
> > > power/performance optimized state (C1 state) for a specific period or
> > > until a store to the monitored address range.
> > >
> > > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > ---
> > >  lib/eal/x86/rte_power_intrinsics.c | 77
> > > +++++++++++++++++++++++++-----
> > >  1 file changed, 66 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > > b/lib/eal/x86/rte_power_intrinsics.c
> > > index 6eb9e50807..b4754e17da 100644
> > > --- a/lib/eal/x86/rte_power_intrinsics.c
> > > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > > @@ -17,6 +17,60 @@ static struct power_wait_status {
> > >       volatile void *monitor_addr; /**< NULL if not currently sleeping
> > > */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> > >
> > > +/**
> > > + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
> > state.
> > > + * For more information about usage of these instructions, please
> > > +refer to
> > > + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > > + */
> > > +static void intel_umonitor(volatile void *addr) {
> > > +     /* UMONITOR */
> > > +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > > +                     :
> > > +                     : "D"(addr));
> > > +}
> > > +
> > > +static void intel_umwait(const uint64_t timeout) {
> > > +     const uint32_t tsc_l = (uint32_t)timeout;
> > > +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > > +     /* UMWAIT */
> > > +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > > +                     : /* ignore rflags */
> > > +                     : "D"(0), /* enter C0.2 */
> > > +                     "a"(tsc_l), "d"(tsc_h)); }
> >
> > question and perhaps Anatoly Burakov can chime in with expertise.
> >
> > gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
> > thread of discussion is there a benefit to also providing inline assembly over just
> > using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> > and mwaitx below so it is probably necessary for amd.
> >
> > so the suggestion here is when they are available just use the intrinsics.
> >
> > thanks
> >
> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
> assembly is required here.

Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
The same thing can probably be done for Intel-specific instructions.
In general, I think it is much more preferable to use built-ins vs inline assembly
(if possible, of course).
Konstantin

> 
> > > +
> > > +/**
> > > + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
> > > + * For more information about usage of these instructions, please
> > > +refer to
> > > + * AMD64 Architecture Programmer's Manual.
> > > + */
> > > +static void amd_monitorx(volatile void *addr) {
> > > +     /* MONITORX */
> > > +     asm volatile(".byte 0x0f, 0x01, 0xfa;"
> > > +                     :
> > > +                     : "a"(addr),
> > > +                     "c"(0),  /* no extensions */
> > > +                     "d"(0)); /* no hints */ }
> > > +
> > > +static void amd_mwaitx(const uint64_t timeout) {
> > > +     /* MWAITX */
> > > +     asm volatile(".byte 0x0f, 0x01, 0xfb;"
> > > +                     : /* ignore rflags */
> > > +                     : "a"(0), /* enter C1 */
> > > +                     "c"(2), /* enable timer */
> > > +                     "b"(timeout));
> > > +}
> > > +
> > > +static struct {
> > > +     void (*mmonitor)(volatile void *addr);
> > > +     void (*mwait)(const uint64_t timeout); } __rte_cache_aligned
> > > +power_monitor_ops;
> > > +
> > >  static inline void
> > >  __umwait_wakeup(volatile void *addr)
> > >  {
> > > @@ -75,8 +129,6 @@ int
> > >  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
> > >               const uint64_t tsc_timestamp)  {
> > > -     const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> > > -     const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> > >       const unsigned int lcore_id = rte_lcore_id();
> > >       struct power_wait_status *s;
> > >       uint64_t cur_value;
> > > @@ -109,10 +161,8 @@ rte_power_monitor(const struct
> > rte_power_monitor_cond *pmc,
> > >        * versions support this instruction natively.
> > >        */
> > >
> > > -     /* set address for UMONITOR */
> > > -     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > > -                     :
> > > -                     : "D"(pmc->addr));
> > > +     /* set address for mmonitor */
> > > +     power_monitor_ops.mmonitor(pmc->addr);
> > >
> > >       /* now that we've put this address into monitor, we can unlock */
> > >       rte_spinlock_unlock(&s->lock);
> > > @@ -123,11 +173,8 @@ rte_power_monitor(const struct
> > rte_power_monitor_cond *pmc,
> > >       if (pmc->fn(cur_value, pmc->opaque) != 0)
> > >               goto end;
> > >
> > > -     /* execute UMWAIT */
> > > -     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > > -                     : /* ignore rflags */
> > > -                     : "D"(0), /* enter C0.2 */
> > > -                       "a"(tsc_l), "d"(tsc_h));
> > > +     /* execute mwait */
> > > +     power_monitor_ops.mwait(tsc_timestamp);
> > >
> > >  end:
> > >       /* erase sleep address */
> > > @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
> > >               wait_multi_supported = 1;
> > >       if (i.power_monitor)
> > >               monitor_supported = 1;
> > > +
> > > +     if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> > > +             power_monitor_ops.mmonitor = &amd_monitorx;
> > > +             power_monitor_ops.mwait = &amd_mwaitx;
> > > +     } else { /* Intel */
> > > +             power_monitor_ops.mmonitor = &intel_umonitor;
> > > +             power_monitor_ops.mwait = &intel_umwait;
> > > +     }
> > >  }
> > >
> > >  int
> > > --
> > > 2.34.1


* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-17 14:18                 ` Konstantin Ananyev
@ 2023-08-18 13:25                   ` Ferruh Yigit
  2023-08-18 13:48                     ` Bruce Richardson
  0 siblings, 1 reply; 51+ messages in thread
From: Ferruh Yigit @ 2023-08-18 13:25 UTC (permalink / raw)
  To: Konstantin Ananyev, Tummala, Sivaprasad, Tyler Retzlaff
  Cc: david.hunt, anatoly.burakov, david.marchand, thomas, dev

On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> 
>>> Caution: This message originated from an External Source. Use proper caution
>>> when opening attachments, clicking links, or responding.
>>>
>>>
>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>> power/performance optimized state (C1 state) for a specific period or
>>>> until a store to the monitored address range.
>>>>
>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>> ---
>>>>  lib/eal/x86/rte_power_intrinsics.c | 77
>>>> +++++++++++++++++++++++++-----
>>>>  1 file changed, 66 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>> index 6eb9e50807..b4754e17da 100644
>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>       volatile void *monitor_addr; /**< NULL if not currently sleeping
>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>
>>>> +/**
>>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
>>> state.
>>>> + * For more information about usage of these instructions, please
>>>> +refer to
>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>> + */
>>>> +static void intel_umonitor(volatile void *addr) {
>>>> +     /* UMONITOR */
>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>> +                     :
>>>> +                     : "D"(addr));
>>>> +}
>>>> +
>>>> +static void intel_umwait(const uint64_t timeout) {
>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>> +     /* UMWAIT */
>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>> +                     : /* ignore rflags */
>>>> +                     : "D"(0), /* enter C0.2 */
>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>
>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>
>>> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
>>> thread of discussion is there a benefit to also providing inline assembly over just
>>> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
>>> and mwaitx below so it is probably necessary for amd.
>>>
>>> so the suggestion here is when they are available just use the intrinsics.
>>>
>>> thanks
>>>
>> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
>> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
>> assembly is required here.
> 
> Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
> Same thing can be probably done for Intel specific instructions.
> In general, I think it is much more preferable to use built-ins vs inline assembly
> (if possible off-course). 
>

We don't compile different sets of files for AMD and Intel, but there are
runtime checks, so putting this into a separate file would not be much
different.

It may be an option to always enable the compiler flag (-mmwaitx); I think
it won't hurt other platforms, but I am not sure about the implications
(what was the motivation for the compiler folks to gate these built-ins
behind a specific flag?).

Also, this requires detecting whether the compiler supports 'mmwaitx', etc.
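[Editor's note] One possible shape for that compiler detection, sketched as a meson fragment. `cc.has_argument()` is standard meson; the `RTE_HAS_MWAITX_BUILTIN` config symbol is an invented name for illustration, not an actual DPDK build rule:

```meson
# Hypothetical: enable the flag (and a feature macro for the C code to
# test) only when the compiler actually accepts -mmwaitx.
if cc.has_argument('-mmwaitx')
    cflags += ['-mmwaitx']
    dpdk_conf.set('RTE_HAS_MWAITX_BUILTIN', 1)
endif
```

The C code could then choose between the built-ins and the inline-asm fallback at compile time, keeping generic builds working with older compilers.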


> Konstantin
> 
>>
>>>> +
>>>> +/**
>>>> + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
>>>> + * For more information about usage of these instructions, please
>>>> +refer to
>>>> + * AMD64 Architecture Programmer's Manual.
>>>> + */
>>>> +static void amd_monitorx(volatile void *addr) {
>>>> +     /* MONITORX */
>>>> +     asm volatile(".byte 0x0f, 0x01, 0xfa;"
>>>> +                     :
>>>> +                     : "a"(addr),
>>>> +                     "c"(0),  /* no extensions */
>>>> +                     "d"(0)); /* no hints */ }
>>>> +
>>>> +static void amd_mwaitx(const uint64_t timeout) {
>>>> +     /* MWAITX */
>>>> +     asm volatile(".byte 0x0f, 0x01, 0xfb;"
>>>> +                     : /* ignore rflags */
>>>> +                     : "a"(0), /* enter C1 */
>>>> +                     "c"(2), /* enable timer */
>>>> +                     "b"(timeout));
>>>> +}
>>>> +
>>>> +static struct {
>>>> +     void (*mmonitor)(volatile void *addr);
>>>> +     void (*mwait)(const uint64_t timeout); } __rte_cache_aligned
>>>> +power_monitor_ops;
>>>> +
>>>>  static inline void
>>>>  __umwait_wakeup(volatile void *addr)
>>>>  {
>>>> @@ -75,8 +129,6 @@ int
>>>>  rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>>>>               const uint64_t tsc_timestamp)  {
>>>> -     const uint32_t tsc_l = (uint32_t)tsc_timestamp;
>>>> -     const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
>>>>       const unsigned int lcore_id = rte_lcore_id();
>>>>       struct power_wait_status *s;
>>>>       uint64_t cur_value;
>>>> @@ -109,10 +161,8 @@ rte_power_monitor(const struct
>>> rte_power_monitor_cond *pmc,
>>>>        * versions support this instruction natively.
>>>>        */
>>>>
>>>> -     /* set address for UMONITOR */
>>>> -     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>> -                     :
>>>> -                     : "D"(pmc->addr));
>>>> +     /* set address for mmonitor */
>>>> +     power_monitor_ops.mmonitor(pmc->addr);
>>>>
>>>>       /* now that we've put this address into monitor, we can unlock */
>>>>       rte_spinlock_unlock(&s->lock);
>>>> @@ -123,11 +173,8 @@ rte_power_monitor(const struct
>>> rte_power_monitor_cond *pmc,
>>>>       if (pmc->fn(cur_value, pmc->opaque) != 0)
>>>>               goto end;
>>>>
>>>> -     /* execute UMWAIT */
>>>> -     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>> -                     : /* ignore rflags */
>>>> -                     : "D"(0), /* enter C0.2 */
>>>> -                       "a"(tsc_l), "d"(tsc_h));
>>>> +     /* execute mwait */
>>>> +     power_monitor_ops.mwait(tsc_timestamp);
>>>>
>>>>  end:
>>>>       /* erase sleep address */
>>>> @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
>>>>               wait_multi_supported = 1;
>>>>       if (i.power_monitor)
>>>>               monitor_supported = 1;
>>>> +
>>>> +     if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
>>>> +             power_monitor_ops.mmonitor = &amd_monitorx;
>>>> +             power_monitor_ops.mwait = &amd_mwaitx;
>>>> +     } else { /* Intel */
>>>> +             power_monitor_ops.mmonitor = &intel_umonitor;
>>>> +             power_monitor_ops.mwait = &intel_umwait;
>>>> +     }
>>>>  }
>>>>
>>>>  int
>>>> --
>>>> 2.34.1



* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-18 13:25                   ` Ferruh Yigit
@ 2023-08-18 13:48                     ` Bruce Richardson
  2023-08-21 15:42                       ` Tyler Retzlaff
  2023-08-22 22:30                       ` Konstantin Ananyev
  0 siblings, 2 replies; 51+ messages in thread
From: Bruce Richardson @ 2023-08-18 13:48 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Konstantin Ananyev, Tummala, Sivaprasad, Tyler Retzlaff,
	david.hunt, anatoly.burakov, david.marchand, thomas, dev

On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> > 
> >>> Caution: This message originated from an External Source. Use proper caution
> >>> when opening attachments, clicking links, or responding.
> >>>
> >>>
> >>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> >>>> mwaitx allows EPYC processors to enter a implementation dependent
> >>>> power/performance optimized state (C1 state) for a specific period or
> >>>> until a store to the monitored address range.
> >>>>
> >>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> >>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>> ---
> >>>>  lib/eal/x86/rte_power_intrinsics.c | 77
> >>>> +++++++++++++++++++++++++-----
> >>>>  1 file changed, 66 insertions(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> >>>> b/lib/eal/x86/rte_power_intrinsics.c
> >>>> index 6eb9e50807..b4754e17da 100644
> >>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> >>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> >>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> >>>>       volatile void *monitor_addr; /**< NULL if not currently sleeping
> >>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >>>>
> >>>> +/**
> >>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
> >>> state.
> >>>> + * For more information about usage of these instructions, please
> >>>> +refer to
> >>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> >>>> + */
> >>>> +static void intel_umonitor(volatile void *addr) {
> >>>> +     /* UMONITOR */
> >>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> >>>> +                     :
> >>>> +                     : "D"(addr));
> >>>> +}
> >>>> +
> >>>> +static void intel_umwait(const uint64_t timeout) {
> >>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> >>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> >>>> +     /* UMWAIT */
> >>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> >>>> +                     : /* ignore rflags */
> >>>> +                     : "D"(0), /* enter C0.2 */
> >>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> >>>
> >>> question and perhaps Anatoly Burakov can chime in with expertise.
> >>>
> >>> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
> >>> thread of discussion is there a benefit to also providing inline assembly over just
> >>> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> >>> and mwaitx below so it is probably necessary for amd.
> >>>
> >>> so the suggestion here is when they are available just use the intrinsics.
> >>>
> >>> thanks
> >>>
> >> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
> >> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
> >> assembly is required here.
> > 
> > Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
> > Same thing can be probably done for Intel specific instructions.
> > In general, I think it is much more preferable to use built-ins vs inline assembly
> > (if possible off-course). 
> >
> 
> We don't compile different set of files for AMD and Intel, but there are
> runtime checks, so putting into separate file is not much different.
> 
> It may be an option to always enable compiler flag (-mmwaitx), I think
> it won't hurt other platforms but I am not sure about implications of
> this to other platforms (what was the motivation for the compiler guys
> to enable these build-ins with specific flag?).
> 
> Also this requires detecting compiler that supports 'mmwaitx' or not, etc..
> 
This is the biggest reason why we have in the past added support for these
instructions via asm bytes rather than intrinsics. It takes a long time for
end-user compilers, especially those in LTS releases, to get the necessary
intrinsics. Consider a user running e.g. RHEL 8, who wants to take
advantages of the latest DPDK features; they should not be required to
upgrade their compiler - and possibly binutils/assembler - to do so.

/Bruce


* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-18 13:48                     ` Bruce Richardson
@ 2023-08-21 15:42                       ` Tyler Retzlaff
  2023-08-22 22:30                       ` Konstantin Ananyev
  1 sibling, 0 replies; 51+ messages in thread
From: Tyler Retzlaff @ 2023-08-21 15:42 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Ferruh Yigit, Konstantin Ananyev, Tummala, Sivaprasad,
	david.hunt, anatoly.burakov, david.marchand, thomas, dev

On Fri, Aug 18, 2023 at 02:48:42PM +0100, Bruce Richardson wrote:
> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> > On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> > > 
> > >>> Caution: This message originated from an External Source. Use proper caution
> > >>> when opening attachments, clicking links, or responding.
> > >>>
> > >>>
> > >>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> > >>>> mwaitx allows EPYC processors to enter a implementation dependent
> > >>>> power/performance optimized state (C1 state) for a specific period or
> > >>>> until a store to the monitored address range.
> > >>>>
> > >>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > >>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > >>>> ---
> > >>>>  lib/eal/x86/rte_power_intrinsics.c | 77
> > >>>> +++++++++++++++++++++++++-----
> > >>>>  1 file changed, 66 insertions(+), 11 deletions(-)
> > >>>>
> > >>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > >>>> b/lib/eal/x86/rte_power_intrinsics.c
> > >>>> index 6eb9e50807..b4754e17da 100644
> > >>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> > >>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> > >>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> > >>>>       volatile void *monitor_addr; /**< NULL if not currently sleeping
> > >>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> > >>>>
> > >>>> +/**
> > >>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
> > >>> state.
> > >>>> + * For more information about usage of these instructions, please
> > >>>> +refer to
> > >>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > >>>> + */
> > >>>> +static void intel_umonitor(volatile void *addr) {
> > >>>> +     /* UMONITOR */
> > >>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > >>>> +                     :
> > >>>> +                     : "D"(addr));
> > >>>> +}
> > >>>> +
> > >>>> +static void intel_umwait(const uint64_t timeout) {
> > >>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> > >>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > >>>> +     /* UMWAIT */
> > >>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > >>>> +                     : /* ignore rflags */
> > >>>> +                     : "D"(0), /* enter C0.2 */
> > >>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> > >>>
> > >>> question and perhaps Anatoly Burakov can chime in with expertise.
> > >>>
> > >>> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
> > >>> thread of discussion is there a benefit to also providing inline assembly over just
> > >>> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
> > >>> and mwaitx below so it is probably necessary for amd.
> > >>>
> > >>> so the suggestion here is when they are available just use the intrinsics.
> > >>>
> > >>> thanks
> > >>>
> > >> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
> > >> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
> > >> assembly is required here.
> > > 
> > > Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
> > > Same thing can be probably done for Intel specific instructions.
> > > In general, I think it is much more preferable to use built-ins vs inline assembly
> > > (if possible off-course). 
> > >
> > 
> > We don't compile different set of files for AMD and Intel, but there are
> > runtime checks, so putting into separate file is not much different.
> > 
> > It may be an option to always enable compiler flag (-mmwaitx), I think
> > it won't hurt other platforms but I am not sure about implications of
> > this to other platforms (what was the motivation for the compiler guys
> > to enable these build-ins with specific flag?).
> > 
> > Also this requires detecting compiler that supports 'mmwaitx' or not, etc..
> > 
> This is the biggest reason why we have in the past added support for these
> instructions via asm bytes rather than intrinsics. It takes a long time for
> end-user compilers, especially those in LTS releases, to get the necessary
> intrinsics. Consider a user running e.g. RHEL 8, who wants to take
> advantages of the latest DPDK features; they should not be required to
> upgrade their compiler - and possibly binutils/assembler - to do so.

yes, this is understood. just for clarity i'm only asking that the
intrinsics be used for intel since they are available. for amd we'll
have to use the inline asm if the older compilers don't have intrinsics.

ty

> 
> /Bruce


* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-18 13:48                     ` Bruce Richardson
  2023-08-21 15:42                       ` Tyler Retzlaff
@ 2023-08-22 22:30                       ` Konstantin Ananyev
  2023-08-23  9:19                         ` Ferruh Yigit
  1 sibling, 1 reply; 51+ messages in thread
From: Konstantin Ananyev @ 2023-08-22 22:30 UTC (permalink / raw)
  To: Bruce Richardson, Ferruh Yigit
  Cc: Konstantin Ananyev, Tummala, Sivaprasad, Tyler Retzlaff,
	david.hunt, anatoly.burakov, david.marchand, thomas, dev

18/08/2023 14:48, Bruce Richardson пишет:
> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>>>
>>>>> Caution: This message originated from an External Source. Use proper caution
>>>>> when opening attachments, clicking links, or responding.
>>>>>
>>>>>
>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>>>> power/performance optimized state (C1 state) for a specific period or
>>>>>> until a store to the monitored address range.
>>>>>>
>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>> ---
>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77
>>>>>> +++++++++++++++++++++++++-----
>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>>>> index 6eb9e50807..b4754e17da 100644
>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>>>        volatile void *monitor_addr; /**< NULL if not currently sleeping
>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>>>
>>>>>> +/**
>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
>>>>> state.
>>>>>> + * For more information about usage of these instructions, please
>>>>>> +refer to
>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>>>> + */
>>>>>> +static void intel_umonitor(volatile void *addr) {
>>>>>> +     /* UMONITOR */
>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>>>> +                     :
>>>>>> +                     : "D"(addr));
>>>>>> +}
>>>>>> +
>>>>>> +static void intel_umwait(const uint64_t timeout) {
>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>>>> +     /* UMWAIT */
>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>>>> +                     : /* ignore rflags */
>>>>>> +                     : "D"(0), /* enter C0.2 */
>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>>>
>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>>>
>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
>>>>> thread of discussion is there a benefit to also providing inline assembly over just
>>>>> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
>>>>> and mwaitx below so it is probably necessary for amd.
>>>>>
>>>>> so the suggestion here is when they are available just use the intrinsics.
>>>>>
>>>>> thanks
>>>>>
>>>> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
>>>> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
>>>> assembly is required here.
>>>
>>> Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
>>> Same thing can be probably done for Intel specific instructions.
>>> In general, I think it is much more preferable to use built-ins vs inline assembly
>>> (if possible off-course).
>>>
>>
>> We don't compile different set of files for AMD and Intel, but there are
>> runtime checks, so putting into separate file is not much different.

Well, we probably don't compile .c files for a particular vendor, but we
definitely do compile some .c files for particular ISA extensions.
Let's say there are files in lib/acl that require various '-mavx512*'
flags; same for other libs and PMDs.
So it is still not clear to me why the same approach can't be applied to
rte_power_intrinsics.c?

>>
>> It may be an option to always enable compiler flag (-mmwaitx), I think
>> it won't hurt other platforms but I am not sure about implications of
>> this to other platforms (what was the motivation for the compiler guys
>> to enable these build-ins with specific flag?).
>>
>> Also this requires detecting compiler that supports 'mmwaitx' or not, etc..
>>
> This is the biggest reason why we have in the past added support for these
> instructions via asm bytes rather than intrinsics. It takes a long time for
> end-user compilers, especially those in LTS releases, to get the necessary
> intrinsics. 

Yep, understand.
But then why can't we have both implementations?
Let's say if WAITPKG is defined we use the builtins for
umonitor/umwait/tpause; otherwise we fall back to the inline asm implementation.
Same story for MWAITX/monitorx.

> Consider a user running e.g. RHEL 8, who wants to take
> advantages of the latest DPDK features; they should not be required to
> upgrade their compiler - and possibly binutils/assembler - to do so.
> 
> /Bruce



* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-22 22:30                       ` Konstantin Ananyev
@ 2023-08-23  9:19                         ` Ferruh Yigit
  2023-08-23 16:03                           ` Tyler Retzlaff
  0 siblings, 1 reply; 51+ messages in thread
From: Ferruh Yigit @ 2023-08-23  9:19 UTC (permalink / raw)
  To: Konstantin Ananyev, Bruce Richardson
  Cc: Konstantin Ananyev, Tummala, Sivaprasad, Tyler Retzlaff,
	david.hunt, anatoly.burakov, david.marchand, thomas, dev

On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> 18/08/2023 14:48, Bruce Richardson пишет:
>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>>>>
>>>>>> Caution: This message originated from an External Source. Use
>>>>>> proper caution
>>>>>> when opening attachments, clicking links, or responding.
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>>>>> power/performance optimized state (C1 state) for a specific
>>>>>>> period or
>>>>>>> until a store to the monitored address range.
>>>>>>>
>>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>> ---
>>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77
>>>>>>> +++++++++++++++++++++++++-----
>>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>> index 6eb9e50807..b4754e17da 100644
>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>>>>        volatile void *monitor_addr; /**< NULL if not currently
>>>>>>> sleeping
>>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>>>>
>>>>>>> +/**
>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will
>>>>>>> enter C0.2
>>>>>> state.
>>>>>>> + * For more information about usage of these instructions, please
>>>>>>> +refer to
>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>>>>> + */
>>>>>>> +static void intel_umonitor(volatile void *addr) {
>>>>>>> +     /* UMONITOR */
>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>>>>> +                     :
>>>>>>> +                     : "D"(addr));
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void intel_umwait(const uint64_t timeout) {
>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>>>>> +     /* UMWAIT */
>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>>>>> +                     : /* ignore rflags */
>>>>>>> +                     : "D"(0), /* enter C0.2 */
>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>>>>
>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>>>>
>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
>>>>>> believe as per our other
>>>>>> thread of discussion is there a benefit to also providing inline
>>>>>> assembly over just
>>>>>> using the intrinsics? I understand that the intrinsics may not
>>>>>> exist for the monitorx
>>>>>> and mwaitx below so it is probably necessary for amd.
>>>>>>
>>>>>> so the suggestion here is when they are available just use the
>>>>>> intrinsics.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>> The gcc built-in functions
>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
>>>>> when -mmwaitx
>>>>> is used specific for AMD platforms. On generic builds, these
>>>>> built-ins are not available and hence inline
>>>>> assembly is required here.
>>>>
>>>> Ok... but we can probably put them into a separate .c file that will
>>>> be compiled with that specific flag?
>>>> Same thing can be probably done for Intel specific instructions.
>>>> In general, I think it is much more preferable to use built-ins vs
>>>> inline assembly
>>>> (if possible off-course).
>>>>
>>>
>>> We don't compile different set of files for AMD and Intel, but there are
>>> runtime checks, so putting into separate file is not much different.
> 
> Well, we probably don't compile .c files for particular vendor, but we
> definitely do compile some .c files for particular ISA extensions.
> Let say there are files in lib/acl that requires various '-mavx512*'
> flags, same for other libs and PMDs.
> So still not clear to me why same approach can't be applied to
> power_instrincts.c?
> 
>>>
>>> It may be an option to always enable compiler flag (-mmwaitx), I think
>>> it won't hurt other platforms but I am not sure about implications of
>>> this to other platforms (what was the motivation for the compiler guys
>>> to enable these build-ins with specific flag?).
>>>
>>> Also this requires detecting compiler that supports 'mmwaitx' or not,
>>> etc..
>>>
>> This is the biggest reason why we have in the past added support for
>> these
>> instructions via asm bytes rather than intrinsics. It takes a long
>> time for
>> end-user compilers, especially those in LTS releases, to get the
>> necessary
>> intrinsics. 
> 
> Yep, understand.
> But why then we can't have both implementations?
> Let say if WAITPKG is defined we can use builtins for
> umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
> Same story for MWAITX/monitorx.
> 

Yes, this can be done,
either as different .c files per implementation, or as an
#ifdef in the same file.

But eventually the asm implementation is required as a fallback, and if we
will rely on the asm implementation anyway, is it worth having the
additional checks just to be able to use the built-in intrinsics?

Would it help to add the name of the built-in function as a comment next to
the inline assembly, to document the intention and the other possible implementation?

>> Consider a user running e.g. RHEL 8, who wants to take
>> advantages of the latest DPDK features; they should not be required to
>> upgrade their compiler - and possibly binutils/assembler - to do so.
>>
>> /Bruce
> 



* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-23  9:19                         ` Ferruh Yigit
@ 2023-08-23 16:03                           ` Tyler Retzlaff
  2023-08-24  9:04                             ` Ferruh Yigit
  0 siblings, 1 reply; 51+ messages in thread
From: Tyler Retzlaff @ 2023-08-23 16:03 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Konstantin Ananyev, Bruce Richardson, Konstantin Ananyev,
	Tummala, Sivaprasad, david.hunt, anatoly.burakov, david.marchand,
	thomas, dev

On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> > 18/08/2023 14:48, Bruce Richardson пишет:
> >> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> >>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> >>>>
> >>>>>> Caution: This message originated from an External Source. Use
> >>>>>> proper caution
> >>>>>> when opening attachments, clicking links, or responding.
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> >>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
> >>>>>>> power/performance optimized state (C1 state) for a specific
> >>>>>>> period or
> >>>>>>> until a store to the monitored address range.
> >>>>>>>
> >>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> >>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>>>>> ---
> >>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77
> >>>>>>> +++++++++++++++++++++++++-----
> >>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>> index 6eb9e50807..b4754e17da 100644
> >>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> >>>>>>>        volatile void *monitor_addr; /**< NULL if not currently
> >>>>>>> sleeping
> >>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >>>>>>>
> >>>>>>> +/**
> >>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will
> >>>>>>> enter C0.2
> >>>>>> state.
> >>>>>>> + * For more information about usage of these instructions, please
> >>>>>>> +refer to
> >>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> >>>>>>> + */
> >>>>>>> +static void intel_umonitor(volatile void *addr) {
> >>>>>>> +     /* UMONITOR */
> >>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> >>>>>>> +                     :
> >>>>>>> +                     : "D"(addr));
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static void intel_umwait(const uint64_t timeout) {
> >>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> >>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> >>>>>>> +     /* UMWAIT */
> >>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> >>>>>>> +                     : /* ignore rflags */
> >>>>>>> +                     : "D"(0), /* enter C0.2 */
> >>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> >>>>>>
> >>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
> >>>>>>
> >>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
> >>>>>> believe as per our other
> >>>>>> thread of discussion is there a benefit to also providing inline
> >>>>>> assembly over just
> >>>>>> using the intrinsics? I understand that the intrinsics may not
> >>>>>> exist for the monitorx
> >>>>>> and mwaitx below so it is probably necessary for amd.
> >>>>>>
> >>>>>> so the suggestion here is when they are available just use the
> >>>>>> intrinsics.
> >>>>>>
> >>>>>> thanks
> >>>>>>
> >>>>> The gcc built-in functions
> >>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
> >>>>> when -mmwaitx
> >>>>> is used specific for AMD platforms. On generic builds, these
> >>>>> built-ins are not available and hence inline
> >>>>> assembly is required here.
> >>>>
> >>>> Ok... but we can probably put them into a separate .c file that will
> >>>> be compiled with that specific flag?
> >>>> Same thing can be probably done for Intel specific instructions.
> >>>> In general, I think it is much more preferable to use built-ins vs
> >>>> inline assembly
> >>>> (if possible off-course).
> >>>>
> >>>
> >>> We don't compile different set of files for AMD and Intel, but there are
> >>> runtime checks, so putting into separate file is not much different.
> > 
> > Well, we probably don't compile .c files for particular vendor, but we
> > definitely do compile some .c files for particular ISA extensions.
> > Let say there are files in lib/acl that requires various '-mavx512*'
> > flags, same for other libs and PMDs.
> > So still not clear to me why same approach can't be applied to
> > power_instrincts.c?
> > 
> >>>
> >>> It may be an option to always enable compiler flag (-mmwaitx), I think
> >>> it won't hurt other platforms but I am not sure about implications of
> >>> this to other platforms (what was the motivation for the compiler guys
> >>> to enable these build-ins with specific flag?).
> >>>
> >>> Also this requires detecting compiler that supports 'mmwaitx' or not,
> >>> etc..
> >>>
> >> This is the biggest reason why we have in the past added support for
> >> these
> >> instructions via asm bytes rather than intrinsics. It takes a long
> >> time for
> >> end-user compilers, especially those in LTS releases, to get the
> >> necessary
> >> intrinsics. 
> > 
> > Yep, understand.
> > But why then we can't have both implementations?
> > Let say if WAITPKG is defined we can use builtins for
> > umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
> > Same story for MWAITX/monitorx.
> > 
> 
> Yes this can be done,
> it can be done either as different .c files per implementation, or as
> #ifdef in same file.
> 
> But eventually asm implementation is required, as fallback, and if we
> will rely on asm implementation anyway, does it worth to have the
> additional checks to be able to use built-in intrinsic?
>
> Does it helps to comment name of the built-in function to inline
> assembly code, to document intention and another possible implementation?

the main value of preferring intrinsics is that when they are available
they also work with msvc/windows. the msvc toolchain does not support
inline asm. so some of the targets have to use intrinsics because that's all
there is.

> 
> >> Consider a user running e.g. RHEL 8, who wants to take
> >> advantages of the latest DPDK features; they should not be required to
> >> upgrade their compiler - and possibly binutils/assembler - to do so.
> >>
> >> /Bruce
> > 


* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-23 16:03                           ` Tyler Retzlaff
@ 2023-08-24  9:04                             ` Ferruh Yigit
  2023-08-25 16:00                               ` Tyler Retzlaff
  0 siblings, 1 reply; 51+ messages in thread
From: Ferruh Yigit @ 2023-08-24  9:04 UTC (permalink / raw)
  To: Tyler Retzlaff
  Cc: Konstantin Ananyev, Bruce Richardson, Konstantin Ananyev,
	Tummala, Sivaprasad, david.hunt, anatoly.burakov, david.marchand,
	thomas, dev

On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
>>> 18/08/2023 14:48, Bruce Richardson пишет:
>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>>>>>>
>>>>>>>> Caution: This message originated from an External Source. Use
>>>>>>>> proper caution
>>>>>>>> when opening attachments, clicking links, or responding.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>>>>>>> power/performance optimized state (C1 state) for a specific
>>>>>>>>> period or
>>>>>>>>> until a store to the monitored address range.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>>>> ---
>>>>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77
>>>>>>>>> +++++++++++++++++++++++++-----
>>>>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> index 6eb9e50807..b4754e17da 100644
>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>>>>>>        volatile void *monitor_addr; /**< NULL if not currently
>>>>>>>>> sleeping
>>>>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will
>>>>>>>>> enter C0.2
>>>>>>>> state.
>>>>>>>>> + * For more information about usage of these instructions, please
>>>>>>>>> +refer to
>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>>>>>>> + */
>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
>>>>>>>>> +     /* UMONITOR */
>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>>>>>>> +                     :
>>>>>>>>> +                     : "D"(addr));
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>>>>>>> +     /* UMWAIT */
>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>>>>>>> +                     : /* ignore rflags */
>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>>>>>>
>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>>>>>>
>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
>>>>>>>> believe as per our other
>>>>>>>> thread of discussion is there a benefit to also providing inline
>>>>>>>> assembly over just
>>>>>>>> using the intrinsics? I understand that the intrinsics may not
>>>>>>>> exist for the monitorx
>>>>>>>> and mwaitx below so it is probably necessary for amd.
>>>>>>>>
>>>>>>>> so the suggestion here is when they are available just use the
>>>>>>>> intrinsics.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>>
>>>>>>> The gcc built-in functions
>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
>>>>>>> when -mmwaitx
>>>>>>> is used specific for AMD platforms. On generic builds, these
>>>>>>> built-ins are not available and hence inline
>>>>>>> assembly is required here.
>>>>>>
>>>>>> Ok... but we can probably put them into a separate .c file that will
>>>>>> be compiled with that specific flag?
>>>>>> Same thing can be probably done for Intel specific instructions.
>>>>>> In general, I think it is much more preferable to use built-ins vs
>>>>>> inline assembly
>>>>>> (if possible off-course).
>>>>>>
>>>>>
>>>>> We don't compile different set of files for AMD and Intel, but there are
>>>>> runtime checks, so putting into separate file is not much different.
>>>
>>> Well, we probably don't compile .c files for particular vendor, but we
>>> definitely do compile some .c files for particular ISA extensions.
>>> Let say there are files in lib/acl that requires various '-mavx512*'
>>> flags, same for other libs and PMDs.
>>> So still not clear to me why same approach can't be applied to
>>> power_instrincts.c?
>>>
>>>>>
>>>>> It may be an option to always enable compiler flag (-mmwaitx), I think
>>>>> it won't hurt other platforms but I am not sure about implications of
>>>>> this to other platforms (what was the motivation for the compiler guys
>>>>> to enable these build-ins with specific flag?).
>>>>>
>>>>> Also this requires detecting compiler that supports 'mmwaitx' or not,
>>>>> etc..
>>>>>
>>>> This is the biggest reason why we have in the past added support for
>>>> these
>>>> instructions via asm bytes rather than intrinsics. It takes a long
>>>> time for
>>>> end-user compilers, especially those in LTS releases, to get the
>>>> necessary
>>>> intrinsics. 
>>>
>>> Yep, understand.
>>> But why then we can't have both implementations?
>>> Let say if WAITPKG is defined we can use builtins for
>>> umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
>>> Same story for MWAITX/monitorx.
>>>
>>
>> Yes this can be done,
>> it can be done either as different .c files per implementation, or as
>> #ifdef in same file.
>>
>> But eventually asm implementation is required, as fallback, and if we
>> will rely on asm implementation anyway, does it worth to have the
>> additional checks to be able to use built-in intrinsic?
>>
>> Does it helps to comment name of the built-in function to inline
>> assembly code, to document intention and another possible implementation?
> 
> the main value of preferring intrinsics is that when they are available
> they also work with msvc/windows. the msvc toolchain does not support
> inline asm. so some of the targets have to use intrinsics because that's all
> there is.
> 

How does Windows handle the current power APIs without inline asm support, like
the ones in rte_power_intrinsics.c?

Also, will using both built-ins and inline assembly work for Windows?
Since there may be compiler versions that don't support the built-in
functions, such builds would have to disable the APIs altogether, and this
can create a scenario where the list of exposed APIs changes based on
compiler version.

>>
>>>> Consider a user running e.g. RHEL 8, who wants to take
>>>> advantages of the latest DPDK features; they should not be required to
>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
>>>>
>>>> /Bruce
>>>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-24  9:04                             ` Ferruh Yigit
@ 2023-08-25 16:00                               ` Tyler Retzlaff
  2023-08-30 22:45                                 ` Konstantin Ananyev
  0 siblings, 1 reply; 51+ messages in thread
From: Tyler Retzlaff @ 2023-08-25 16:00 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Konstantin Ananyev, Bruce Richardson, Konstantin Ananyev,
	Tummala, Sivaprasad, david.hunt, anatoly.burakov, david.marchand,
	thomas, dev

On Thu, Aug 24, 2023 at 10:04:42AM +0100, Ferruh Yigit wrote:
> On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> > On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
> >> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> >>> 18/08/2023 14:48, Bruce Richardson wrote:
> >>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> >>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> >>>>>>
> >>>>>>>> Caution: This message originated from an External Source. Use
> >>>>>>>> proper caution
> >>>>>>>> when opening attachments, clicking links, or responding.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
> >>>>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
> >>>>>>>>> power/performance optimized state (C1 state) for a specific
> >>>>>>>>> period or
> >>>>>>>>> until a store to the monitored address range.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> >>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>>>>>>> ---
> >>>>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77
> >>>>>>>>> +++++++++++++++++++++++++-----
> >>>>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>> index 6eb9e50807..b4754e17da 100644
> >>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> >>>>>>>>>        volatile void *monitor_addr; /**< NULL if not currently
> >>>>>>>>> sleeping
> >>>>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >>>>>>>>>
> >>>>>>>>> +/**
> >>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will
> >>>>>>>>> enter C0.2
> >>>>>>>> state.
> >>>>>>>>> + * For more information about usage of these instructions, please
> >>>>>>>>> +refer to
> >>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> >>>>>>>>> + */
> >>>>>>>>> +static void intel_umonitor(volatile void *addr) {
> >>>>>>>>> +     /* UMONITOR */
> >>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> >>>>>>>>> +                     :
> >>>>>>>>> +                     : "D"(addr));
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
> >>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> >>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> >>>>>>>>> +     /* UMWAIT */
> >>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> >>>>>>>>> +                     : /* ignore rflags */
> >>>>>>>>> +                     : "D"(0), /* enter C0.2 */
> >>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> >>>>>>>>
> >>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
> >>>>>>>>
> >>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
> >>>>>>>> believe as per our other
> >>>>>>>> thread of discussion is there a benefit to also providing inline
> >>>>>>>> assembly over just
> >>>>>>>> using the intrinsics? I understand that the intrinsics may not
> >>>>>>>> exist for the monitorx
> >>>>>>>> and mwaitx below so it is probably necessary for amd.
> >>>>>>>>
> >>>>>>>> so the suggestion here is when they are available just use the
> >>>>>>>> intrinsics.
> >>>>>>>>
> >>>>>>>> thanks
> >>>>>>>>
> >>>>>>> The gcc built-in functions
> >>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
> >>>>>>> when -mmwaitx
> >>>>>>> is used specific for AMD platforms. On generic builds, these
> >>>>>>> built-ins are not available and hence inline
> >>>>>>> assembly is required here.
> >>>>>>
> >>>>>> Ok... but we can probably put them into a separate .c file that will
> >>>>>> be compiled with that specific flag?
> >>>>>> Same thing can be probably done for Intel specific instructions.
> >>>>>> In general, I think it is much more preferable to use built-ins vs
> >>>>>> inline assembly
> >>>>>> (if possible off-course).
> >>>>>>
> >>>>>
> >>>>> We don't compile different set of files for AMD and Intel, but there are
> >>>>> runtime checks, so putting into separate file is not much different.
> >>>
> >>> Well, we probably don't compile .c files for particular vendor, but we
> >>> definitely do compile some .c files for particular ISA extensions.
> >>> Let say there are files in lib/acl that requires various '-mavx512*'
> >>> flags, same for other libs and PMDs.
> >>> So still not clear to me why same approach can't be applied to
> >>> power_instrincts.c?
> >>>
> >>>>>
> >>>>> It may be an option to always enable compiler flag (-mmwaitx), I think
> >>>>> it won't hurt other platforms but I am not sure about implications of
> >>>>> this to other platforms (what was the motivation for the compiler guys
> >>>>> to enable these build-ins with specific flag?).
> >>>>>
> >>>>> Also this requires detecting compiler that supports 'mmwaitx' or not,
> >>>>> etc..
> >>>>>
> >>>> This is the biggest reason why we have in the past added support for
> >>>> these
> >>>> instructions via asm bytes rather than intrinsics. It takes a long
> >>>> time for
> >>>> end-user compilers, especially those in LTS releases, to get the
> >>>> necessary
> >>>> intrinsics. 
> >>>
> >>> Yep, understand.
> >>> But why then we can't have both implementations?
> >>> Let say if WAITPKG is defined we can use builtins for
> >>> umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
> >>> Same story for MWAITX/monitorx.
> >>>
> >>
> >> Yes this can be done,
> >> it can be done either as different .c files per implementation, or as
> >> #ifdef in same file.
> >>
> >> But eventually asm implementation is required, as fallback, and if we
> >> will rely on asm implementation anyway, does it worth to have the
> >> additional checks to be able to use built-in intrinsic?
> >>
> >> Does it helps to comment name of the built-in function to inline
> >> assembly code, to document intention and another possible implementation?
> > 
> > the main value of preferring intrinsics is that when they are available
> > they also work with msvc/windows. the msvc toolchain does not support
> > inline asm. so some of the targets have to use intrinsics because that's all
> > there is.
> > 
> 
> How windows handles current power APIs without inline asm support, like
> rte_power_intrinsics.c one?

so this is a windows vs toolchain entanglement.

> Also will using both built-in and inline assembly work for Windows,
> since there may be compiler versions that doesn't support built-in
> functions, they should disable APIs altogether, and this can create a
> scenario that list of exposed APIs changes based on compiler version.

so I don't intend to disable apis, there's usually a way to make them
work and there should not be any api changes when done correctly.

windows/clang/mingw
    * inline asm may be used, but for me isn't preferred

windows/msvc
    * intrinsics (when available)
    * non-inline asm in a .s (when no intrinsics available)
    * keeping in mind that the compiler version isn't tied to windows
      OS release so it's easier for me to document that you need a
      newer compiler arbitrarily. The periods where there are no intrinsics
      end up being short-lived.
   
I'm on the hook for windows/msvc; any stickiness dealing with it ends up
being my problem.

> 
> >>
> >>>> Consider a user running e.g. RHEL 8, who wants to take
> >>>> advantages of the latest DPDK features; they should not be required to
> >>>> upgrade their compiler - and possibly binutils/assembler - to do so.
> >>>>
> >>>> /Bruce
> >>>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-25 16:00                               ` Tyler Retzlaff
@ 2023-08-30 22:45                                 ` Konstantin Ananyev
  2023-09-27 10:38                                   ` Tummala, Sivaprasad
  0 siblings, 1 reply; 51+ messages in thread
From: Konstantin Ananyev @ 2023-08-30 22:45 UTC (permalink / raw)
  To: Tyler Retzlaff, Ferruh Yigit
  Cc: Bruce Richardson, Konstantin Ananyev, Tummala, Sivaprasad,
	david.hunt, anatoly.burakov, david.marchand, thomas, dev

25/08/2023 17:00, Tyler Retzlaff wrote:
> On Thu, Aug 24, 2023 at 10:04:42AM +0100, Ferruh Yigit wrote:
>> On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
>>> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
>>>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
>>>>> 18/08/2023 14:48, Bruce Richardson wrote:
>>>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
>>>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>>>>>>>>
>>>>>>>>>> Caution: This message originated from an External Source. Use
>>>>>>>>>> proper caution
>>>>>>>>>> when opening attachments, clicking links, or responding.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>>>>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>>>>>>>>> power/performance optimized state (C1 state) for a specific
>>>>>>>>>>> period or
>>>>>>>>>>> until a store to the monitored address range.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
>>>>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>    lib/eal/x86/rte_power_intrinsics.c | 77
>>>>>>>>>>> +++++++++++++++++++++++++-----
>>>>>>>>>>>    1 file changed, 66 insertions(+), 11 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>>>> index 6eb9e50807..b4754e17da 100644
>>>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>>>>>>>>         volatile void *monitor_addr; /**< NULL if not currently
>>>>>>>>>>> sleeping
>>>>>>>>>>> */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>>>>>>>>
>>>>>>>>>>> +/**
>>>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will
>>>>>>>>>>> enter C0.2
>>>>>>>>>> state.
>>>>>>>>>>> + * For more information about usage of these instructions, please
>>>>>>>>>>> +refer to
>>>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>>>>>>>>> + */
>>>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
>>>>>>>>>>> +     /* UMONITOR */
>>>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>>>>>>>>> +                     :
>>>>>>>>>>> +                     : "D"(addr));
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
>>>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>>>>>>>>> +     /* UMWAIT */
>>>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>>>>>>>>> +                     : /* ignore rflags */
>>>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
>>>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>>>>>>>>
>>>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>>>>>>>>
>>>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
>>>>>>>>>> believe as per our other
>>>>>>>>>> thread of discussion is there a benefit to also providing inline
>>>>>>>>>> assembly over just
>>>>>>>>>> using the intrinsics? I understand that the intrinsics may not
>>>>>>>>>> exist for the monitorx
>>>>>>>>>> and mwaitx below so it is probably necessary for amd.
>>>>>>>>>>
>>>>>>>>>> so the suggestion here is when they are available just use the
>>>>>>>>>> intrinsics.
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>>
>>>>>>>>> The gcc built-in functions
>>>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
>>>>>>>>> when -mmwaitx
>>>>>>>>> is used specific for AMD platforms. On generic builds, these
>>>>>>>>> built-ins are not available and hence inline
>>>>>>>>> assembly is required here.
>>>>>>>>
>>>>>>>> Ok... but we can probably put them into a separate .c file that will
>>>>>>>> be compiled with that specific flag?
>>>>>>>> Same thing can be probably done for Intel specific instructions.
>>>>>>>> In general, I think it is much more preferable to use built-ins vs
>>>>>>>> inline assembly
>>>>>>>> (if possible off-course).
>>>>>>>>
>>>>>>>
>>>>>>> We don't compile different set of files for AMD and Intel, but there are
>>>>>>> runtime checks, so putting into separate file is not much different.
>>>>>
>>>>> Well, we probably don't compile .c files for particular vendor, but we
>>>>> definitely do compile some .c files for particular ISA extensions.
>>>>> Let say there are files in lib/acl that requires various '-mavx512*'
>>>>> flags, same for other libs and PMDs.
>>>>> So still not clear to me why same approach can't be applied to
>>>>> power_instrincts.c?
>>>>>
>>>>>>>
>>>>>>> It may be an option to always enable compiler flag (-mmwaitx), I think
>>>>>>> it won't hurt other platforms but I am not sure about implications of
>>>>>>> this to other platforms (what was the motivation for the compiler guys
>>>>>>> to enable these build-ins with specific flag?).
>>>>>>>
>>>>>>> Also this requires detecting compiler that supports 'mmwaitx' or not,
>>>>>>> etc..
>>>>>>>
>>>>>> This is the biggest reason why we have in the past added support for
>>>>>> these
>>>>>> instructions via asm bytes rather than intrinsics. It takes a long
>>>>>> time for
>>>>>> end-user compilers, especially those in LTS releases, to get the
>>>>>> necessary
>>>>>> intrinsics.
>>>>>
>>>>> Yep, understand.
>>>>> But why then we can't have both implementations?
>>>>> Let say if WAITPKG is defined we can use builtins for
>>>>> umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
>>>>> Same story for MWAITX/monitorx.
>>>>>
>>>>
>>>> Yes this can be done,
>>>> it can be done either as different .c files per implementation, or as
>>>> #ifdef in same file.
>>>>
>>>> But eventually asm implementation is required, as fallback, and if we
>>>> will rely on asm implementation anyway, does it worth to have the
>>>> additional checks to be able to use built-in intrinsic?
>>>>
>>>> Does it helps to comment name of the built-in function to inline
>>>> assembly code, to document intention and another possible implementation?
>>>
>>> the main value of preferring intrinsics is that when they are available
>>> they also work with msvc/windows. the msvc toolchain does not support
>>> inline asm. so some of the targets have to use intrinsics because that's all
>>> there is.
>>>
>>
>> How windows handles current power APIs without inline asm support, like
>> rte_power_intrinsics.c one?
> 
> so this is a windows vs toolchain entanglement.
> 
>> Also will using both built-in and inline assembly work for Windows,
>> since there may be compiler versions that doesn't support built-in
>> functions, they should disable APIs altogether, and this can create a
>> scenario that list of exposed APIs changes based on compiler version.
> 
> so I don't intend to disable apis, theres usually a way to make them
> work and there should not be any api changes when done correctly.
> 
> windows/clang/mingw
>      * inline asm may be used, but for me isn't preferred
> 
> windows/msvc
>      * intrinsics (when available)
>      * non-inline asm in a .s (when no intrinsics available)
>      * keeping in mind that the compiler version isn't tied to windows
>        OS release so it's easier for me to document that you need a
>        newer compiler arbitrarily. The periods where there are no intrinsics
>        end up being short-lived.
>     
> I'm on the hook for windows/msvc any stickyness dealing with it ends up
> being my problem.

As I read rte_power_intrinsics.c, for each set of power instructions
we have a related static variable: wait*_supported.
So if we need to support a compiler that supports neither the
new builtins nor inline asm, then it is probably possible to rearrange the
code to keep these static vars equal to zero in such a case.


> 
>>
>>>>
>>>>>> Consider a user running e.g. RHEL 8, who wants to take
>>>>>> advantages of the latest DPDK features; they should not be required to
>>>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
>>>>>>
>>>>>> /Bruce
>>>>>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 3/3] power: amd power monitor support
  2023-08-30 22:45                                 ` Konstantin Ananyev
@ 2023-09-27 10:38                                   ` Tummala, Sivaprasad
  2023-09-28 10:11                                     ` Konstantin Ananyev
  0 siblings, 1 reply; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-09-27 10:38 UTC (permalink / raw)
  To: Konstantin Ananyev, Tyler Retzlaff, Yigit, Ferruh
  Cc: Bruce Richardson, Konstantin Ananyev, david.hunt,
	anatoly.burakov, david.marchand, thomas, dev


> -----Original Message-----
> From: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
> Sent: Thursday, August 31, 2023 4:15 AM
> To: Tyler Retzlaff <roretzla@linux.microsoft.com>; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>
> Cc: Bruce Richardson <bruce.richardson@intel.com>; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; Tummala, Sivaprasad
> <Sivaprasad.Tummala@amd.com>; david.hunt@intel.com;
> anatoly.burakov@intel.com; david.marchand@redhat.com; thomas@monjalon.net;
> dev@dpdk.org
> Subject: Re: [PATCH v5 3/3] power: amd power monitor support
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> 25/08/2023 17:00, Tyler Retzlaff wrote:
> > On Thu, Aug 24, 2023 at 10:04:42AM +0100, Ferruh Yigit wrote:
> >> On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> >>> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
> >>>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> >>>>> 18/08/2023 14:48, Bruce Richardson wrote:
> >>>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> >>>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> >>>>>>>>
> >>>>>>>>>> Caution: This message originated from an External Source. Use
> >>>>>>>>>> proper caution when opening attachments, clicking links, or
> >>>>>>>>>> responding.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala
> wrote:
> >>>>>>>>>>> mwaitx allows EPYC processors to enter a implementation
> >>>>>>>>>>> dependent power/performance optimized state (C1 state) for a
> >>>>>>>>>>> specific period or until a store to the monitored address
> >>>>>>>>>>> range.
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Sivaprasad Tummala
> >>>>>>>>>>> <sivaprasad.tummala@amd.com>
> >>>>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>    lib/eal/x86/rte_power_intrinsics.c | 77
> >>>>>>>>>>> +++++++++++++++++++++++++-----
> >>>>>>>>>>>    1 file changed, 66 insertions(+), 11 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>>>> index 6eb9e50807..b4754e17da 100644
> >>>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> >>>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> >>>>>>>>>>>         volatile void *monitor_addr; /**< NULL if not
> >>>>>>>>>>> currently sleeping */  } __rte_cache_aligned
> >>>>>>>>>>> wait_status[RTE_MAX_LCORE];
> >>>>>>>>>>>
> >>>>>>>>>>> +/**
> >>>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and
> >>>>>>>>>>> +will
> >>>>>>>>>>> enter C0.2
> >>>>>>>>>> state.
> >>>>>>>>>>> + * For more information about usage of these instructions,
> >>>>>>>>>>> +please refer to
> >>>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's
> Manual.
> >>>>>>>>>>> + */
> >>>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
> >>>>>>>>>>> +     /* UMONITOR */
> >>>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> >>>>>>>>>>> +                     :
> >>>>>>>>>>> +                     : "D"(addr)); }
> >>>>>>>>>>> +
> >>>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
> >>>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> >>>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> >>>>>>>>>>> +     /* UMWAIT */
> >>>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> >>>>>>>>>>> +                     : /* ignore rflags */
> >>>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
> >>>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> >>>>>>>>>>
> >>>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
> >>>>>>>>>>
> >>>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
> >>>>>>>>>> believe as per our other thread of discussion is there a
> >>>>>>>>>> benefit to also providing inline assembly over just using the
> >>>>>>>>>> intrinsics? I understand that the intrinsics may not exist
> >>>>>>>>>> for the monitorx and mwaitx below so it is probably necessary
> >>>>>>>>>> for amd.
> >>>>>>>>>>
> >>>>>>>>>> so the suggestion here is when they are available just use
> >>>>>>>>>> the intrinsics.
> >>>>>>>>>>
> >>>>>>>>>> thanks
> >>>>>>>>>>
> >>>>>>>>> The gcc built-in functions
> >>>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available
> >>>>>>>>> only when -mmwaitx is used specific for AMD platforms. On
> >>>>>>>>> generic builds, these built-ins are not available and hence
> >>>>>>>>> inline assembly is required here.
> >>>>>>>>
> >>>>>>>> Ok... but we can probably put them into a separate .c file that
> >>>>>>>> will be compiled with that specific flag?
> >>>>>>>> Same thing can be probably done for Intel specific instructions.
> >>>>>>>> In general, I think it is much more preferable to use built-ins
> >>>>>>>> vs inline assembly (if possible off-course).
> >>>>>>>>
> >>>>>>>
> >>>>>>> We don't compile different set of files for AMD and Intel, but
> >>>>>>> there are runtime checks, so putting into separate file is not much
> different.
> >>>>>
> >>>>> Well, we probably don't compile .c files for particular vendor,
> >>>>> but we definitely do compile some .c files for particular ISA extensions.
> >>>>> Let say there are files in lib/acl that requires various '-mavx512*'
> >>>>> flags, same for other libs and PMDs.
> >>>>> So still not clear to me why same approach can't be applied to
> >>>>> power_instrincts.c?
> >>>>>
> >>>>>>>
> >>>>>>> It may be an option to always enable compiler flag (-mmwaitx), I
> >>>>>>> think it won't hurt other platforms but I am not sure about
> >>>>>>> implications of this to other platforms (what was the motivation
> >>>>>>> for the compiler guys to enable these build-ins with specific flag?).
> >>>>>>>
> >>>>>>> Also this requires detecting compiler that supports 'mmwaitx' or
> >>>>>>> not, etc..
> >>>>>>>
> >>>>>> This is the biggest reason why we have in the past added support
> >>>>>> for these instructions via asm bytes rather than intrinsics. It
> >>>>>> takes a long time for end-user compilers, especially those in LTS
> >>>>>> releases, to get the necessary intrinsics.
> >>>>>
> >>>>> Yep, understand.
> >>>>> But why then we can't have both implementations?
> >>>>> Let say if WAITPKG is defined we can use builtins for
> >>>>> umonitor/umwait/tpause, otherwise we fallback to inline asm
> implementation.
> >>>>> Same story for MWAITX/monitorx.
> >>>>>
> >>>>
> >>>> Yes this can be done,
> >>>> it can be done either as different .c files per implementation, or
> >>>> as #ifdef in same file.
> >>>>
> >>>> But eventually asm implementation is required, as fallback, and if
> >>>> we will rely on asm implementation anyway, does it worth to have
> >>>> the additional checks to be able to use built-in intrinsic?
> >>>>
> >>>> Does it helps to comment name of the built-in function to inline
> >>>> assembly code, to document intention and another possible
> implementation?
> >>>
> >>> the main value of preferring intrinsics is that when they are
> >>> available they also work with msvc/windows. the msvc toolchain does
> >>> not support inline asm. so some of the targets have to use
> >>> intrinsics because that's all there is.
> >>>
> >>
> >> How windows handles current power APIs without inline asm support,
> >> like rte_power_intrinsics.c one?
> >
> > so this is a windows vs toolchain entanglement.
> >
> >> Also will using both built-in and inline assembly work for Windows,
> >> since there may be compiler versions that doesn't support built-in
> >> functions, they should disable APIs altogether, and this can create a
> >> scenario that list of exposed APIs changes based on compiler version.
> >
> > so I don't intend to disable apis, theres usually a way to make them
> > work and there should not be any api changes when done correctly.
> >
> > windows/clang/mingw
> >      * inline asm may be used, but for me isn't preferred
> >
> > windows/msvc
> >      * intrinsics (when available)
> >      * non-inline asm in a .s (when no intrinsics available)
> >      * keeping in mind that the compiler version isn't tied to windows
> >        OS release so it's easier for me to document that you need a
> >        newer compiler arbitrarily. The periods where there are no intrinsics
> >        end up being short-lived.
> >
> > I'm on the hook for windows/msvc any stickyness dealing with it ends
> > up being my problem.
>
> As I can read rte_power_instrintcts.c, for each set of power instructions we have
> related static variable: wait*_supported.
> So if we need to support compiler that supports neither new bultins nor inline-asm,
> then it probably possible to rearrange the code to keep these static vars equal zero
> for such case.
>
Konstantin,
I preferred to use inline asm for the new power instructions in this patch because:
1. builtins may not always be available, depending on compiler version
2. versioning checks add extra complexity
3. there is no additional performance benefit
As an improvement, I will add builtins support globally in a separate patch later.

>
> >
> >>
> >>>>
> >>>>>> Consider a user running e.g. RHEL 8, who wants to take advantages
> >>>>>> of the latest DPDK features; they should not be required to
> >>>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
> >>>>>>
> >>>>>> /Bruce
> >>>>>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 3/3] power: amd power monitor support
  2023-09-27 10:38                                   ` Tummala, Sivaprasad
@ 2023-09-28 10:11                                     ` Konstantin Ananyev
  0 siblings, 0 replies; 51+ messages in thread
From: Konstantin Ananyev @ 2023-09-28 10:11 UTC (permalink / raw)
  To: Tummala, Sivaprasad, Konstantin Ananyev, Tyler Retzlaff, Yigit, Ferruh
  Cc: Bruce Richardson, david.hunt, anatoly.burakov, david.marchand,
	thomas, dev


> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > 25/08/2023 17:00, Tyler Retzlaff wrote:
> > > On Thu, Aug 24, 2023 at 10:04:42AM +0100, Ferruh Yigit wrote:
> > >> On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> > >>> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
> > >>>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> > >>>>> 18/08/2023 14:48, Bruce Richardson wrote:
> > >>>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> > >>>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> > >>>>>>>>
> > >>>>>>>>>> Caution: This message originated from an External Source. Use
> > >>>>>>>>>> proper caution when opening attachments, clicking links, or
> > >>>>>>>>>> responding.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala
> > wrote:
> > >>>>>>>>>>> mwaitx allows EPYC processors to enter a implementation
> > >>>>>>>>>>> dependent power/performance optimized state (C1 state) for a
> > >>>>>>>>>>> specific period or until a store to the monitored address
> > >>>>>>>>>>> range.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Signed-off-by: Sivaprasad Tummala
> > >>>>>>>>>>> <sivaprasad.tummala@amd.com>
> > >>>>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > >>>>>>>>>>> ---
> > >>>>>>>>>>>    lib/eal/x86/rte_power_intrinsics.c | 77
> > >>>>>>>>>>> +++++++++++++++++++++++++-----
> > >>>>>>>>>>>    1 file changed, 66 insertions(+), 11 deletions(-)
> > >>>>>>>>>>>
> > >>>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> index 6eb9e50807..b4754e17da 100644
> > >>>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> > >>>>>>>>>>>         volatile void *monitor_addr; /**< NULL if not
> > >>>>>>>>>>> currently sleeping */  } __rte_cache_aligned
> > >>>>>>>>>>> wait_status[RTE_MAX_LCORE];
> > >>>>>>>>>>>
> > >>>>>>>>>>> +/**
> > >>>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and
> > >>>>>>>>>>> +will
> > >>>>>>>>>>> enter C0.2
> > >>>>>>>>>> state.
> > >>>>>>>>>>> + * For more information about usage of these instructions,
> > >>>>>>>>>>> +please refer to
> > >>>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's
> > Manual.
> > >>>>>>>>>>> + */
> > >>>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
> > >>>>>>>>>>> +     /* UMONITOR */
> > >>>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > >>>>>>>>>>> +                     :
> > >>>>>>>>>>> +                     : "D"(addr)); }
> > >>>>>>>>>>> +
> > >>>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
> > >>>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> > >>>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > >>>>>>>>>>> +     /* UMWAIT */
> > >>>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > >>>>>>>>>>> +                     : /* ignore rflags */
> > >>>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
> > >>>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> > >>>>>>>>>>
> > >>>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
> > >>>>>>>>>>
> > >>>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
> > >>>>>>>>>> believe as per our other thread of discussion is there a
> > >>>>>>>>>> benefit to also providing inline assembly over just using the
> > >>>>>>>>>> intrinsics? I understand that the intrinsics may not exist
> > >>>>>>>>>> for the monitorx and mwaitx below so it is probably necessary
> > >>>>>>>>>> for amd.
> > >>>>>>>>>>
> > >>>>>>>>>> so the suggestion here is when they are available just use
> > >>>>>>>>>> the intrinsics.
> > >>>>>>>>>>
> > >>>>>>>>>> thanks
> > >>>>>>>>>>
> > >>>>>>>>> The gcc built-in functions
> > >>>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available
> > >>>>>>>>> only when -mmwaitx is used specific for AMD platforms. On
> > >>>>>>>>> generic builds, these built-ins are not available and hence
> > >>>>>>>>> inline assembly is required here.
> > >>>>>>>>
> > >>>>>>>> Ok... but we can probably put them into a separate .c file that
> > >>>>>>>> will be compiled with that specific flag?
> > >>>>>>>> Same thing can be probably done for Intel specific instructions.
> > >>>>>>>> In general, I think it is much more preferable to use built-ins
> > >>>>>>>> vs inline assembly (if possible off-course).
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> We don't compile different set of files for AMD and Intel, but
> > >>>>>>> there are runtime checks, so putting into separate file is not much
> > different.
> > >>>>>
> > >>>>> Well, we probably don't compile .c files for particular vendor,
> > >>>>> but we definitely do compile some .c files for particular ISA extensions.
> > >>>>> Let say there are files in lib/acl that requires various '-mavx512*'
> > >>>>> flags, same for other libs and PMDs.
> > >>>>> So still not clear to me why same approach can't be applied to
> > >>>>> power_instrincts.c?
> > >>>>>
> > >>>>>>>
> > >>>>>>> It may be an option to always enable compiler flag (-mmwaitx), I
> > >>>>>>> think it won't hurt other platforms but I am not sure about
> > >>>>>>> implications of this to other platforms (what was the motivation
> > >>>>>>> for the compiler guys to enable these build-ins with specific flag?).
> > >>>>>>>
> > >>>>>>> Also this requires detecting compiler that supports 'mmwaitx' or
> > >>>>>>> not, etc..
> > >>>>>>>
> > >>>>>> This is the biggest reason why we have in the past added support
> > >>>>>> for these instructions via asm bytes rather than intrinsics. It
> > >>>>>> takes a long time for end-user compilers, especially those in LTS
> > >>>>>> releases, to get the necessary intrinsics.
> > >>>>>
> > >>>>> Yep, understand.
> > >>>>> But why then we can't have both implementations?
> > >>>>> Let say if WAITPKG is defined we can use builtins for
> > >>>>> umonitor/umwait/tpause, otherwise we fallback to inline asm
> > implementation.
> > >>>>> Same story for MWAITX/monitorx.
> > >>>>>
> > >>>>
> > >>>> Yes this can be done,
> > >>>> it can be done either as different .c files per implementation, or
> > >>>> as #ifdef in same file.
> > >>>>
> > >>>> But eventually asm implementation is required, as fallback, and if
> > >>>> we will rely on asm implementation anyway, does it worth to have
> > >>>> the additional checks to be able to use built-in intrinsic?
> > >>>>
> > >>>> Does it helps to comment name of the built-in function to inline
> > >>>> assembly code, to document intention and another possible
> > implementation?
> > >>>
> > >>> the main value of preferring intrinsics is that when they are
> > >>> available they also work with msvc/windows. the msvc toolchain does
> > >>> not support inline asm. so some of the targets have to use
> > >>> intrinsics because that's all there is.
> > >>>
> > >>
> > >> How windows handles current power APIs without inline asm support,
> > >> like rte_power_intrinsics.c one?
> > >
> > > so this is a windows vs toolchain entanglement.
> > >
> > >> Also will using both built-in and inline assembly work for Windows,
> > >> since there may be compiler versions that doesn't support built-in
> > >> functions, they should disable APIs altogether, and this can create a
> > >> scenario that list of exposed APIs changes based on compiler version.
> > >
> > > so I don't intend to disable apis, theres usually a way to make them
> > > work and there should not be any api changes when done correctly.
> > >
> > > windows/clang/mingw
> > >      * inline asm may be used, but for me isn't preferred
> > >
> > > windows/msvc
> > >      * intrinsics (when available)
> > >      * non-inline asm in a .s (when no intrinsics available)
> > >      * keeping in mind that the compiler version isn't tied to windows
> > >        OS release so it's easier for me to document that you need a
> > >        newer compiler arbitrarily. The periods where there are no intrinsics
> > >        end up being short-lived.
> > >
> > > I'm on the hook for windows/msvc any stickyness dealing with it ends
> > > up being my problem.
> >
> > As I can read rte_power_instrintcts.c, for each set of power instructions we have
> > related static variable: wait*_supported.
> > So if we need to support compiler that supports neither new bultins nor inline-asm,
> > then it probably possible to rearrange the code to keep these static vars equal zero
> > for such case.
> >
> Konstantin,
> I preferred to use the inline asm for the new power instructions with this patch due to:
> 1. builtins may not always available based on compiler versions
> 2. additional complexity with versioning checks

I don't think we'll need to muck about with version numbers etc.
AFAIK, each such feature has a corresponding define, i.e.:
if WAITPKG is defined we can use builtins for umonitor/umwait/tpause,
if MWAITX is defined we can use mwaitx/monitorx, etc.

> 3. no additional performance benefit
> As an improvement, will add builtins support globally in a separate patch later.

Works for me, as long as we don't forget to do it in the future.

 
> >
> > >
> > >>
> > >>>>
> > >>>>>> Consider a user running e.g. RHEL 8, who wants to take advantages
> > >>>>>> of the latest DPDK features; they should not be required to
> > >>>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
> > >>>>>>
> > >>>>>> /Bruce
> > >>>>>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 3/3] power: amd power monitor support
  2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
  2023-08-16 19:27             ` Tyler Retzlaff
@ 2023-10-06  8:26             ` David Marchand
  2023-10-09  8:02               ` Tummala, Sivaprasad
  2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2 siblings, 1 reply; 51+ messages in thread
From: David Marchand @ 2023-10-06  8:26 UTC (permalink / raw)
  To: Sivaprasad Tummala
  Cc: david.hunt, anatoly.burakov, ferruh.yigit, thomas, dev, Tyler Retzlaff

On Wed, Aug 16, 2023 at 9:00 PM Sivaprasad Tummala
<sivaprasad.tummala@amd.com> wrote:
>
> mwaitx allows EPYC processors to enter a implementation dependent
> power/performance optimized state (C1 state) for a specific period
> or until a store to the monitored address range.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

This series won't apply cleanly (following the MSVC-related series merge)
and it breaks x86 32-bit compilation.
http://mails.dpdk.org/archives/test-report/2023-August/442809.html


-- 
David Marchand



* RE: [PATCH v5 3/3] power: amd power monitor support
  2023-10-06  8:26             ` David Marchand
@ 2023-10-09  8:02               ` Tummala, Sivaprasad
  0 siblings, 0 replies; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-10-09  8:02 UTC (permalink / raw)
  To: David Marchand
  Cc: david.hunt, anatoly.burakov, Yigit, Ferruh, thomas, dev, Tyler Retzlaff

[AMD Official Use Only - General]

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, October 6, 2023 1:57 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; anatoly.burakov@intel.com; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; thomas@monjalon.net; dev@dpdk.org; Tyler Retzlaff
> <roretzla@linux.microsoft.com>
> Subject: Re: [PATCH v5 3/3] power: amd power monitor support
>
> On Wed, Aug 16, 2023 at 9:00 PM Sivaprasad Tummala
> <sivaprasad.tummala@amd.com> wrote:
> >
> > mwaitx allows EPYC processors to enter a implementation dependent
> > power/performance optimized state (C1 state) for a specific period or
> > until a store to the monitored address range.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> This series won't apply cleanly (following the MSVC-related series merge) and it breaks
> x86 32-bit compilation.
> http://mails.dpdk.org/archives/test-report/2023-August/442809.html
Hi David,

I will send a new version of the patch rebased on the new mainline, along with the fix for the 32-bit compilation issue.

>
>
> --
> David Marchand



* [PATCH v6 1/3] eal: add x86 cpuid support for monitorx
  2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
  2023-08-16 19:27             ` Tyler Retzlaff
  2023-10-06  8:26             ` David Marchand
@ 2023-10-09 14:05             ` Sivaprasad Tummala
  2023-10-09 14:05               ` [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
                                 ` (2 more replies)
  2 siblings, 3 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-10-09 14:05 UTC (permalink / raw)
  To: david.marchand
  Cc: david.hunt, konstantin.v.ananyev, roretzla, anatoly.burakov,
	thomas, ferruh.yigit, dev

Add a new CPUID flag to indicate support for monitorx instruction
on AMD EPYC processors.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/x86/include/rte_cpuflags.h | 1 +
 lib/eal/x86/rte_cpuflags.c         | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/lib/eal/x86/include/rte_cpuflags.h b/lib/eal/x86/include/rte_cpuflags.h
index d95bf33a16..1ee00e70fe 100644
--- a/lib/eal/x86/include/rte_cpuflags.h
+++ b/lib/eal/x86/include/rte_cpuflags.h
@@ -133,6 +133,7 @@ enum rte_cpu_flag_t {
 	RTE_CPUFLAG_AVX512VP2INTERSECT,     /**< AVX512 Two Register Intersection */
 
 	RTE_CPUFLAG_WAITPKG,                /**< UMONITOR/UMWAIT/TPAUSE */
+	RTE_CPUFLAG_MONITORX,               /**< MONITORX */
 };
 
 #include "generic/rte_cpuflags.h"
diff --git a/lib/eal/x86/rte_cpuflags.c b/lib/eal/x86/rte_cpuflags.c
index 3fb1cb9bab..26163ab746 100644
--- a/lib/eal/x86/rte_cpuflags.c
+++ b/lib/eal/x86/rte_cpuflags.c
@@ -133,6 +133,7 @@ const struct feature_entry rte_cpu_feature_table[] = {
 
 	FEAT_DEF(LAHF_SAHF, 0x80000001, 0, RTE_REG_ECX,  0)
 	FEAT_DEF(LZCNT, 0x80000001, 0, RTE_REG_ECX,  4)
+	FEAT_DEF(MONITORX, 0x80000001, 0, RTE_REG_ECX,  29)
 
 	FEAT_DEF(SYSCALL, 0x80000001, 0, RTE_REG_EDX, 11)
 	FEAT_DEF(XD, 0x80000001, 0, RTE_REG_EDX, 20)
@@ -195,5 +196,7 @@ rte_cpu_get_intrinsics_support(struct rte_cpu_intrinsics *intrinsics)
 		intrinsics->power_pause = 1;
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RTM))
 			intrinsics->power_monitor_multi = 1;
+	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) {
+		intrinsics->power_monitor = 1;
 	}
 }
-- 
2.34.1



* [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs
  2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
@ 2023-10-09 14:05               ` Sivaprasad Tummala
  2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
  2023-10-09 16:23               ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Patrick Robb
  2 siblings, 0 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-10-09 14:05 UTC (permalink / raw)
  To: david.marchand
  Cc: david.hunt, konstantin.v.ananyev, roretzla, anatoly.burakov,
	thomas, ferruh.yigit, dev

current x86 power monitor implementation fails on platforms
with only monitor supported and not power_pause.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 1467a32cb3..664cde01e9 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -31,6 +31,7 @@ __umwait_wakeup(volatile void *addr)
 
 static bool wait_supported;
 static bool wait_multi_supported;
+static bool monitor_supported;
 
 static inline uint64_t
 __get_umwait_val(const volatile void *p, const uint8_t sz)
@@ -82,7 +83,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent non-EAL thread from using this API */
@@ -183,6 +184,8 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_supported = 1;
 	if (i.power_monitor_multi)
 		wait_multi_supported = 1;
+	if (i.power_monitor)
+		monitor_supported = 1;
 }
 
 int
@@ -191,7 +194,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
 	struct power_wait_status *s;
 
 	/* prevent user from running this instruction if it's not supported */
-	if (!wait_supported)
+	if (!monitor_supported)
 		return -ENOTSUP;
 
 	/* prevent buffer overrun */
-- 
2.34.1



* [PATCH v6 3/3] power: amd power monitor support
  2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2023-10-09 14:05               ` [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
@ 2023-10-09 14:05               ` Sivaprasad Tummala
  2023-10-10  8:59                 ` David Marchand
  2023-10-10 10:14                 ` Konstantin Ananyev
  2023-10-09 16:23               ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Patrick Robb
  2 siblings, 2 replies; 51+ messages in thread
From: Sivaprasad Tummala @ 2023-10-09 14:05 UTC (permalink / raw)
  To: david.marchand
  Cc: david.hunt, konstantin.v.ananyev, roretzla, anatoly.burakov,
	thomas, ferruh.yigit, dev

mwaitx allows EPYC processors to enter a implementation dependent
power/performance optimized state (C1 state) for a specific period
or until a store to the monitored address range.

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/eal/x86/rte_power_intrinsics.c | 108 ++++++++++++++++++++++-------
 1 file changed, 84 insertions(+), 24 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 664cde01e9..0d2953f570 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -17,6 +17,78 @@ static struct power_wait_status {
 	volatile void *monitor_addr; /**< NULL if not currently sleeping */
 } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
 
+/**
+ * This functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
+ * For more information about usage of these instructions, please refer to
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
+ */
+static void intel_umonitor(volatile void *addr)
+{
+#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
+	/* cast away "volatile" when using the intrinsic */
+	_umonitor((void *)(uintptr_t)addr);
+#else
+	/*
+	 * we're using raw byte codes for compiler versions which
+	 * don't support this instruction natively.
+	 */
+	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
+			:
+			: "D"(addr));
+#endif
+}
+
+static void intel_umwait(const uint64_t timeout)
+{
+	const uint32_t tsc_l = (uint32_t)timeout;
+	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
+#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
+	_umwait(tsc_l, tsc_h);
+#else
+	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
+			: /* ignore rflags */
+			: "D"(0), /* enter C0.2 */
+			  "a"(tsc_l), "d"(tsc_h));
+#endif
+}
+
+/**
+ * This functions uses MONITORX/MWAITX instructions and will enter C1 state.
+ * For more information about usage of these instructions, please refer to
+ * AMD64 Architecture Programmer’s Manual.
+ */
+static void amd_monitorx(volatile void *addr)
+{
+#if defined(__MWAITX__)
+	/* cast away "volatile" when using the intrinsic */
+	_mm_monitorx((void *)(uintptr_t)addr, 0, 0);
+#else
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+			:
+			: "a"(addr),
+			"c"(0),  /* no extensions */
+			"d"(0)); /* no hints */
+#endif
+}
+
+static void amd_mwaitx(const uint64_t timeout)
+{
+	RTE_SET_USED(timeout);
+#if defined(__MWAITX__)
+	_mm_mwaitx(0, 0, 0);
+#else
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+			: /* ignore rflags */
+			: "a"(0), /* enter C1 */
+			"c"(0)); /* no time-out */
+#endif
+}
+
+static struct {
+	void (*mmonitor)(volatile void *addr);
+	void (*mwait)(const uint64_t timeout);
+} __rte_cache_aligned power_monitor_ops;
+
 static inline void
 __umwait_wakeup(volatile void *addr)
 {
@@ -76,8 +148,6 @@ int
 rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 		const uint64_t tsc_timestamp)
 {
-	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
-	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
 	uint64_t cur_value;
@@ -105,19 +175,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	rte_spinlock_lock(&s->lock);
 	s->monitor_addr = pmc->addr;
 
-	/* set address for UMONITOR */
-#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
-	/* cast away "volatile" when using the intrinsic */
-	_umonitor((void *)(uintptr_t)pmc->addr);
-#else
-	/*
-	 * we're using raw byte codes for compiler versions which
-	 * don't support this instruction natively.
-	 */
-	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
-			:
-			: "D"(pmc->addr));
-#endif
+	/* set address for memory monitor */
+	power_monitor_ops.mmonitor(pmc->addr);
 
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
@@ -128,15 +187,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (pmc->fn(cur_value, pmc->opaque) != 0)
 		goto end;
 
-	/* execute UMWAIT */
-#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
-	_umwait(tsc_l, tsc_h);
-#else
-	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-			: /* ignore rflags */
-			: "D"(0), /* enter C0.2 */
-			  "a"(tsc_l), "d"(tsc_h));
-#endif
+	/* execute mwait */
+	power_monitor_ops.mwait(tsc_timestamp);
 
 end:
 	/* erase sleep address */
@@ -186,6 +238,14 @@ RTE_INIT(rte_power_intrinsics_init) {
 		wait_multi_supported = 1;
 	if (i.power_monitor)
 		monitor_supported = 1;
+
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
+		power_monitor_ops.mmonitor = &amd_monitorx;
+		power_monitor_ops.mwait = &amd_mwaitx;
+	} else { /* Intel */
+		power_monitor_ops.mmonitor = &intel_umonitor;
+		power_monitor_ops.mwait = &intel_umwait;
+	}
 }
 
 int
-- 
2.34.1



* Re: [PATCH v6 1/3] eal: add x86 cpuid support for monitorx
  2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
  2023-10-09 14:05               ` [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
  2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
@ 2023-10-09 16:23               ` Patrick Robb
  2023-10-10  8:21                 ` David Marchand
  2 siblings, 1 reply; 51+ messages in thread
From: Patrick Robb @ 2023-10-09 16:23 UTC (permalink / raw)
  To: Sivaprasad Tummala; +Cc: dev

Recheck-request: iol-unit-amd64-testing

Failed for service_autotest on Windows. I'm doing a retest to see if it's
reproducible.



* Re: [PATCH v6 1/3] eal: add x86 cpuid support for monitorx
  2023-10-09 16:23               ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Patrick Robb
@ 2023-10-10  8:21                 ` David Marchand
  0 siblings, 0 replies; 51+ messages in thread
From: David Marchand @ 2023-10-10  8:21 UTC (permalink / raw)
  To: Patrick Robb; +Cc: Sivaprasad Tummala, dev

On Mon, Oct 9, 2023 at 6:24 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
> Recheck-request: iol-unit-amd64-testing
>
> Failed for service_autotest on windows. I'm doing a retest to see if it's reproducible.

Thanks for checking.
This unit test has never been entirely reliable... I see the rerun was fine.
I'll go ahead with this series.


-- 
David Marchand



* Re: [PATCH v6 3/3] power: amd power monitor support
  2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
@ 2023-10-10  8:59                 ` David Marchand
  2023-10-11  9:33                   ` Tummala, Sivaprasad
  2023-10-10 10:14                 ` Konstantin Ananyev
  1 sibling, 1 reply; 51+ messages in thread
From: David Marchand @ 2023-10-10  8:59 UTC (permalink / raw)
  To: Sivaprasad Tummala
  Cc: david.hunt, konstantin.v.ananyev, roretzla, anatoly.burakov,
	thomas, ferruh.yigit, dev

Hello Siva,

On Mon, Oct 9, 2023 at 4:06 PM Sivaprasad Tummala
<sivaprasad.tummala@amd.com> wrote:
>
> mwaitx allows EPYC processors to enter a implementation dependent
> power/performance optimized state (C1 state) for a specific period
> or until a store to the monitored address range.
>
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---

Please put some changelog to make life easier for reviewers (and me).

I diffed with the previous version to check what had been changed and
I see this:

 static void amd_mwaitx(const uint64_t timeout)
...
-                       "c"(2), /* enable timer */
-                       "b"(timeout));
+                       "c"(0)); /* no time-out */

I will take this series as is, but please confirm why this change was needed.


>  lib/eal/x86/rte_power_intrinsics.c | 108 ++++++++++++++++++++++-------
>  1 file changed, 84 insertions(+), 24 deletions(-)
>
> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> index 664cde01e9..0d2953f570 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -17,6 +17,78 @@ static struct power_wait_status {
>         volatile void *monitor_addr; /**< NULL if not currently sleeping */
>  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>
> +/**
> + * This functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.

Fixed while applying, function*

> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static void intel_umonitor(volatile void *addr)
> +{
> +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> +       /* cast away "volatile" when using the intrinsic */
> +       _umonitor((void *)(uintptr_t)addr);
> +#else
> +       /*
> +        * we're using raw byte codes for compiler versions which
> +        * don't support this instruction natively.
> +        */
> +       asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +                       :
> +                       : "D"(addr));
> +#endif
> +}
> +
> +static void intel_umwait(const uint64_t timeout)
> +{
> +       const uint32_t tsc_l = (uint32_t)timeout;
> +       const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> +       _umwait(tsc_l, tsc_h);
> +#else
> +       asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +                       : /* ignore rflags */
> +                       : "D"(0), /* enter C0.2 */
> +                         "a"(tsc_l), "d"(tsc_h));
> +#endif
> +}
> +
> +/**
> + * This functions uses MONITORX/MWAITX instructions and will enter C1 state.
> + * For more information about usage of these instructions, please refer to
> + * AMD64 Architecture Programmer’s Manual.
> + */
> +static void amd_monitorx(volatile void *addr)
> +{
> +#if defined(__MWAITX__)

This part probably breaks build with MSVC.

I could not determine whether MSVC supports this intrinsic.
I'll rely on Tyler to fix it later.


Series applied.

-- 
David Marchand



* Re: [PATCH v6 3/3] power: amd power monitor support
  2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
  2023-10-10  8:59                 ` David Marchand
@ 2023-10-10 10:14                 ` Konstantin Ananyev
  1 sibling, 0 replies; 51+ messages in thread
From: Konstantin Ananyev @ 2023-10-10 10:14 UTC (permalink / raw)
  To: Sivaprasad Tummala, david.marchand
  Cc: david.hunt, roretzla, anatoly.burakov, thomas, ferruh.yigit, dev

09.10.2023 15:05, Sivaprasad Tummala wrote:
> mwaitx allows EPYC processors to enter a implementation dependent
> power/performance optimized state (C1 state) for a specific period
> or until a store to the monitored address range.
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   lib/eal/x86/rte_power_intrinsics.c | 108 ++++++++++++++++++++++-------
>   1 file changed, 84 insertions(+), 24 deletions(-)
> 
> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
> index 664cde01e9..0d2953f570 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -17,6 +17,78 @@ static struct power_wait_status {
>   	volatile void *monitor_addr; /**< NULL if not currently sleeping */
>   } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>   
> +/**
> + * This functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + */
> +static void intel_umonitor(volatile void *addr)
> +{
> +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> +	/* cast away "volatile" when using the intrinsic */
> +	_umonitor((void *)(uintptr_t)addr);
> +#else
> +	/*
> +	 * we're using raw byte codes for compiler versions which
> +	 * don't support this instruction natively.
> +	 */
> +	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +			:
> +			: "D"(addr));
> +#endif
> +}
> +
> +static void intel_umwait(const uint64_t timeout)
> +{
> +	const uint32_t tsc_l = (uint32_t)timeout;
> +	const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> +	_umwait(tsc_l, tsc_h);
> +#else
> +	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +			: /* ignore rflags */
> +			: "D"(0), /* enter C0.2 */
> +			  "a"(tsc_l), "d"(tsc_h));
> +#endif
> +}
> +
> +/**
> + * This functions uses MONITORX/MWAITX instructions and will enter C1 state.
> + * For more information about usage of these instructions, please refer to
> + * AMD64 Architecture Programmer’s Manual.
> + */
> +static void amd_monitorx(volatile void *addr)
> +{
> +#if defined(__MWAITX__)
> +	/* cast away "volatile" when using the intrinsic */
> +	_mm_monitorx((void *)(uintptr_t)addr, 0, 0);
> +#else
> +	asm volatile(".byte 0x0f, 0x01, 0xfa;"
> +			:
> +			: "a"(addr),
> +			"c"(0),  /* no extensions */
> +			"d"(0)); /* no hints */
> +#endif
> +}
> +
> +static void amd_mwaitx(const uint64_t timeout)
> +{
> +	RTE_SET_USED(timeout);
> +#if defined(__MWAITX__)
> +	_mm_mwaitx(0, 0, 0);
> +#else
> +	asm volatile(".byte 0x0f, 0x01, 0xfb;"
> +			: /* ignore rflags */
> +			: "a"(0), /* enter C1 */
> +			"c"(0)); /* no time-out */
> +#endif
> +}
> +
> +static struct {
> +	void (*mmonitor)(volatile void *addr);
> +	void (*mwait)(const uint64_t timeout);
> +} __rte_cache_aligned power_monitor_ops;
> +
>   static inline void
>   __umwait_wakeup(volatile void *addr)
>   {
> @@ -76,8 +148,6 @@ int
>   rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>   		const uint64_t tsc_timestamp)
>   {
> -	const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> -	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
>   	const unsigned int lcore_id = rte_lcore_id();
>   	struct power_wait_status *s;
>   	uint64_t cur_value;
> @@ -105,19 +175,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>   	rte_spinlock_lock(&s->lock);
>   	s->monitor_addr = pmc->addr;
>   
> -	/* set address for UMONITOR */
> -#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> -	/* cast away "volatile" when using the intrinsic */
> -	_umonitor((void *)(uintptr_t)pmc->addr);
> -#else
> -	/*
> -	 * we're using raw byte codes for compiler versions which
> -	 * don't support this instruction natively.
> -	 */
> -	asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> -			:
> -			: "D"(pmc->addr));
> -#endif
> +	/* set address for memory monitor */
> +	power_monitor_ops.mmonitor(pmc->addr);
>   
>   	/* now that we've put this address into monitor, we can unlock */
>   	rte_spinlock_unlock(&s->lock);
> @@ -128,15 +187,8 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>   	if (pmc->fn(cur_value, pmc->opaque) != 0)
>   		goto end;
>   
> -	/* execute UMWAIT */
> -#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> -	_umwait(tsc_l, tsc_h);
> -#else
> -	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> -			: /* ignore rflags */
> -			: "D"(0), /* enter C0.2 */
> -			  "a"(tsc_l), "d"(tsc_h));
> -#endif
> +	/* execute mwait */
> +	power_monitor_ops.mwait(tsc_timestamp);
>   
>   end:
>   	/* erase sleep address */
> @@ -186,6 +238,14 @@ RTE_INIT(rte_power_intrinsics_init) {
>   		wait_multi_supported = 1;
>   	if (i.power_monitor)
>   		monitor_supported = 1;
> +
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
> +		power_monitor_ops.mmonitor = &amd_monitorx;
> +		power_monitor_ops.mwait = &amd_mwaitx;
> +	} else { /* Intel */
> +		power_monitor_ops.mmonitor = &intel_umonitor;
> +		power_monitor_ops.mwait = &intel_umwait;
> +	}
>   }
>   
>   int

Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v6 3/3] power: amd power monitor support
  2023-10-10  8:59                 ` David Marchand
@ 2023-10-11  9:33                   ` Tummala, Sivaprasad
  0 siblings, 0 replies; 51+ messages in thread
From: Tummala, Sivaprasad @ 2023-10-11  9:33 UTC (permalink / raw)
  To: David Marchand
  Cc: david.hunt, konstantin.v.ananyev, roretzla, anatoly.burakov,
	thomas, Yigit, Ferruh, dev

Hi David,

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, October 10, 2023 2:30 PM
> To: Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: david.hunt@intel.com; konstantin.v.ananyev@yandex.ru;
> roretzla@linux.microsoft.com; anatoly.burakov@intel.com; thomas@monjalon.net;
> Yigit, Ferruh <Ferruh.Yigit@amd.com>; dev@dpdk.org
> Subject: Re: [PATCH v6 3/3] power: amd power monitor support
>
> Hello Siva,
>
> On Mon, Oct 9, 2023 at 4:06 PM Sivaprasad Tummala
> <sivaprasad.tummala@amd.com> wrote:
> >
> > mwaitx allows EPYC processors to enter an implementation-dependent
> > power/performance optimized state (C1 state) for a specific period or
> > until a store to the monitored address range.
> >
> > Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
>
> Please put some changelog to make life easier for reviewers (and me).
>
> I diffed with the previous version to check what had been changed and I see this:
>
>  static void amd_mwaitx(const uint64_t timeout) ...
> -                       "c"(2), /* enable timer */
> -                       "b"(timeout));
> +                       "c"(0)); /* no time-out */
> I will take this series as is, but please confirm why this change was needed.
>
Thanks for applying the patch set. The change was needed to fix a compilation issue with 32-bit DPDK builds.

>
> >  lib/eal/x86/rte_power_intrinsics.c | 108
> > ++++++++++++++++++++++-------
> >  1 file changed, 84 insertions(+), 24 deletions(-)
> >
> > diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > b/lib/eal/x86/rte_power_intrinsics.c
> > index 664cde01e9..0d2953f570 100644
> > --- a/lib/eal/x86/rte_power_intrinsics.c
> > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > @@ -17,6 +17,78 @@ static struct power_wait_status {
> >         volatile void *monitor_addr; /**< NULL if not currently
> > sleeping */  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
> >
> > +/**
> > + * This functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
>
> Fixed while applying, function*
>
> > + * For more information about usage of these instructions, please refer to
> > + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> > + */
> > +static void intel_umonitor(volatile void *addr)
> > +{
> > +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> > +       /* cast away "volatile" when using the intrinsic */
> > +       _umonitor((void *)(uintptr_t)addr);
> > +#else
> > +       /*
> > +        * we're using raw byte codes for compiler versions which
> > +        * don't support this instruction natively.
> > +        */
> > +       asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > +                       :
> > +                       : "D"(addr));
> > +#endif
> > +}
> > +
> > +static void intel_umwait(const uint64_t timeout)
> > +{
> > +       const uint32_t tsc_l = (uint32_t)timeout;
> > +       const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > +#if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> > +       _umwait(tsc_l, tsc_h);
> > +#else
> > +       asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > +                       : /* ignore rflags */
> > +                       : "D"(0), /* enter C0.2 */
> > +                         "a"(tsc_l), "d"(tsc_h));
> > +#endif
> > +}
> > +
> > +/**
> > + * This functions uses MONITORX/MWAITX instructions and will enter C1 state.
> > + * For more information about usage of these instructions, please refer to
> > + * AMD64 Architecture Programmer’s Manual.
> > + */
> > +static void amd_monitorx(volatile void *addr)
> > +{
> > +#if defined(__MWAITX__)
>
> This part probably breaks build with MSVC.
>
> I could not determine whether MSVC supports this intrinsic.
> I'll rely on Tyler to fix it later.
>
>
> Series applied.
>
> --
> David Marchand


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2023-10-11  9:35 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
2023-04-17  4:31   ` [PATCH v3 1/4] " Sivaprasad Tummala
2023-04-18  8:14     ` [PATCH v4 0/4] power: monitor support for AMD EPYC processors Sivaprasad Tummala
2023-04-18  8:25     ` Sivaprasad Tummala
2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
2023-04-18  8:52         ` Ferruh Yigit
2023-04-18  9:22           ` Bruce Richardson
2023-06-01  9:23             ` David Marchand
2023-07-05 11:32         ` Konstantin Ananyev
2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-08-16 18:59           ` [PATCH v5 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
2023-08-16 19:27             ` Tyler Retzlaff
2023-08-17 11:34               ` Tummala, Sivaprasad
2023-08-17 14:18                 ` Konstantin Ananyev
2023-08-18 13:25                   ` Ferruh Yigit
2023-08-18 13:48                     ` Bruce Richardson
2023-08-21 15:42                       ` Tyler Retzlaff
2023-08-22 22:30                       ` Konstantin Ananyev
2023-08-23  9:19                         ` Ferruh Yigit
2023-08-23 16:03                           ` Tyler Retzlaff
2023-08-24  9:04                             ` Ferruh Yigit
2023-08-25 16:00                               ` Tyler Retzlaff
2023-08-30 22:45                                 ` Konstantin Ananyev
2023-09-27 10:38                                   ` Tummala, Sivaprasad
2023-09-28 10:11                                     ` Konstantin Ananyev
2023-10-06  8:26             ` David Marchand
2023-10-09  8:02               ` Tummala, Sivaprasad
2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-10-09 14:05               ` [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
2023-10-10  8:59                 ` David Marchand
2023-10-11  9:33                   ` Tummala, Sivaprasad
2023-10-10 10:14                 ` Konstantin Ananyev
2023-10-09 16:23               ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Patrick Robb
2023-10-10  8:21                 ` David Marchand
2023-04-18  8:25       ` [PATCH v4 2/4] " Sivaprasad Tummala
2023-06-14 13:15         ` Burakov, Anatoly
2023-04-18  8:25       ` [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-06-14 13:14         ` Burakov, Anatoly
2023-04-18  8:25       ` [PATCH v4 4/4] power: amd power monitor support Sivaprasad Tummala
2023-06-14 13:14         ` Burakov, Anatoly
2023-04-13 11:53 ` [PATCH v2 3/3] " Sivaprasad Tummala
2023-04-17  4:34   ` [PATCH v3 4/4] " Sivaprasad Tummala
2023-04-13 11:59 ` [PATCH v2 1/3] eal: add x86 cpuid support for monitorx David Marchand
2023-04-13 17:50   ` Tummala, Sivaprasad
2023-04-14  7:05     ` David Marchand
2023-04-14  8:51       ` Tummala, Sivaprasad
2023-04-14 11:48       ` Ferruh Yigit
2023-04-17  4:32 ` [PATCH v3 2/4] " Sivaprasad Tummala

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).