From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
MIME-Version: 1.0
References: <1601647919-25312-1-git-send-email-liang.j.ma@intel.com>
 <532f45c5d79b4c30a919553d322bb66e91534466.1602258833.git.anatoly.burakov@intel.com>
 <d1f71095230bc5a9646de28b3177b622a1d88d24.1602258833.git.anatoly.burakov@intel.com>
In-Reply-To: <d1f71095230bc5a9646de28b3177b622a1d88d24.1602258833.git.anatoly.burakov@intel.com>
From: Jerin Jacob <jerinjacobk@gmail.com>
Date: Fri, 9 Oct 2020 21:39:08 +0530
Message-ID: <CALBAE1PaSScmM=cz44Qq=pE8mF1qYqm1jtqMOFeem5nd5_YuLA@mail.gmail.com>
To: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: dpdk-dev <dev@dpdk.org>, Liang Ma <liang.j.ma@intel.com>, 
 Jan Viktorin <viktorin@rehivetech.com>, Ruifeng Wang <ruifeng.wang@arm.com>, 
 David Christensen <drc@linux.vnet.ibm.com>,
 Bruce Richardson <bruce.richardson@intel.com>, 
 Konstantin Ananyev <konstantin.ananyev@intel.com>,
 David Hunt <david.hunt@intel.com>, 
 Thomas Monjalon <thomas@monjalon.net>, "McDaniel,
 Timothy" <timothy.mcdaniel@intel.com>, 
 Gage Eads <gage.eads@intel.com>, chris.macnamara@intel.com
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [dpdk-dev] [PATCH v5 02/10] eal: add power management intrinsics
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On Fri, Oct 9, 2020 at 9:32 PM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
>
> From: Liang Ma <liang.j.ma@intel.com>
>
> Add two new power management intrinsics, and provide an implementation
> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
> are implemented as raw byte opcodes because there is not yet widespread
> compiler support for these instructions.
>
> The power management instructions provide an architecture-specific
> function to either wait until a specified TSC timestamp is reached, or
> optionally wait until either a TSC timestamp is reached or a memory
> location is written to. The monitor function also provides an optional
> comparison, to avoid sleeping when the expected write has already
> happened, and no more writes are expected.
>
> For more details, please refer to Intel(R) 64 and IA-32 Architectures
> Software Developer's Manual, Volume 2.
>
> Signed-off-by: Liang Ma <liang.j.ma@intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>     v5:
>     - Removed return values
>     - Simplified intrinsics and hardcoded C0.2 state
>     - Added other arch stubs
>
>  lib/librte_eal/arm/include/meson.build        |   1 +
>  .../arm/include/rte_power_intrinsics.h        |  62 ++++++++++
>  .../include/generic/rte_power_intrinsics.h    |  61 ++++++++++
>  lib/librte_eal/include/meson.build            |   1 +
>  lib/librte_eal/ppc/include/meson.build        |   1 +
>  .../ppc/include/rte_power_intrinsics.h        |  62 ++++++++++
>  lib/librte_eal/x86/include/meson.build        |   1 +
>  .../x86/include/rte_power_intrinsics.h        | 106 ++++++++++++++++++
>  8 files changed, 295 insertions(+)
>  create mode 100644 lib/librte_eal/arm/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/include/generic/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/ppc/include/rte_power_intrinsics.h
>  create mode 100644 lib/librte_eal/x86/include/rte_power_intrinsics.h
>
> diff --git a/lib/librte_eal/arm/include/meson.build b/lib/librte_eal/arm/include/meson.build
> index 73b750a18f..c6a9f70d73 100644
> --- a/lib/librte_eal/arm/include/meson.build
> +++ b/lib/librte_eal/arm/include/meson.build
> @@ -20,6 +20,7 @@ arch_headers = files(
>         'rte_pause_32.h',
>         'rte_pause_64.h',
>         'rte_pause.h',
> +       'rte_power_intrinsics.h',
>         'rte_prefetch_32.h',
>         'rte_prefetch_64.h',
>         'rte_prefetch.h',
> diff --git a/lib/librte_eal/arm/include/rte_power_intrinsics.h b/lib/librte_eal/arm/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..4aad44a0b9
> --- /dev/null
> +++ b/lib/librte_eal/arm/include/rte_power_intrinsics.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_ARM_H_
> +#define _RTE_POWER_INTRINSIC_ARM_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_atomic.h>
> +#include <rte_common.h>
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +/**
> + * This function is not supported on ARM.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 0 on success
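
Remove the @return block here: rte_power_monitor() is a void function.
The same applies to the @return blocks on the other rte_power_monitor()
and rte_power_pause() declarations in this patch, which are also void.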
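
As a rough sketch of what I mean (not a definitive version), the ARM
stubs could simply end their comments at the last @param. The local
RTE_SET_USED definition below is only a stand-in so the snippet builds
without DPDK headers; in the tree it comes from rte_common.h.

```c
#include <stdint.h>

/* Stand-in for the rte_common.h macro, for illustration only. */
#ifndef RTE_SET_USED
#define RTE_SET_USED(x) (void)(x)
#endif

/**
 * This function is not supported on ARM.
 *
 * @param p
 *   Address to monitor for changes.
 * @param expected_value
 *   Value to compare against before monitoring.
 * @param value_mask
 *   The 64-bit mask to use to extract current value from `p`.
 * @param tsc_timestamp
 *   Maximum TSC timestamp to wait for.
 */
static inline void rte_power_monitor(const volatile void *p,
		const uint64_t expected_value, const uint64_t value_mask,
		const uint64_t tsc_timestamp)
{
	/* Stub: nothing is monitored, all arguments are ignored. */
	RTE_SET_USED(p);
	RTE_SET_USED(expected_value);
	RTE_SET_USED(value_mask);
	RTE_SET_USED(tsc_timestamp);
}

/**
 * This function is not supported on ARM.
 *
 * @param tsc_timestamp
 *   Maximum TSC timestamp to wait for.
 */
static inline void rte_power_pause(const uint64_t tsc_timestamp)
{
	/* Stub: returns immediately without waiting. */
	RTE_SET_USED(tsc_timestamp);
}
```

With no @return tag left, the generated docs no longer describe a
return value that the void prototype does not have.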

> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +               const uint64_t expected_value, const uint64_t value_mask,
> +               const uint64_t tsc_timestamp)
> +{
> +       RTE_SET_USED(p);
> +       RTE_SET_USED(expected_value);
> +       RTE_SET_USED(value_mask);
> +       RTE_SET_USED(tsc_timestamp);
> +}
> +
> +/**
> + * This function is not supported on ARM.
> + *
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 1 if wakeup was due to TSC timeout expiration.
> + *   - 0 if wakeup was due to other reasons.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +       RTE_SET_USED(tsc_timestamp);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_ARM_H_ */
> diff --git a/lib/librte_eal/include/generic/rte_power_intrinsics.h b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..e36c1f8976
> --- /dev/null
> +++ b/lib/librte_eal/include/generic/rte_power_intrinsics.h
> @@ -0,0 +1,61 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_H_
> +#define _RTE_POWER_INTRINSIC_H_
> +
> +#include <inttypes.h>
> +
> +/**
> + * @file
> + * Advanced power management operations.
> + *
> + * This file define APIs for advanced power management,
> + * which are architecture-dependent.
> + */
> +
> +/**
> + * Monitor specific address for changes. This will cause the CPU to enter an
> + * architecture-defined optimized power state until either the specified
> + * memory address is written to, a certain TSC timestamp is reached, or other
> + * reasons cause the CPU to wake up.
> + *
> + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If
> + * mask is non-zero, the current value pointed to by the `p` pointer will be
> + * checked against the expected value, and if they match, the entering of
> + * optimized power state may be aborted.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for. Note that the wait behavior is
> + *   architecture-dependent.
> + *
> + * @return
> + *   - 0 on success
> + *   - -ENOTSUP if not supported
> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +               const uint64_t expected_value, const uint64_t value_mask,
> +               const uint64_t tsc_timestamp);
> +
> +/**
> + * Enter an architecture-defined optimized power state until a certain TSC
> + * timestamp is reached.
> + *
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for. Note that the wait behavior is
> + *   architecture-dependent.
> + *
> + * @return
> + *   Architecture-dependent return value.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp);
> +
> +#endif /* _RTE_POWER_INTRINSIC_H_ */
> diff --git a/lib/librte_eal/include/meson.build b/lib/librte_eal/include/meson.build
> index cd09027958..3a12e87e19 100644
> --- a/lib/librte_eal/include/meson.build
> +++ b/lib/librte_eal/include/meson.build
> @@ -60,6 +60,7 @@ generic_headers = files(
>         'generic/rte_memcpy.h',
>         'generic/rte_pause.h',
>         'generic/rte_prefetch.h',
> +       'generic/rte_power_intrinsics.h',
>         'generic/rte_rwlock.h',
>         'generic/rte_spinlock.h',
>         'generic/rte_ticketlock.h',
> diff --git a/lib/librte_eal/ppc/include/meson.build b/lib/librte_eal/ppc/include/meson.build
> index ab4bd28092..0873b2aecb 100644
> --- a/lib/librte_eal/ppc/include/meson.build
> +++ b/lib/librte_eal/ppc/include/meson.build
> @@ -10,6 +10,7 @@ arch_headers = files(
>         'rte_io.h',
>         'rte_memcpy.h',
>         'rte_pause.h',
> +       'rte_power_intrinsics.h',
>         'rte_prefetch.h',
>         'rte_rwlock.h',
>         'rte_spinlock.h',
> diff --git a/lib/librte_eal/ppc/include/rte_power_intrinsics.h b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..70fd7b094f
> --- /dev/null
> +++ b/lib/librte_eal/ppc/include/rte_power_intrinsics.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_PPC_H_
> +#define _RTE_POWER_INTRINSIC_PPC_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_atomic.h>
> +#include <rte_common.h>
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +/**
> + * This function is not supported on PPC64.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 0 on success
> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +               const uint64_t expected_value, const uint64_t value_mask,
> +               const uint64_t tsc_timestamp)
> +{
> +       RTE_SET_USED(p);
> +       RTE_SET_USED(expected_value);
> +       RTE_SET_USED(value_mask);
> +       RTE_SET_USED(tsc_timestamp);
> +}
> +
> +/**
> + * This function is not supported on PPC64.
> + *
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 1 if wakeup was due to TSC timeout expiration.
> + *   - 0 if wakeup was due to other reasons.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +       RTE_SET_USED(tsc_timestamp);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_PPC_H_ */
> diff --git a/lib/librte_eal/x86/include/meson.build b/lib/librte_eal/x86/include/meson.build
> index f0e998c2fe..494a8142a2 100644
> --- a/lib/librte_eal/x86/include/meson.build
> +++ b/lib/librte_eal/x86/include/meson.build
> @@ -13,6 +13,7 @@ arch_headers = files(
>         'rte_io.h',
>         'rte_memcpy.h',
>         'rte_prefetch.h',
> +       'rte_power_intrinsics.h',
>         'rte_pause.h',
>         'rte_rtm.h',
>         'rte_rwlock.h',
> diff --git a/lib/librte_eal/x86/include/rte_power_intrinsics.h b/lib/librte_eal/x86/include/rte_power_intrinsics.h
> new file mode 100644
> index 0000000000..8d579eaf64
> --- /dev/null
> +++ b/lib/librte_eal/x86/include/rte_power_intrinsics.h
> @@ -0,0 +1,106 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_POWER_INTRINSIC_X86_64_H_
> +#define _RTE_POWER_INTRINSIC_X86_64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_atomic.h>
> +#include <rte_common.h>
> +
> +#include "generic/rte_power_intrinsics.h"
> +
> +/**
> + * Monitor specific address for changes. This will cause the CPU to enter an
> + * architecture-defined optimized power state until either the specified
> + * memory address is written to, a certain TSC timestamp is reached, or other
> + * reasons cause the CPU to wake up.
> + *
> + * Additionally, an `expected` 64-bit value and 64-bit mask are provided. If
> + * mask is non-zero, the current value pointed to by the `p` pointer will be
> + * checked against the expected value, and if they match, the entering of
> + * optimized power state may be aborted.
> + *
> + * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
> + * For more information about usage of these instructions, please refer to
> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
> + *
> + * @param p
> + *   Address to monitor for changes. Must be aligned on an 64-byte boundary.
> + * @param expected_value
> + *   Before attempting the monitoring, the `p` address may be read and compared
> + *   against this value. If `value_mask` is zero, this step will be skipped.
> + * @param value_mask
> + *   The 64-bit mask to use to extract current value from `p`.
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 0 on success
> + */
> +static inline void rte_power_monitor(const volatile void *p,
> +               const uint64_t expected_value, const uint64_t value_mask,
> +               const uint64_t tsc_timestamp)
> +{
> +       const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> +       const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> +       /*
> +        * we're using raw byte codes for now as only the newest compiler
> +        * versions support this instruction natively.
> +        */
> +
> +       /* set address for UMONITOR */
> +       asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> +                       :
> +                       : "D"(p));
> +
> +       if (value_mask) {
> +               const uint64_t cur_value = *(const volatile uint64_t *)p;
> +               const uint64_t masked = cur_value & value_mask;
> +               /* if the masked value is already matching, abort */
> +               if (masked == expected_value)
> +                       return;
> +       }
> +       /* execute UMWAIT */
> +       asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> +               : /* ignore rflags */
> +               : "D"(0), /* enter C0.2 */
> +                 "a"(tsc_l), "d"(tsc_h));
> +}
> +
> +/**
> + * Enter an architecture-defined optimized power state until a certain TSC
> + * timestamp is reached.
> + *
> + * This function uses TPAUSE instruction  and will enter C0.2 state. For more
> + * information about usage of this instruction, please refer to Intel(R) 64 and
> + * IA-32 Architectures Software Developer's Manual.
> + *
> + * @param tsc_timestamp
> + *   Maximum TSC timestamp to wait for.
> + *
> + * @return
> + *   - 1 if wakeup was due to TSC timeout expiration.
> + *   - 0 if wakeup was due to other reasons.
> + */
> +static inline void rte_power_pause(const uint64_t tsc_timestamp)
> +{
> +       const uint32_t tsc_l = (uint32_t)tsc_timestamp;
> +       const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
> +
> +       /* execute TPAUSE */
> +       asm volatile(".byte 0x66, 0x0f, 0xae, 0xf7;"
> +                    : /* ignore rflags */
> +                    : "D"(0), /* enter C0.2 */
> +                      "a"(tsc_l), "d"(tsc_h));
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_POWER_INTRINSIC_X86_64_H_ */
> --
> 2.17.1
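
For reference, the check done between UMONITOR and UMWAIT in the x86
rte_power_monitor(), and the EDX:EAX split of the TSC deadline, can be
exercised in isolation with something like the sketch below. Both
helper names are hypothetical and not part of the patch; the snippet
only mirrors the C logic quoted above and executes no power
management instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: mirrors the pre-sleep
 * check in the x86 rte_power_monitor(). Returns true when the function
 * would return early instead of executing UMWAIT.
 */
static inline bool
power_monitor_would_skip(const volatile void *p, uint64_t expected_value,
		uint64_t value_mask)
{
	/* A zero mask disables the comparison entirely. */
	if (value_mask == 0)
		return false;

	const uint64_t cur_value = *(const volatile uint64_t *)p;

	/* Only the masked bits of the current value are compared. */
	return (cur_value & value_mask) == expected_value;
}

/*
 * Hypothetical helper: split a 64-bit TSC deadline into the low and
 * high 32-bit halves that the patch passes in EAX and EDX.
 */
static inline void
tsc_split(uint64_t tsc, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)tsc;
	*hi = (uint32_t)(tsc >> 32);
}
```

One thing this makes visible: only the value read from `p` is masked,
so `expected_value` has to be pre-masked by the caller. An expected
value with bits set outside `value_mask` can never match.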