From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <92e7bd2e-b799-9658-c90e-f50638c6fdbd@amd.com>
Date: Fri, 18 Aug 2023 14:25:14 +0100
To: Konstantin Ananyev, "Tummala, Sivaprasad", Tyler Retzlaff
Cc: "david.hunt@intel.com", "anatoly.burakov@intel.com",
 "david.marchand@redhat.com", "thomas@monjalon.net", "dev@dpdk.org"
References: <20230418082529.544777-2-sivaprasad.tummala@amd.com>
 <20230816185959.1331336-1-sivaprasad.tummala@amd.com>
 <20230816185959.1331336-3-sivaprasad.tummala@amd.com>
 <20230816192758.GA12453@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
From: Ferruh Yigit
Subject: Re: [PATCH v5 3/3] power: amd power monitor support
Content-Type: text/plain; charset=UTF-8
List-Id: DPDK patches and discussions

On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>
>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>> power/performance optimized state (C1 state) for a specific period or
>>>> until a store to the monitored address range.
>>>>
>>>> Signed-off-by: Sivaprasad Tummala
>>>> Acked-by: Anatoly Burakov
>>>> ---
>>>> lib/eal/x86/rte_power_intrinsics.c | 77
>>>> +++++++++++++++++++++++++-----
>>>> 1 file changed, 66 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
>>>> b/lib/eal/x86/rte_power_intrinsics.c
>>>> index 6eb9e50807..b4754e17da 100644
>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>> volatile void *monitor_addr; /**< NULL if not currently sleeping
>>>> */ } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>
>>>> +/**
>>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2
>>> state.
>>>> + * For more information about usage of these instructions, please
>>>> +refer to
>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>> + */
>>>> +static void intel_umonitor(volatile void *addr) {
>>>> + /* UMONITOR */
>>>> + asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>> + :
>>>> + : "D"(addr));
>>>> +}
>>>> +
>>>> +static void intel_umwait(const uint64_t timeout) {
>>>> + const uint32_t tsc_l = (uint32_t)timeout;
>>>> + const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>> + /* UMWAIT */
>>>> + asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>> + : /* ignore rflags */
>>>> + : "D"(0), /* enter C0.2 */
>>>> + "a"(tsc_l), "d"(tsc_h)); }
>>>
>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>
>>> gcc/clang have built-in intrinsics for umonitor and umwait i believe as per our other
>>> thread of discussion is there a benefit to also providing inline assembly over just
>>> using the intrinsics? I understand that the intrinsics may not exist for the monitorx
>>> and mwaitx below so it is probably necessary for amd.
>>>
>>> so the suggestion here is when they are available just use the intrinsics.
>>>
>>> thanks
>>>
>> The gcc built-in functions __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only when -mmwaitx
>> is used specific for AMD platforms. On generic builds, these built-ins are not available and hence inline
>> assembly is required here.
>
> Ok... but we can probably put them into a separate .c file that will be compiled with that specific flag?
> Same thing can be probably done for Intel specific instructions.
> In general, I think it is much more preferable to use built-ins vs inline assembly
> (if possible off-course).
>

We don't compile a different set of files for AMD and Intel; the choice
is made by runtime checks, so putting the code into a separate file is
not much different.

It may be an option to always enable the compiler flag (-mmwaitx). I
think it won't hurt other platforms, but I am not sure about its
implications for them (what was the motivation for the compiler guys to
enable these built-ins only with a specific flag?).

Also, this requires detecting whether the compiler supports '-mmwaitx'
or not, etc.

> Konstantin
>
>>
>>>> +
>>>> +/**
>>>> + * These functions uses MONITORX/MWAITX instructions and will enter C1 state.
>>>> + * For more information about usage of these instructions, please
>>>> +refer to
>>>> + * AMD64 Architecture Programmer's Manual.
>>>> + */
>>>> +static void amd_monitorx(volatile void *addr) {
>>>> + /* MONITORX */
>>>> + asm volatile(".byte 0x0f, 0x01, 0xfa;"
>>>> + :
>>>> + : "a"(addr),
>>>> + "c"(0), /* no extensions */
>>>> + "d"(0)); /* no hints */ }
>>>> +
>>>> +static void amd_mwaitx(const uint64_t timeout) {
>>>> + /* MWAITX */
>>>> + asm volatile(".byte 0x0f, 0x01, 0xfb;"
>>>> + : /* ignore rflags */
>>>> + : "a"(0), /* enter C1 */
>>>> + "c"(2), /* enable timer */
>>>> + "b"(timeout));
>>>> +}
>>>> +
>>>> +static struct {
>>>> + void (*mmonitor)(volatile void *addr);
>>>> + void (*mwait)(const uint64_t timeout); } __rte_cache_aligned
>>>> +power_monitor_ops;
>>>> +
>>>> static inline void
>>>> __umwait_wakeup(volatile void *addr)
>>>> {
>>>> @@ -75,8 +129,6 @@ int
>>>> rte_power_monitor(const struct rte_power_monitor_cond *pmc,
>>>> const uint64_t tsc_timestamp) {
>>>> - const uint32_t tsc_l = (uint32_t)tsc_timestamp;
>>>> - const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
>>>> const unsigned int lcore_id = rte_lcore_id();
>>>> struct power_wait_status *s;
>>>> uint64_t cur_value;
>>>> @@ -109,10 +161,8 @@ rte_power_monitor(const struct
>>> rte_power_monitor_cond *pmc,
>>>> * versions support this instruction natively.
>>>> */
>>>>
>>>> - /* set address for UMONITOR */
>>>> - asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>> - :
>>>> - : "D"(pmc->addr));
>>>> + /* set address for mmonitor */
>>>> + power_monitor_ops.mmonitor(pmc->addr);
>>>>
>>>> /* now that we've put this address into monitor, we can unlock */
>>>> rte_spinlock_unlock(&s->lock);
>>>> @@ -123,11 +173,8 @@ rte_power_monitor(const struct
>>> rte_power_monitor_cond *pmc,
>>>> if (pmc->fn(cur_value, pmc->opaque) != 0)
>>>> goto end;
>>>>
>>>> - /* execute UMWAIT */
>>>> - asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>> - : /* ignore rflags */
>>>> - : "D"(0), /* enter C0.2 */
>>>> - "a"(tsc_l), "d"(tsc_h));
>>>> + /* execute mwait */
>>>> + power_monitor_ops.mwait(tsc_timestamp);
>>>>
>>>> end:
>>>> /* erase sleep address */
>>>> @@ -173,6 +220,14 @@ RTE_INIT(rte_power_intrinsics_init) {
>>>> wait_multi_supported = 1;
>>>> if (i.power_monitor)
>>>> monitor_supported = 1;
>>>> +
>>>> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX)) { /* AMD */
>>>> + power_monitor_ops.mmonitor = &amd_monitorx;
>>>> + power_monitor_ops.mwait = &amd_mwaitx;
>>>> + } else { /* Intel */
>>>> + power_monitor_ops.mmonitor = &intel_umonitor;
>>>> + power_monitor_ops.mwait = &intel_umwait;
>>>> + }
>>>> }
>>>>
>>>> int
>>>> --
>>>> 2.34.1
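
To make the runtime-check point in the reply concrete, here is a minimal, portable sketch of the dispatch pattern the patch uses. It is an illustration only, not the DPDK code. The stub_* functions are hypothetical stand-ins for the inline-assembly (or built-in based) backends and merely count calls, and the has_monitorx parameter is an assumed placeholder for the result of rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real backends: they only count calls. */
static int amd_calls, intel_calls;

static void stub_amd_monitorx(volatile void *addr) { (void)addr; amd_calls++; }
static void stub_amd_mwaitx(uint64_t timeout) { (void)timeout; amd_calls++; }
static void stub_intel_umonitor(volatile void *addr) { (void)addr; intel_calls++; }
static void stub_intel_umwait(uint64_t timeout) { (void)timeout; intel_calls++; }

/* Same shape as power_monitor_ops in the quoted patch. */
static struct {
	void (*mmonitor)(volatile void *addr);
	void (*mwait)(uint64_t timeout);
} monitor_ops;

/*
 * Pick a backend once at init time. has_monitorx is an assumed placeholder
 * for rte_cpu_get_flag_enabled(RTE_CPUFLAG_MONITORX).
 */
static void monitor_ops_init(bool has_monitorx)
{
	if (has_monitorx) {
		monitor_ops.mmonitor = stub_amd_monitorx;
		monitor_ops.mwait = stub_amd_mwaitx;
	} else {
		monitor_ops.mmonitor = stub_intel_umonitor;
		monitor_ops.mwait = stub_intel_umwait;
	}
}

/* Caller side: arm the monitor, then wait. No vendor check on this path. */
static void monitor_and_wait(volatile void *addr, uint64_t timeout)
{
	assert(monitor_ops.mmonitor != NULL && monitor_ops.mwait != NULL);
	monitor_ops.mmonitor(addr);
	monitor_ops.mwait(timeout);
}
```

Both backends are present in the same binary whichever way they are built, so moving one of them into a separately compiled file would change how it is compiled but not the need for the runtime selection in monitor_ops_init().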