From: Ferruh Yigit
Date: Thu, 24 Aug 2023 10:04:42 +0100
To: Tyler Retzlaff
Cc: Konstantin Ananyev, Bruce Richardson, "Tummala, Sivaprasad",
    david.hunt@intel.com, anatoly.burakov@intel.com,
    david.marchand@redhat.com, thomas@monjalon.net, dev@dpdk.org
Subject: Re: [PATCH v5 3/3] power: amd power monitor support
Message-ID: <05cb0c66-18e2-6635-6c89-b7edeb1573c5@amd.com>
In-Reply-To: <20230823160304.GA22267@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
>>> 18/08/2023 14:48, Bruce Richardson wrote:
>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala wrote:
>>>>>>>>> mwaitx allows EPYC processors to enter a implementation dependent
>>>>>>>>> power/performance optimized state (C1 state) for a specific period or
>>>>>>>>> until a store to the monitored address range.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Sivaprasad Tummala
>>>>>>>>> Acked-by: Anatoly Burakov
>>>>>>>>> ---
>>>>>>>>>   lib/eal/x86/rte_power_intrinsics.c | 77 +++++++++++++++++++++++++-----
>>>>>>>>>   1 file changed, 66 insertions(+), 11 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> index 6eb9e50807..b4754e17da 100644
>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
>>>>>>>>>        volatile void *monitor_addr; /**< NULL if not currently sleeping */
>>>>>>>>>  } __rte_cache_aligned wait_status[RTE_MAX_LCORE];
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
>>>>>>>>> + * For more information about usage of these instructions, please refer to
>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's Manual.
>>>>>>>>> + */
>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
>>>>>>>>> +     /* UMONITOR */
>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
>>>>>>>>> +                     :
>>>>>>>>> +                     : "D"(addr));
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
>>>>>>>>> +     /* UMWAIT */
>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
>>>>>>>>> +                     : /* ignore rflags */
>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
>>>>>>>>
>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
>>>>>>>>
>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i believe
>>>>>>>> as per our other thread of discussion is there a benefit to also
>>>>>>>> providing inline assembly over just using the intrinsics? I
>>>>>>>> understand that the intrinsics may not exist for the monitorx and
>>>>>>>> mwaitx below so it is probably necessary for amd.
>>>>>>>>
>>>>>>>> so the suggestion here is when they are available just use the
>>>>>>>> intrinsics.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>>
>>>>>>> The gcc built-in functions
>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available only
>>>>>>> when -mmwaitx is used specific for AMD platforms. On generic builds,
>>>>>>> these built-ins are not available and hence inline assembly is
>>>>>>> required here.
>>>>>>
>>>>>> Ok... but we can probably put them into a separate .c file that will
>>>>>> be compiled with that specific flag?
>>>>>> Same thing can be probably done for Intel specific instructions.
>>>>>> In general, I think it is much more preferable to use built-ins vs
>>>>>> inline assembly (if possible off-course).
>>>>>>
>>>>>
>>>>> We don't compile different set of files for AMD and Intel, but there are
>>>>> runtime checks, so putting into separate file is not much different.
>>>
>>> Well, we probably don't compile .c files for particular vendor, but we
>>> definitely do compile some .c files for particular ISA extensions.
>>> Let say there are files in lib/acl that requires various '-mavx512*'
>>> flags, same for other libs and PMDs.
>>> So still not clear to me why same approach can't be applied to
>>> power_instrincts.c?
>>>
>>>>>
>>>>> It may be an option to always enable compiler flag (-mmwaitx), I think
>>>>> it won't hurt other platforms but I am not sure about implications of
>>>>> this to other platforms (what was the motivation for the compiler guys
>>>>> to enable these build-ins with specific flag?).
>>>>>
>>>>> Also this requires detecting compiler that supports 'mmwaitx' or not,
>>>>> etc..
>>>>>
>>>> This is the biggest reason why we have in the past added support for
>>>> these instructions via asm bytes rather than intrinsics. It takes a long
>>>> time for end-user compilers, especially those in LTS releases, to get
>>>> the necessary intrinsics.
>>>
>>> Yep, understand.
>>> But why then we can't have both implementations?
>>> Let say if WAITPKG is defined we can use builtins for
>>> umonitor/umwait/tpause, otherwise we fallback to inline asm implementation.
>>> Same story for MWAITX/monitorx.
>>>
>>
>> Yes this can be done,
>> it can be done either as different .c files per implementation, or as
>> #ifdef in same file.
>>
>> But eventually asm implementation is required, as fallback, and if we
>> will rely on asm implementation anyway, does it worth to have the
>> additional checks to be able to use built-in intrinsic?
>>
>> Does it helps to comment name of the built-in function to inline
>> assembly code, to document intention and another possible implementation?
>
> the main value of preferring intrinsics is that when they are available
> they also work with msvc/windows. the msvc toolchain does not support
> inline asm. so some of the targets have to use intrinsics because that's all
> there is.
>

How does Windows handle the current power APIs, such as the ones in
rte_power_intrinsics.c, without inline asm support?

Also, will using both built-ins and inline assembly work for Windows?
There may be compiler versions that don't support the built-in
functions. Those would have to disable the APIs altogether, which
creates a scenario where the list of exposed APIs changes based on the
compiler version.

>>
>>>> Consider a user running e.g. RHEL 8, who wants to take
>>>> advantages of the latest DPDK features; they should not be required to
>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
>>>>
>>>> /Bruce
>>>
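
To make the "#ifdef in same file" option concrete, here is a rough sketch of
what it could look like for UMONITOR/UMWAIT. This is only an illustration and
not the patch code. The demo_* names and the DEMO_WAIT_IMPL macro are
hypothetical. It assumes that gcc/clang define __WAITPKG__ when built with
-mwaitpkg, and it reuses the opcode bytes from the patch for the fallback. The
last branch keeps the function present and returns -ENOTSUP, so the set of
exposed functions does not depend on the compiler version.

```c
#include <errno.h>
#include <stdint.h>

#if defined(__WAITPKG__)
/* Built with -mwaitpkg: the compiler provides _umonitor()/_umwait(). */
#include <immintrin.h>
#define DEMO_WAIT_IMPL "intrinsic"
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Generic or older gcc/clang build: emit the opcode bytes directly. */
#define DEMO_WAIT_IMPL "inline-asm"
#else
/* Neither intrinsic nor inline asm: keep the API, report not supported. */
#define DEMO_WAIT_IMPL "unsupported"
#endif

/* Hypothetical helper: name of the implementation this build selected. */
static inline const char *demo_wait_impl(void)
{
	return DEMO_WAIT_IMPL;
}

/*
 * Hypothetical wrapper: arm a monitor on addr, then wait in C0.2 until a
 * store to the monitored range or until the TSC reaches tsc_deadline.
 * Returns 0 if the instructions were issued, -ENOTSUP if this build has
 * no way to emit them. The caller must still check at runtime that the
 * CPU supports WAITPKG before calling this.
 */
static inline int demo_monitor_wait(volatile void *addr, uint64_t tsc_deadline)
{
#if defined(__WAITPKG__)
	_umonitor((void *)(uintptr_t)addr);
	(void)_umwait(0 /* enter C0.2 */, tsc_deadline);
	return 0;
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	const uint32_t tsc_l = (uint32_t)tsc_deadline;
	const uint32_t tsc_h = (uint32_t)(tsc_deadline >> 32);

	/* UMONITOR, same bytes as in the patch */
	__asm__ __volatile__(".byte 0xf3, 0x0f, 0xae, 0xf7;"
			     :
			     : "D"(addr));
	/* UMWAIT, same bytes as in the patch */
	__asm__ __volatile__(".byte 0xf2, 0x0f, 0xae, 0xf7;"
			     : /* ignore rflags */
			     : "D"(0), /* enter C0.2 */
			       "a"(tsc_l), "d"(tsc_h));
	return 0;
#else
	(void)addr;
	(void)tsc_deadline;
	return -ENOTSUP;
#endif
}
```

With something like this the asm fallback stays, as Bruce wants for older
compilers, and a toolchain without inline asm either takes the intrinsic path
or gets a stub. Whether the extra branches are worth maintaining is still the
open question from above.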