From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id D19A2A04BC; Fri, 9 Oct 2020 13:42:27 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AFADA1D536; Fri, 9 Oct 2020 13:42:26 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 8AE491D454 for ; Fri, 9 Oct 2020 13:42:24 +0200 (CEST) IronPort-SDR: G1p40b5/JkMDeMPXo7B2XmQlmRQcPfpZuraqxBVj+l+YZwbiLyJ+6UDdDQ4r1Jm4p5ri9ZNCEf fC9UT2Ck9E8g== X-IronPort-AV: E=McAfee;i="6000,8403,9768"; a="152392857" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="152392857" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 04:42:22 -0700 IronPort-SDR: 4TKPSfV87KIbnO2ohuSqYctQPE3gidQObzA8y2MYuKVl8JZ102xKBMVUr9H+UWgSpv7YYdvsJU tSrVRMGH0a5A== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="462158921" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.213.3.170]) ([10.213.3.170]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 04:42:18 -0700 To: Bruce Richardson Cc: "Ananyev, Konstantin" , Thomas Monjalon , Jerin Jacob , David Marchand , "Ma, Liang J" , dpdk-dev , "Hunt, David" , Stephen Hemminger , Honnappa Nagarahalli , "Ruifeng Wang (Arm Technology China)" , David Christensen , Jerin Jacob References: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> <92b20fa1-f37a-9938-90a7-72956a206b62@intel.com> <1744543.fjzj2HbIQ4@thomas> <66b285ab-066e-f1d0-cbf8-f31d0f896806@intel.com> <20201009113606.GC1474@bricha3-MOBL.ger.corp.intel.com> From: "Burakov, Anatoly" Message-ID: <33fe4aeb-d617-b27d-0550-c6415cc43763@intel.com> Date: Fri, 9 Oct 2020 12:42:15 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <20201009113606.GC1474@bricha3-MOBL.ger.corp.intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 09-Oct-20 12:36 PM, Bruce Richardson wrote: > On Fri, Oct 09, 2020 at 12:12:56PM +0100, Burakov, Anatoly wrote: >> On 09-Oct-20 11:48 AM, Ananyev, Konstantin wrote: >>>> On 09-Oct-20 11:17 AM, Thomas Monjalon wrote: >>>>> 09/10/2020 12:03, Burakov, Anatoly: >>>>>> On 09-Oct-20 10:54 AM, Jerin Jacob wrote: >>>>>>> On Fri, Oct 9, 2020 at 3:10 PM Burakov, Anatoly >>>>>>> wrote: >>>>>>>> On 09-Oct-20 10:29 AM, Thomas Monjalon wrote: >>>>>>>>> 09/10/2020 11:25, Burakov, Anatoly: >>>>>>>>>> On 09-Oct-20 6:42 AM, Jerin Jacob wrote: >>>>>>>>>>> On Thu, Oct 8, 2020 at 10:38 PM Ananyev, Konstantin >>>>>>>>>>> wrote: >>>>>>>>>>>>> On Thu, Oct 8, 2020 at 6:57 PM Burakov, Anatoly >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> On 08-Oct-20 9:44 AM, Jerin Jacob wrote: >>>>>>>>>>>>>>> On Thu, Oct 8, 2020 at 2:04 PM Thomas Monjalon wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Add two new power management intrinsics, and provide an implementation >>>>>>>>>>>>>>>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions >>>>>>>>>>>>>>>>> are implemented as raw byte opcodes because there is not yet widespread >>>>>>>>>>>>>>>>> compiler support for these instructions. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The power management instructions provide an architecture-specific >>>>>>>>>>>>>>>>> function to either wait until a specified TSC timestamp is reached, or >>>>>>>>>>>>>>>>> optionally wait until either a TSC timestamp is reached or a memory >>>>>>>>>>>>>>>>> location is written to. The monitor function also provides an optional >>>>>>>>>>>>>>>>> comparison, to avoid sleeping when the expected write has already >>>>>>>>>>>>>>>>> happened, and no more writes are expected. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For more details, Please reference Intel SDM Volume 2. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I really would like to see feedbacks from other arch maintainers. >>>>>>>>>>>>>>>> Unfortunately they were not Cc'ed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Shared the feedback from the arm64 perspective here. Yet to get a reply on this. >>>>>>>>>>>>>>> http://mails.dpdk.org/archives/dev/2020-September/181646.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Also please mark the new functions as experimental. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Jerin, >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Anatoly, >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> > IMO, We must introduce some arch feature-capability _get_ scheme to tell >>>>>>>>>>>>>> > the consumer of this API is only supported on x86. Probably as >>>>>>>>>>>>>> functions[1] >>>>>>>>>>>>>> > or macro flags scheme and have a stub for the other architectures as the >>>>>>>>>>>>>> > API marked as generic ie rte_power_* not rte_x86_.. >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > This will help the consumer to create workers based on the >>>>>>>>>>>>>> instruction features >>>>>>>>>>>>>> > which can NOT be abstracted as a generic feature across the >>>>>>>>>>>>>> architectures. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm not entirely sure what you mean by that. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I mean, yes, we should have added stubs for other architectures, and we >>>>>>>>>>>>>> will add those in future revisions, but what does your proposed runtime >>>>>>>>>>>>>> check accomplish that cannot currently be done with CPUID flags? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> RTE_CPUFLAG_WAITPKG flag definition is not available in other architectures. >>>>>>>>>>>>> i.e RTE_CPUFLAG_WAITPKG defined in lib/librte_eal/x86/include/rte_cpuflags.h >>>>>>>>>>>>> and it is used in http://patches.dpdk.org/patch/79540/ as generic API. >>>>>>>>>>>>> I doubt http://patches.dpdk.org/patch/79540/ would compile on non-x86. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I am agree with Jerin, that we need some generic way to >>>>>>>>>>>> figure-out does platform supports power_monitor() or not. >>>>>>>>>>>> Though not sure do we need to create a new feature-get framework here... >>>>>>>>>>> >>>>>>>>>>> That's works too. Some means of generic probing is fine. Following >>>>>>>>>>> schemed needs >>>>>>>>>>> more documentation on that usage, as, it is not straight forward compare to >>>>>>>>>>> feature-get framework. Also, on the other thread, we are adding the >>>>>>>>>>> new instructions like >>>>>>>>>>> demote cacheline etc, maybe if the user wants to KNOW if the arch >>>>>>>>>>> supports it then >>>>>>>>>>> the feature-get framework is good. >>>>>>>>>>> If we think, there is no other usecase for generic arch feature-get >>>>>>>>>>> framework then >>>>>>>>>>> we can keep the below scheme else generic arch feature is better for >>>>>>>>>>> more forward >>>>>>>>>>> looking use cases. >>>>>>>>>>> >>>>>>>>>>>> Might be just something like: >>>>>>>>>>>> rte_power_monitor(...) == -ENOTSUP >>>>>>>>>>>> be enough indication for that? >>>>>>>>>>>> So user can just do: >>>>>>>>>>>> if (rte_power_monitor(NULL, 0, 0, 0, 0) == -ENOTSUP) { >>>>>>>>>>>> /* not supported path */ >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> To check is that feature supported or not. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Looking at CLDEMOTE patches, CLDEMOTE is a noop on other archs. I think >>>>>>>>>> we can safely make this intrinsic as a noop on other archs as well, as >>>>>>>>>> it's functionally identical to waking up immediately. >>>>>>>>>> >>>>>>>>>> If we're not creating this for CLDEMOTE, we don't need it here as well. >>>>>>>>>> If we do need it for this, then we arguably need it for CLDEMOTE too. >>>>>>>>> >>>>>>>>> Sorry I don't understand what you mean, too many "it" and "this" :) >>>>>>>>> >>>>>>>> >>>>>>>> Sorry, i meant "the generic feature-get framework". CLDEMOTE doesn't >>>>>>>> exist on other archs, this doesn't too, so it's a fairly similar >>>>>>>> situation. Stubbing UMWAIT with a noop is a valid approach because it's >>>>>>>> equivalent to sleeping and then immediately waking up (which can happen >>>>>>>> for a host of reasons unrelated to the code itself). >>>>>>> >>>>>>> If we are keeping the following return in the public API then it can not be NOP >>>>>>> + * @return >>>>>>> + * - 1 if wakeup was due to TSC timeout expiration. >>>>>>> + * - 0 if wakeup was due to memory write or other reasons. >>>>>>> + */ >>>>>>> >>>>>> >>>>>> In the generic header, it is specified that return value is >>>>>> implementation-defined (i.e. arch-specific). >>>>> >>>>> Obviously an API definition should *never* be "implementation-defined". >>>> >>>> If there isn't a meaningful return value, we could either make it a >>>> void, or return 0/-ENOTSUP so. I'm OK with either as nop is a valid >>>> result for a UMWAIT, and there are no side-effects to the intrinsic >>>> itself (it's basically a fancy rte_pause). >>>> >>>>> >>>>> >>>>>> I guess we could remove >>>>>> that and set return value to either 0 or -ENOTSUP if that would resolve >>>>>> the issue? >>>>>> >>>>>>> Also, we need to fix compilation issue if any with >>>>>>> http://patches.dpdk.org/patch/79540/ >>>>>>> as it has direct reference to if >>>>>>> (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) { >>>>>>> Either we need to add -ENOTSUP return or generic feature-get framework. >>>>>> >>>>>> IIRC power library isn't compiled on anything other than x86, so this >>>>>> code wouldn't get compiled. >>>>> >>>>> It is not call "power-x86", so we must assume it could work >>>>> on any architecture. >>>> >>>> #ifdef it is! >>>> >>>>> >>>>> >>>>>>>> I'm not against a generic feature-get framework, i'm just pointing out >>>>>>>> that if this is what's preventing the merge, it should prevent the merge >>>>>>>> of CLDEMOTE as well >>> >>> I wouldn't consider these two as totally equal. >>> Yes, both are just hints to CPU, that can be ignored, >>> but if not, then consequences of executing are quite different. >>> If UMWAIT is not supported by cpu at all, then user might want to use some >>> different power saving mechanism (pause, frequence scaling, etc.). >>> Without information is UMWAIT supported or not, user can't make >>> such choice. >> >> After some attempts at implementing this, i actually came to the conclusion >> that some generic way to check support for this feature is necessary, >> because we end up with a usability inconsistency: >> >> 1) on non-x86, if you call the function, it returns -ENOTSUP >> 2) on x86, since we're not checking CPUID flags on every single call, it'll >> either succeed, or crash the process - the burden is on the user to check >> for CPUID flags, but it can't be done in an arch agnostic way because the >> CPUID flags are only defined for x86, thus requiring a special code path for >> x86 >> >> Where would be the best place to add such an infrastructure? I'm thinking >> rte_cpuflags.h? >> > Time to relook at some of the contents of patchset: > http://patches.dpdk.org/project/dpdk/list/?series=4811&archive=both&state=* > > The idea of that set (IIRC) was to replace the per-architecture enums with > just strings to avoid situations like this - or at least make them less > awkward. > > /Bruce > Yes, that patchset looks like it would work nicely in this case. If there is consensus to resurrect it and make it work, i'll drop the "generic feature get" thing, but for now i'll continue working on it - easier to remove code than to add it :D -- Thanks, Anatoly