From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1B35FA04BC; Fri, 9 Oct 2020 12:26:53 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 617AB1D420; Fri, 9 Oct 2020 12:23:08 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id AC3CB1C2F0 for ; Fri, 9 Oct 2020 12:23:05 +0200 (CEST) IronPort-SDR: 2OT7rlZaqT/AFPfRAHmqtLO4srOnKjwDfxCFF/WoiRkiJukZWUhE9vmm+YNKpU0Nn9lGrKaVxu eRBuxIgLGhDw== X-IronPort-AV: E=McAfee;i="6000,8403,9768"; a="145332625" X-IronPort-AV: E=Sophos;i="5.77,354,1596524400"; d="scan'208";a="145332625" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 03:23:04 -0700 IronPort-SDR: ZhngBAIriRvFkSzRkyuOOE/Ezd4pXCBfjvLFwhYc10dbj209hQpiTCnfbWWtpwpj4N1+OurLod VwpzNL03u8rA== X-IronPort-AV: E=Sophos;i="5.77,354,1596524400"; d="scan'208";a="462144861" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.213.3.170]) ([10.213.3.170]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 03:23:01 -0700 To: Thomas Monjalon , Jerin Jacob Cc: "Ananyev, Konstantin" , David Marchand , "Ma, Liang J" , dpdk-dev , "Hunt, David" , Stephen Hemminger , Honnappa Nagarahalli , "Ruifeng Wang (Arm Technology China)" , David Christensen , Jerin Jacob References: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com> <92b20fa1-f37a-9938-90a7-72956a206b62@intel.com> <1744543.fjzj2HbIQ4@thomas> From: "Burakov, Anatoly" Message-ID: Date: Fri, 9 Oct 2020 11:22:58 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <1744543.fjzj2HbIQ4@thomas> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 09-Oct-20 11:17 AM, Thomas Monjalon wrote: > 09/10/2020 12:03, Burakov, Anatoly: >> On 09-Oct-20 10:54 AM, Jerin Jacob wrote: >>> On Fri, Oct 9, 2020 at 3:10 PM Burakov, Anatoly >>> wrote: >>>> On 09-Oct-20 10:29 AM, Thomas Monjalon wrote: >>>>> 09/10/2020 11:25, Burakov, Anatoly: >>>>>> On 09-Oct-20 6:42 AM, Jerin Jacob wrote: >>>>>>> On Thu, Oct 8, 2020 at 10:38 PM Ananyev, Konstantin >>>>>>> wrote: >>>>>>>>> On Thu, Oct 8, 2020 at 6:57 PM Burakov, Anatoly >>>>>>>>> wrote: >>>>>>>>>> On 08-Oct-20 9:44 AM, Jerin Jacob wrote: >>>>>>>>>>> On Thu, Oct 8, 2020 at 2:04 PM Thomas Monjalon wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Add two new power management intrinsics, and provide an implementation >>>>>>>>>>>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions >>>>>>>>>>>>> are implemented as raw byte opcodes because there is not yet widespread >>>>>>>>>>>>> compiler support for these instructions. >>>>>>>>>>>>> >>>>>>>>>>>>> The power management instructions provide an architecture-specific >>>>>>>>>>>>> function to either wait until a specified TSC timestamp is reached, or >>>>>>>>>>>>> optionally wait until either a TSC timestamp is reached or a memory >>>>>>>>>>>>> location is written to. The monitor function also provides an optional >>>>>>>>>>>>> comparison, to avoid sleeping when the expected write has already >>>>>>>>>>>>> happened, and no more writes are expected. >>>>>>>>>>>>> >>>>>>>>>>>>> For more details, Please reference Intel SDM Volume 2. >>>>>>>>>>>> >>>>>>>>>>>> I really would like to see feedbacks from other arch maintainers. >>>>>>>>>>>> Unfortunately they were not Cc'ed. >>>>>>>>>>> >>>>>>>>>>> Shared the feedback from the arm64 perspective here. Yet to get a reply on this. >>>>>>>>>>> http://mails.dpdk.org/archives/dev/2020-September/181646.html >>>>>>>>>>> >>>>>>>>>>>> Also please mark the new functions as experimental. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Jerin, >>>>>>>>> >>>>>>>>> Hi Anatoly, >>>>>>>>> >>>>>>>>>> >>>>>>>>>> > IMO, We must introduce some arch feature-capability _get_ scheme to tell >>>>>>>>>> > the consumer of this API is only supported on x86. Probably as >>>>>>>>>> functions[1] >>>>>>>>>> > or macro flags scheme and have a stub for the other architectures as the >>>>>>>>>> > API marked as generic ie rte_power_* not rte_x86_.. >>>>>>>>>> > >>>>>>>>>> > This will help the consumer to create workers based on the >>>>>>>>>> instruction features >>>>>>>>>> > which can NOT be abstracted as a generic feature across the >>>>>>>>>> architectures. >>>>>>>>>> >>>>>>>>>> I'm not entirely sure what you mean by that. >>>>>>>>>> >>>>>>>>>> I mean, yes, we should have added stubs for other architectures, and we >>>>>>>>>> will add those in future revisions, but what does your proposed runtime >>>>>>>>>> check accomplish that cannot currently be done with CPUID flags? >>>>>>>>> >>>>>>>>> >>>>>>>>> RTE_CPUFLAG_WAITPKG flag definition is not available in other architectures. >>>>>>>>> i.e RTE_CPUFLAG_WAITPKG defined in lib/librte_eal/x86/include/rte_cpuflags.h >>>>>>>>> and it is used in http://patches.dpdk.org/patch/79540/ as generic API. >>>>>>>>> I doubt http://patches.dpdk.org/patch/79540/ would compile on non-x86. >>>>>>>> >>>>>>>> >>>>>>>> I am agree with Jerin, that we need some generic way to >>>>>>>> figure-out does platform supports power_monitor() or not. >>>>>>>> Though not sure do we need to create a new feature-get framework here... >>>>>>> >>>>>>> That's works too. Some means of generic probing is fine. Following >>>>>>> schemed needs >>>>>>> more documentation on that usage, as, it is not straight forward compare to >>>>>>> feature-get framework. Also, on the other thread, we are adding the >>>>>>> new instructions like >>>>>>> demote cacheline etc, maybe if the user wants to KNOW if the arch >>>>>>> supports it then >>>>>>> the feature-get framework is good. >>>>>>> If we think, there is no other usecase for generic arch feature-get >>>>>>> framework then >>>>>>> we can keep the below scheme else generic arch feature is better for >>>>>>> more forward >>>>>>> looking use cases. >>>>>>> >>>>>>>> Might be just something like: >>>>>>>> rte_power_monitor(...) == -ENOTSUP >>>>>>>> be enough indication for that? >>>>>>>> So user can just do: >>>>>>>> if (rte_power_monitor(NULL, 0, 0, 0, 0) == -ENOTSUP) { >>>>>>>> /* not supported path */ >>>>>>>> } >>>>>>>> >>>>>>>> To check is that feature supported or not. >>>>>>> >>>>>>> >>>>>> >>>>>> Looking at CLDEMOTE patches, CLDEMOTE is a noop on other archs. I think >>>>>> we can safely make this intrinsic as a noop on other archs as well, as >>>>>> it's functionally identical to waking up immediately. >>>>>> >>>>>> If we're not creating this for CLDEMOTE, we don't need it here as well. >>>>>> If we do need it for this, then we arguably need it for CLDEMOTE too. >>>>> >>>>> Sorry I don't understand what you mean, too many "it" and "this" :) >>>>> >>>> >>>> Sorry, i meant "the generic feature-get framework". CLDEMOTE doesn't >>>> exist on other archs, this doesn't too, so it's a fairly similar >>>> situation. Stubbing UMWAIT with a noop is a valid approach because it's >>>> equivalent to sleeping and then immediately waking up (which can happen >>>> for a host of reasons unrelated to the code itself). >>> >>> If we are keeping the following return in the public API then it can not be NOP >>> + * @return >>> + * - 1 if wakeup was due to TSC timeout expiration. >>> + * - 0 if wakeup was due to memory write or other reasons. >>> + */ >>> >> >> In the generic header, it is specified that return value is >> implementation-defined (i.e. arch-specific). > > Obviously an API definition should *never* be "implementation-defined". If there isn't a meaningful return value, we could either make it a void, or return 0/-ENOTSUP so. I'm OK with either as nop is a valid result for a UMWAIT, and there are no side-effects to the intrinsic itself (it's basically a fancy rte_pause). > > >> I guess we could remove >> that and set return value to either 0 or -ENOTSUP if that would resolve >> the issue? >> >>> Also, we need to fix compilation issue if any with >>> http://patches.dpdk.org/patch/79540/ >>> as it has direct reference to if >>> (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) { >>> Either we need to add -ENOTSUP return or generic feature-get framework. >> >> IIRC power library isn't compiled on anything other than x86, so this >> code wouldn't get compiled. > > It is not call "power-x86", so we must assume it could work > on any architecture. #ifdef it is! > > >>>> I'm not against a generic feature-get framework, i'm just pointing out >>>> that if this is what's preventing the merge, it should prevent the merge >>>> of CLDEMOTE as well, yet Jerin has acked that one and has explicitly >>>> stated that he's OK with leaving CLDEMOTE as a noop on other architectures. > > CLDEMOTE is used for optimization, while UMWAIT can be used in a logic, > that's why the expectations may be different. > UMWAIT is a best-effort mechanism with no side-effects. It's perfectly legal for a UMWAIT to not sleep at all, thus rendering it effectively a noop. So i don't think it's all that different. -- Thanks, Anatoly