From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 78CB5A04BC;
 Fri, 9 Oct 2020 18:10:47 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 5F4E11D50A;
 Fri, 9 Oct 2020 18:10:46 +0200 (CEST)
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by dpdk.org (Postfix) with ESMTP id 9EA4F1C202
 for ; Fri, 9 Oct 2020 18:10:44 +0200 (CEST)
IronPort-SDR: ubqa80Sqw1eJrBYEzBm3zsQhIerNfEnswmv6TggHfuEiBKTA4Hm5T+9lG4QsKA+N//lPWM58Ou
 cwK4te8b2Z5g==
X-IronPort-AV: E=McAfee;i="6000,8403,9769"; a="250199005"
X-IronPort-AV: E=Sophos;i="5.77,355,1596524400";
 d="scan'208";a="250199005"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 09 Oct 2020 09:10:41 -0700
IronPort-SDR: gijfYT0OWLc5b0KzcFn2ufAiufEPXDL/CAfYbLvynpm166DOEUdMgt7R+MwLWn4odIRpf2zJve
 ab72IkOlCZHQ==
X-IronPort-AV: E=Sophos;i="5.77,355,1596524400";
 d="scan'208";a="462242701"
Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.213.3.170])
 ([10.213.3.170])
 by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 09 Oct 2020 09:10:39 -0700
To: "Ananyev, Konstantin" , "Ma, Liang J" , "dev@dpdk.org"
Cc: "Hunt, David" , "stephen@networkplumber.org"
References: <1599214740-3927-1-git-send-email-liang.j.ma@intel.com>
 <1601647919-25312-1-git-send-email-liang.j.ma@intel.com>
 <1601647919-25312-2-git-send-email-liang.j.ma@intel.com>
 <665bcb31-dcf0-553b-bae1-054e5f50e77f@intel.com>
From: "Burakov, Anatoly"
Message-ID: <3609c5b3-f431-3954-6350-cb2de77b72a7@intel.com>
Date: Fri, 9 Oct 2020 17:10:36 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0)
 Gecko/20100101 Thunderbird/68.12.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 09-Oct-20 4:39 PM, Ananyev, Konstantin wrote:
>
>> On 08-Oct-20 6:15 PM, Ananyev, Konstantin wrote:
>>>>
>>>> Add two new power management intrinsics, and provide an implementation
>>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
>>>> are implemented as raw byte opcodes because there is not yet widespread
>>>> compiler support for these instructions.
>>>>
>>>> The power management instructions provide an architecture-specific
>>>> function to either wait until a specified TSC timestamp is reached, or
>>>> optionally wait until either a TSC timestamp is reached or a memory
>>>> location is written to. The monitor function also provides an optional
>>>> comparison, to avoid sleeping when the expected write has already
>>>> happened, and no more writes are expected.
>>>
>>> I think what this API is missing - a function to wakeup sleeping core.
>>> If user can/should use some system call to achieve that, then at least
>>> it has to be clearly documented, even better some wrapper provided.
>>
>> I don't think it's possible to do that without severely overcomplicating
>> the intrinsic and its usage, because AFAIK the only way to wake up a
>> sleeping core would be to send some kind of interrupt to the core, or
>> trigger a write to the cache-line in question.
>>
>
> Yes, I think we either need a syscall that would do an IPI for us
> (on top of my head - membarrier() does that, might be there are some other syscalls too),
> or something hand-made. For hand-made, I wonder would something like that
> be safe and sufficient:
> uint64_t val = atomic_load(addr);
> CAS(addr, val, &val);
> ?
> Anyway, one way or another - I think ability to wakeup core we put to sleep
> have to be an essential part of this feature.
> As I understand linux kernel will limit max amount of sleep time for these instructions:
> https://lwn.net/Articles/790920/
> But relying just on that, seems too vague for me:
> - user can adjust that value
> - wouldn't apply to older kernels and non-linux cases
> Konstantin
>

This implies knowing the value the core is sleeping on. That's not
always the case - with this particular PMD power management scheme, we
get the address from the PMD and it stays inside the callback.

--
Thanks,
Anatoly
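
For reference, the "hand-made" wake-up floated in the quoted text could be
sketched in C11 roughly as below. This is only an illustration: the helper
name wakeup_touch() is hypothetical and not part of the patch, the quoted
CAS() pseudo-call is mapped onto atomic_compare_exchange_strong() as an
assumption, and the caller is assumed to already know the monitored
address, which (as noted above) is not always true. Whether such a
same-value store is safe and sufficient to wake a core waiting in UMWAIT
is exactly the open question in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): try to generate a write to
 * the monitored location without changing its contents.
 *
 * Returns true if the compare-and-swap stored the value back, false if
 * another thread modified the location between the load and the CAS (in
 * which case that other write has already touched the cache line).
 */
static inline bool
wakeup_touch(_Atomic uint64_t *addr)
{
	/* Read the current contents of the monitored word. */
	uint64_t val = atomic_load(addr);

	/* Store the same value back only if the word still holds it. */
	return atomic_compare_exchange_strong(addr, &val, val);
}
```

A waker would call wakeup_touch(addr) on the same address that was handed
to the monitor intrinsic. With the PMD power management scheme that address
stays inside the callback, so it would first have to be exposed to the
waker somehow.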