From: David Marchand
Date: Fri, 17 Jan 2020 12:15:56 +0100
To: Gavin Hu
Cc: nd, "Ananyev, Konstantin", dev, Honnappa Nagarahalli
Subject: Re: [dpdk-dev] [PATCH v13 0/5] use WFE for aarch64
References: <1561911676-37718-1-git-send-email-gavin.hu@arm.com> <1573162528-16230-1-git-send-email-david.marchand@redhat.com>
In-Reply-To: <1573162528-16230-1-git-send-email-david.marchand@redhat.com>
List-Id: DPDK patches and discussions

On Thu, Nov 7, 2019 at 10:35 PM David Marchand wrote:
>
> DPDK has multiple use cases where the core repeatedly polls a location in
> memory. This polling results in many cache and memory transactions.
>
> The Arm architecture provides the WFE (Wait For Event) instruction, which
> allows the cpu core to enter a low power state until woken up by the update
> to the memory location being polled, thus reducing the cache and memory
> transactions.
>
> x86 has the PAUSE hint instruction to reduce such overhead.
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'.
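To make 'polling for a memory location to become equal to a given value' concrete, here is a minimal sketch in C of what a plain polling loop with a pause hint might look like. It is only an illustration under stated assumptions, not the code from the series: wait_until_equal_32 and cpu_pause are hypothetical names standing in for the rte_wait_until_equal_xxx and rte_pause APIs, and the GCC/Clang __atomic builtins are assumed to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): emit a spin-wait hint where one
 * is known for the target, otherwise do nothing. */
static inline void cpu_pause(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();                 /* x86 PAUSE hint */
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory"); /* aarch64 spin hint, no WFE */
#endif
}

/* Hypothetical sketch: spin until *addr == expected.
 * memorder is passed as a parameter, in the style of the C11-like
 * atomic builtins, and restricted here to relaxed or acquire. */
static inline void
wait_until_equal_32(volatile uint32_t *addr, uint32_t expected, int memorder)
{
	assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);

	while (__atomic_load_n(addr, memorder) != expected)
		cpu_pause();
}
```

A caller would typically have another thread store the expected value; if the location already holds it, the call returns right away, e.g. wait_until_equal_32(&flag, 1, __ATOMIC_ACQUIRE).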
>
> For non-Arm platforms, these APIs are just wrappers around a do-while loop
> with rte_pause, so there are no performance differences.
>
> For Arm platforms, use of WFE can be configured using the
> CONFIG_RTE_ARM_USE_WFE option. It is disabled by default.
>
> Currently, use of WFE is supported only for aarch64 platforms. armv7
> platforms do support the WFE instruction, but they require explicit wake up
> events (sev) and are less performant.
>
> Testing shows that performance varies across different platforms, with
> some showing degradation.
>
> CONFIG_RTE_ARM_USE_WFE should be enabled depending on the performance on the
> target platforms.
>
> V13:
> - added release notes update,
> - reworked arm implementation to avoid exporting inlines,
> - added assert in generic implementation,
>
> V12:
> - remove the 'rte_' prefix from the arm specific functions (David Marchand)
> - use the __atomic_load_ex_xx functions in arm specific implementations of
>   APIs (David Marchand)
> - remove the experimental warnings (David Marchand)
> - tweak the macros working scope (David Marchand)
>
> V11:
> - add rte_ prefix to the __atomic_load_ex_x functions (Ananyev Konstantin)
> - define the above rte_atomic_load_ex_x functions even if not
>   RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED for future non-wfe usages (Ananyev
>   Konstantin)
> - use the above functions for arm specific rte_wait_until_equal_x functions
>   (Ananyev Konstantin)
> - simplify the generic implementation by immersing "if" into "while"
>   (Ananyev Konstantin)
>
> V10:
> - move arm specific stuff to arch/arm/rte_pause_64.h (Ananyev Konstantin)
>
> V9:
> - fix a broken weblink (David Marchand)
> - define rte_wfe and rte_sev() (Ananyev Konstantin)
> - explicitly define three function APIs instead of macros (Ananyev Konstantin)
> - incorporate common rte_wfe and rte_sev into the generic rte_spinlock (David
>   Marchand)
> - define arch neutral RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED (Ananyev Konstantin)
> - define rte_load_ex_16/32/64 functions to use load-exclusive instruction for
>   aarch64, which is required for wake up of WFE
> - drop the rte_spinlock patch from this series, as it calls this
>   experimental API and is widely included by a lot of components, each of
>   which requires ALLOW_EXPERIMENTAL_API in the Makefile and meson.build;
>   leave it to the future, after the experimental tag is removed.
>
> V8:
> - simplify dmb definition to use io barriers (David Marchand)
> - define wfe() and sev() macros and use them inside normal C code (Ananyev
>   Konstantin)
> - pass memorder as a parameter instead of incorporating it into the function
>   name: fewer functions, similar to C11 atomic intrinsics (Ananyev Konstantin)
> - remove mandating RTE_FORCE_INTRINSICS in arm spinlock implementation (David
>   Marchand)
> - undef __WAIT_UNTIL_EQUAL after use (David Marchand)
> - add experimental tag and warning (David Marchand)
> - add the limitation of using WFE instruction in the commit log (David
>   Marchand)
> - tweak the use of RTE_FORCE_INTRINSICS (still mandatory for aarch64) and
>   RTE_ARM_USE_WFE for spinlock (David Marchand)
> - drop the rte_ring patch from this series, as rte_ring.h calls this API
>   and is widely included by a lot of components, each of which requires
>   ALLOW_EXPERIMENTAL_API in the Makefile and meson.build; leave it to the
>   future, after the experimental tag is removed.
>
> V7:
> - fix the checkpatch LONG_LINE_COMMENT issue
>
> V6:
> - squash the RTE_ARM_USE_WFE configuration entry patch into the new API patch
> - move the new configuration to the end of EAL
> - add doxygen comments to reflect the relaxed and acquire semantics
> - correct the meson configuration
>
> V5:
> - add doxygen comments for the new APIs
> - spinlock early exit without wfe if the spinlock is not taken by others.
> - add two patches on top for opdl and thunderx
>
> V4:
> - rename the config as CONFIG_RTE_ARM_USE_WFE to indicate it applies to arm only
> - introduce a macro for the assembly skeleton to reduce the duplication of code
> - add one patch for nxp fslmc to address a compiling error
>
> V3:
> - Convert RFCs to patches
>
> V2:
> - Use inline functions instead of macros
> - Add load and compare in the beginning of the APIs
> - Fix some style errors in asm inline
>
> V1:
> - Add the new APIs and use them for ring and locks

Series applied.
Thanks.

-- 
David Marchand