From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <1561911676-37718-1-git-send-email-gavin.hu@arm.com>
 <1573162528-16230-1-git-send-email-david.marchand@redhat.com>
In-Reply-To: <1573162528-16230-1-git-send-email-david.marchand@redhat.com>
From: David Marchand <david.marchand@redhat.com>
Date: Fri, 17 Jan 2020 12:15:56 +0100
Message-ID: <CAJFAV8wqgLqq9Mf1oSidCptSk2+CAhxy53OYqAzXrmrgmPiUdg@mail.gmail.com>
To: Gavin Hu <gavin.hu@arm.com>
Cc: nd <nd@arm.com>, "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 dev <dev@dpdk.org>, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [dpdk-dev] [PATCH v13 0/5] use WFE for aarch64
List-Id: DPDK patches and discussions <dev.dpdk.org>

On Thu, Nov 7, 2019 at 10:35 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> DPDK has multiple use cases where the core repeatedly polls a location in
> memory. This polling results in many cache and memory transactions.
>
> The Arm architecture provides the WFE (Wait For Event) instruction, which
> allows the CPU core to enter a low-power state until it is woken up by an
> update to the memory location being polled, thus reducing the cache and
> memory transactions.
>
> x86 has the PAUSE hint instruction to reduce such overhead.
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'.
>
> For non-Arm platforms, these APIs are just wrappers around a do-while loop
> with rte_pause, so there are no performance differences.
>
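
A minimal sketch of the generic fallback described above, assuming the
GCC/Clang __atomic builtins. The names wait_until_equal_32 and
cpu_relax_hint are illustrative stand-ins, not the DPDK API, and the
assert reflects the "added assert in generic implementation" item from
the V13 changelog only in spirit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): PAUSE on x86, no-op elsewhere. */
static inline void
cpu_relax_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#endif
}

/* Spin until *addr == expected, loading with the given C11 memory order. */
static inline void
wait_until_equal_32(volatile uint32_t *addr, uint32_t expected, int memorder)
{
	/* A pure load only makes sense with relaxed or acquire ordering. */
	assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);

	while (__atomic_load_n(addr, memorder) != expected)
		cpu_relax_hint();
}
```

A caller would typically pass __ATOMIC_ACQUIRE when the data guarded by
the polled location is read afterwards, and __ATOMIC_RELAXED otherwise.
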
> For Arm platforms, use of WFE can be configured using the
> CONFIG_RTE_ARM_USE_WFE option. It is disabled by default.
>
> Currently, use of WFE is supported only for aarch64 platforms. armv7
> platforms do support the WFE instruction, but they require explicit wake-up
> events (sev) and are less performant.
>
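
A rough sketch of how the aarch64 path can be structured, assuming
GCC/Clang inline assembly. The helper names are hypothetical and only
the relaxed-ordering variant is shown. The load-exclusive arms the
exclusive monitor for the polled address, so a later store to that
address generates the event that wakes the core from WFE. On other
architectures the sketch falls back to a plain polling loop:

```c
#include <stdint.h>

/* Load *addr; on aarch64 use LDXR so the exclusive monitor is armed. */
static inline uint32_t
load_exclusive_32(volatile uint32_t *addr)
{
#if defined(__aarch64__)
	uint32_t value;

	__asm__ volatile("ldxr %w[val], [%x[ptr]]"
			 : [val] "=&r" (value)
			 : [ptr] "r" (addr)
			 : "memory");
	return value;
#else
	return __atomic_load_n(addr, __ATOMIC_RELAXED);
#endif
}

/* Set the local event register so the first WFE does not block. */
static inline void
send_event_local(void)
{
#if defined(__aarch64__)
	__asm__ volatile("sevl" : : : "memory");
#endif
}

/* Sleep until an event arrives; no-op outside aarch64. */
static inline void
wait_for_event(void)
{
#if defined(__aarch64__)
	__asm__ volatile("wfe" : : : "memory");
#endif
}

/* Wait until *addr == expected, sleeping in WFE between checks. */
static inline void
wfe_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected)
{
	uint32_t value = load_exclusive_32(addr);

	if (value != expected) {
		send_event_local();
		do {
			wait_for_event();
			value = load_exclusive_32(addr);
		} while (value != expected);
	}
}
```

The initial load-and-compare avoids entering WFE at all when the value
already matches, which is the common uncontended case.
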
> Testing shows that performance varies across different platforms, with
> some showing degradation.
>
> CONFIG_RTE_ARM_USE_WFE should be enabled depending on the performance on
> the target platforms.
>
> V13:
> - added release notes update,
> - reworked arm implementation to avoid exporting inlines,
> - added assert in generic implementation,
>
> V12:
> - remove the 'rte_' prefix from the arm specific functions (David Marchand)
> - use the __atomic_load_ex_xx functions in arm specific implementations of
>   APIs (David Marchand)
> - remove the experimental warnings (David Marchand)
> - tweak the macros working scope (David Marchand)
> V11:
> - add rte_ prefix to the __atomic_load_ex_x functions (Ananyev Konstantin)
> - define the above rte_atomic_load_ex_x functions even if not
>   RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED for future non-wfe usages (Ananyev
>   Konstantin)
> - use the above functions for arm specific rte_wait_until_equal_x functions
>   (Ananyev Konstantin)
> - simplify the generic implementation by immersing "if" into "while"
>   (Ananyev Konstantin)
>
> V10:
> - move arm specific stuff to arch/arm/rte_pause_64.h (Ananyev Konstantin)
>
> V9:
> - fix a broken weblink (David Marchand)
> - define rte_wfe and rte_sev() (Ananyev Konstantin)
> - explicitly define three function APIs instead of macros (Ananyev
>   Konstantin)
> - incorporate common rte_wfe and rte_sev into the generic rte_spinlock
>   (David Marchand)
> - define arch neutral RTE_WAIT_UNTIL_EQUAL_ARCH_DEFINED (Ananyev
>   Konstantin)
> - define rte_load_ex_16/32/64 functions to use the load-exclusive
>   instruction for aarch64, which is required for WFE wake-up
> - drop the rte_spinlock patch from this series, as it calls this
>   experimental API and is widely included by a lot of components, each of
>   which requires ALLOW_EXPERIMENTAL_API in the Makefile and meson.build;
>   leave it to the future, after the experimental tag is removed.
>
> V8:
> - simplify dmb definition to use io barriers (David Marchand)
> - define wfe() and sev() macros and use them inside normal C code (Ananyev
>   Konstantin)
> - pass memorder as a parameter instead of incorporating it into the
>   function name, giving fewer functions, similar to C11 atomic intrinsics
>   (Ananyev Konstantin)
> - remove mandating RTE_FORCE_INTRINSICS in arm spinlock implementation
>   (David Marchand)
> - undef __WAIT_UNTIL_EQUAL after use (David Marchand)
> - add experimental tag and warning (David Marchand)
> - add the limitation of using WFE instruction in the commit log (David
>   Marchand)
> - tweak the use of RTE_FORCE_INTRINSICS (still mandatory for aarch64) and
>   RTE_ARM_USE_WFE for spinlock (David Marchand)
> - drop the rte_ring patch from this series, as rte_ring.h calls this API
>   and is widely included by a lot of components, each of which requires
>   ALLOW_EXPERIMENTAL_API in the Makefile and meson.build; leave it to the
>   future, after the experimental tag is removed.
>
> V7:
> - fix the checkpatch LONG_LINE_COMMENT issue
>
> V6:
> - squash the RTE_ARM_USE_WFE configuration entry patch into the new API
>   patch
> - move the new configuration to the end of EAL
> - add doxygen comments to reflect the relaxed and acquire semantics
> - correct the meson configuration
>
> V5:
> - add doxygen comments for the new APIs
> - spinlock early exit without wfe if the spinlock is not taken by others
> - add two patches on top for opdl and thunderx
>
> V4:
> - rename the config as CONFIG_RTE_ARM_USE_WFE to indicate it applies to arm
>   only
> - introduce a macro for assembly skeleton to reduce the duplication of code
> - add one patch for nxp fslmc to address a compiling error
>
> V3:
> - Convert RFCs to patches
>
> V2:
> - Use inline functions instead of macros
> - Add load and compare in the beginning of the APIs
> - Fix some style errors in asm inline
>
> V1:
> - Add the new APIs and use it for ring and locks
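
As an illustration of the "locks" use case mentioned in the changelog, a
ticket lock maps naturally onto a wait-until-equal primitive: each
locker takes a ticket and waits until the "now serving" counter equals
it. This is a sketch with hypothetical names (struct ticket_lock,
wait_until_equal_16), using the GCC/Clang __atomic builtins and a plain
polling loop, not the DPDK implementation:

```c
#include <stdint.h>

struct ticket_lock {
	uint16_t next;    /* next ticket to hand out */
	uint16_t current; /* ticket currently being served */
};

/* Plain polling stand-in for a wait-until-equal primitive. */
static inline void
wait_until_equal_16(volatile uint16_t *addr, uint16_t expected, int memorder)
{
	while (__atomic_load_n(addr, memorder) != expected)
		;
}

static inline void
ticket_lock_acquire(struct ticket_lock *tl)
{
	/* Take a ticket, then wait until it is being served. */
	uint16_t me = __atomic_fetch_add(&tl->next, 1, __ATOMIC_RELAXED);

	wait_until_equal_16(&tl->current, me, __ATOMIC_ACQUIRE);
}

static inline void
ticket_lock_release(struct ticket_lock *tl)
{
	/* Only the lock holder writes 'current', so a relaxed load is enough. */
	uint16_t cur = __atomic_load_n(&tl->current, __ATOMIC_RELAXED);

	__atomic_store_n(&tl->current, (uint16_t)(cur + 1), __ATOMIC_RELEASE);
}
```

The acquire ordering on the wait pairs with the release store in
ticket_lock_release(), so writes made inside the critical section are
visible to the next lock holder.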

Series applied.
Thanks.


-- 
David Marchand