From: Jerin Jacob
Date: Tue, 12 May 2020 13:58:46 +0530
To: Ruifeng Wang
Cc: Honnappa Nagarahalli, "dev@dpdk.org", "jerinj@marvell.com", "hemant.agrawal@nxp.com", "Ajit Khaparde (ajit.khaparde@broadcom.com)", "igorch@amazon.com", "thomas@monjalon.net", "viacheslavo@mellanox.com", "arybchenko@solarflare.com", nd
Subject: Re: [dpdk-dev] [RFC] eal: adjust barriers for IO on Armv8-a
References: <20200410164127.54229-1-gavin.hu@arm.com> <20200511180637.22200-1-honnappa.nagarahalli@arm.com>
List-Id: DPDK patches and discussions

On Tue, May 12, 2020 at 1:32 PM Ruifeng Wang wrote:
>
>
> > -----Original Message-----
> > From: Jerin Jacob
> > Sent: Tuesday, May 12, 2020 2:42 PM
> > To: Ruifeng Wang
> > Cc: Honnappa Nagarahalli; dev@dpdk.org; jerinj@marvell.com;
> > hemant.agrawal@nxp.com; Ajit Khaparde (ajit.khaparde@broadcom.com);
> > igorch@amazon.com; thomas@monjalon.net; viacheslavo@mellanox.com;
> > arybchenko@solarflare.com; nd
> > Subject: Re: [dpdk-dev] [RFC] eal: adjust barriers for IO on Armv8-a
> >
> > On Tue, May 12, 2020 at 11:48 AM Ruifeng Wang wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Honnappa Nagarahalli
> > > > Sent: Tuesday, May 12, 2020 2:07 AM
> > > > To: dev@dpdk.org; jerinj@marvell.com; hemant.agrawal@nxp.com;
> > > > Ajit Khaparde (ajit.khaparde@broadcom.com); igorch@amazon.com;
> > > > thomas@monjalon.net; viacheslavo@mellanox.com;
> > > > arybchenko@solarflare.com; Honnappa Nagarahalli
> > > > Cc: Ruifeng Wang; nd
> > > > Subject: [RFC] eal: adjust barriers for IO on Armv8-a
> > > >
> > > > Change the barrier APIs for IO to reflect that Armv8-a is
> > > > other-multi-copy atomicity memory model.
> > > >
> > > > Armv8-a memory model has been strengthened to require
> > > > other-multi-copy atomicity. This property requires memory accesses
> > > > from an observer to become visible to all other observers
> > > > simultaneously [3]. This means
> > > >
> > > > a) A write arriving at an endpoint shared between multiple CPUs is
> > > >    visible to all CPUs
> > > > b) A write that is visible to all CPUs is also visible to all other
> > > >    observers in the shareability domain
> > > >
> > > > This allows for using cheaper DMB instructions in the place of DSB
> > > > for devices that are visible to all CPUs (i.e. devices that DPDK caters to).
> > > >
> > > > Please refer to [1], [2] and [3] for more information.
> > > >
> > > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=22ec71615d824f4f11d38d0e55a88d8956b7e45f
> > > > [2] https://www.youtube.com/watch?v=i6DayghhA8Q
> > > > [3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
> > > >
> > > > Signed-off-by: Honnappa Nagarahalli
> > > > ---
> > > >  lib/librte_eal/arm/include/rte_atomic_64.h | 10 +++++-----
> > > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/arm/include/rte_atomic_64.h b/lib/librte_eal/arm/include/rte_atomic_64.h
> > > > index 7b7099cdc..e406411bb 100644
> > > > --- a/lib/librte_eal/arm/include/rte_atomic_64.h
> > > > +++ b/lib/librte_eal/arm/include/rte_atomic_64.h
> > > > @@ -19,11 +19,11 @@ extern "C" {
> > > >  #include
> > > >  #include
> > > >
> > > > -#define rte_mb() asm volatile("dsb sy" : : : "memory")
> > > > +#define rte_mb() asm volatile("dmb osh" : : : "memory")
> > > >
> > > > -#define rte_wmb() asm volatile("dsb st" : : : "memory")
> > > > +#define rte_wmb() asm volatile("dmb oshst" : : : "memory")
> > > >
> > > > -#define rte_rmb() asm volatile("dsb ld" : : : "memory")
> > > > +#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
> > > >
> > > >  #define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
> > > >
> > > > @@ -37,9 +37,9 @@ extern "C" {
> > > >
> > > >  #define rte_io_rmb() rte_rmb()
> > > >
> > > > -#define rte_cio_wmb() asm volatile("dmb oshst" : : : "memory")
> > > > +#define rte_cio_wmb() rte_wmb()
> > > >
> > > > -#define rte_cio_rmb() asm volatile("dmb oshld" : : : "memory")
> > > > +#define rte_cio_rmb() rte_rmb()
> > > >
> > > >  /*------------------------ 128 bit atomic operations -------------------------*/
> > > >
> > > > --
> > > > 2.17.1
> > >
> > > This change showed about 7% performance gain in testpmd single core
> > > NDR test.
> >
> > I am trying to understand this patch wrt DPDK current usage model?
> >
> > 1) Is performance improvement due to the fact that the PMD that you are
> > using it for testing suppose to use existing rte_cio_* but it was using
> > rte_[rw]mb?
>
> This is part of the reason. There are also cases where rte_io_* was used and can be relaxed.
> Such as: http://patches.dpdk.org/patch/68162/
>
> > 2) In my understanding :
> > a) CPU to CPU barrier requirements are addressed by rte_smp_*
> > b) CPU to DMA/Device barrier requirements are addressed by rte_cio_*
> > c) CPU to ANY(CPU or Device) are addressed by rte_[rw]mb
> >
> > If (c) is true then we are violating the DPDK spec with change. Right?
>
> Developers are still required to use correct barrier APIs for different use cases.
> I think this change mitigates performance penalty when non optimal barrier is used.

But does it violate the contract? We are using rte_[rw]mb as the
low-performance, heavyweight barrier for all cases. I think that is the
contract with DPDK consumers. For different requirements, we have specific
APIs. IMO, it makes sense to change the fastpath code to use more
fine-grained barriers based on need, rather than making the generic one
lightweight. i.e. rte_[rw]mb is the superset that works in all cases, and a
customized barrier is used for each specific use case.

>
> > This change will not be required if fastpath (CPU to Device) is using rte_cio_*.
> > Right?
>
> See 1). Correct usage of rte_cio_* is not the whole.
> For some other use cases, such as barrier between accesses of different memory types, we can also use lighter barrier 'dmb'.
>
> >
> > > >
> > > Tested-by: Ruifeng Wang
> > >
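
To make the fastpath-versus-generic distinction concrete, here is a minimal sketch of a descriptor post followed by a doorbell write. Everything prefixed demo_ is hypothetical and is not DPDK code: the macros are stand-ins for rte_wmb() and rte_cio_wmb(), using the pre-patch AArch64 instructions quoted in the diff above ("dsb st" and "dmb oshst"). On other architectures a compiler fence is assumed, only so that the sketch builds; the doorbell is a plain volatile field standing in for a device register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT DPDK's definitions. On AArch64 they mirror the
 * pre-patch instructions shown in the quoted diff; elsewhere a compiler
 * fence is assumed so the example still compiles and runs. */
#if defined(__aarch64__)
#define demo_wmb()     __asm__ volatile("dsb st" : : : "memory")
#define demo_cio_wmb() __asm__ volatile("dmb oshst" : : : "memory")
#else
#define demo_wmb()     __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define demo_cio_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

#define DEMO_RING_SIZE 8u

struct demo_desc {
    uint64_t addr;
    uint32_t len;
};

struct demo_queue {
    struct demo_desc ring[DEMO_RING_SIZE]; /* descriptors in coherent memory */
    volatile uint32_t doorbell;            /* stand-in for a device register */
    uint32_t tail;
};

/* Fill the next descriptor slot and return it. */
static inline struct demo_desc *
demo_fill(struct demo_queue *q, uint64_t addr, uint32_t len)
{
    struct demo_desc *d;

    assert(q != NULL);
    d = &q->ring[q->tail % DEMO_RING_SIZE];
    d->addr = addr;
    d->len = len;
    return d;
}

/* Fast path: order the descriptor stores before the doorbell store with the
 * lighter, use-case-specific barrier (case b in the thread). */
static inline void
demo_post_fast(struct demo_queue *q, uint64_t addr, uint32_t len)
{
    demo_fill(q, addr, len);
    demo_cio_wmb();
    q->doorbell = ++q->tail;
}

/* Generic path: the same sequence with the heavyweight barrier that is
 * expected to be safe for any observer (case c in the thread). */
static inline void
demo_post_generic(struct demo_queue *q, uint64_t addr, uint32_t len)
{
    demo_fill(q, addr, len);
    demo_wmb();
    q->doorbell = ++q->tail;
}
```

Both functions perform the same stores; only the barrier differs. That is the point under discussion: a driver that knows its device is coherent can pick the lighter barrier at the call site, while the generic barrier keeps its stronger guarantee for callers that cannot make that assumption.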