From: Thomas Monjalon
To: Shahaf Shuler
Cc: Dekel Peled, Chao Zhu, Yongseok Koh, "dev@dpdk.org", Ori Kam, "stable@dpdk.org", pradeep@us.ibm.com
Date: Tue, 19 Mar 2019 21:45:01 +0100
Message-ID: <12440555.pBuX5BjkXC@xps>
References: <1552913893-43407-1-git-send-email-dekelp@mellanox.com> <1789153.zrlSK8XYcq@xps>
Subject: Re: [dpdk-dev] [PATCH] eal/ppc: remove fix of memory barrier for IBM POWER

19/03/2019 20:42, Shahaf Shuler:
> Tuesday, March 19, 2019 1:15 PM, Thomas Monjalon:
> > Subject: Re: [PATCH] eal/ppc: remove fix of memory barrier for IBM POWER
> >
> > Guys, please let's avoid top-post.
> >
> > You are both not replying to each other:
> >
> > 1/ Dekel mentioned the IBM doc but Chao did not argue about the lack of IO
> > protection with lwsync.
> > We assume that rte_mb should protect any access including IO.
> >
> > 2/ Chao asked about the semantic of the barrier used in mlx5 code, but Dekel
> > did not reply about the semantic: are we protecting IO or general memory
> > access?
>
> In mlx5 code we want to sync between two different writes:
> 1. write to system memory (RAM)
> 2. write to MMIO memory (device)
>
> We need #1 to be visible on host memory before #2 is committed to NIC.
> We want to have a single type of barrier which will translate to the correct assembly based on the system arch, and in addition we want it light-weight as possible.
>
> So far, when not running on power, we used the rte_wmb for that. On x86 and ARM systems it provided the needed guarantees.
> It is also mentioned in the barrier doxygen on ARM arch:
> "
> Write memory barrier.
>
> Guarantees that the STORE operations generated before the barrier
> occur before the STORE operations generated after.
> "
>
> It doesn't restrict to store to system memory only.
> w/ power is on somewhat different and in fact rte_mb is required. It obviously miss the point of those barrier if we will need to use a different barrier based on the system arch.
>
> We need to align the definition of the different barriers in DPDK:
> 1. need a clear documentation of each. this should be global and not part of the specific implementation on each arch.

The global definition is in lib/librte_eal/common/include/generic/rte_atomic.h
There are some copy/paste in Arm32 and PPC that I will remove.

> 2. either modify ppc rte_wmb to match ARM and x86 ones or to define a new type of barrier which will sync between both I/O and stores to systems memory.

The basic memory barrier of DPDK does not mention a difference
between I/O and system memory.
It is not explicit (yet) but I assume it is protecting both.
So, in my opinion, we need to make it explicit in the doc,
and fix the PPC implementation to comply with this definition.

Anyway, I don't see any significant effort from IBM to move
from the alpha support stage to a real Open Source support.

PS: sending a mail every two months, to promise improvements,
is not enough!

-----------------

> > 19/03/2019 11:05, Dekel Peled:
> > > Hi,
> > >
> > > For ppc, rte_io_mb() is defined as rte_mb(), which is defined as asm sync.
> > > According to comments in arch/ppc_64/rte_atomic.h, rte_wmb() and
> > > rte_rmb() are the same as rte_mb(), for store and load respectively.
> > > My patch propose to define rte_wmb() and rte_rmb() as asm sync, like
> > > rte_mb(), since using lwsync is incorrect for them.
> > >
> > > Regards,
> > > Dekel
> > >
> > > From: Chao Zhu
> > > > Dekel,
> > > >
> > > > To control the memory order for device memory, I think you should
> > > > use rte_io_mb() instead of rte_mb(). This will generate correct result.
> > > > rte_wmb() is used for system memory.
> > > >
> > > > From: Dekel Peled
> > > > >
> > > > > From previous patch description: "to improve performance on PPC64,
> > > > > use light weight sync instruction instead of sync instruction."
> > > > >
> > > > > Excerpt from IBM doc [1], section "Memory barrier instructions":
> > > > > "The second form of the sync instruction is light-weight sync, or lwsync.
> > > > > This form is used to control ordering for storage accesses to
> > > > > system memory only. It does not create a memory barrier for
> > > > > accesses to device memory."
> > > > >
> > > > > This patch removes the use of lwsync, so calls to rte_wmb() and
> > > > > rte_rmb() will provide correct memory barrier to ensure order of
> > > > > accesses to system memory and device memory.
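
A minimal sketch of the ordering requirement described in the thread (store to
RAM, then store to MMIO, with one barrier in between), under stated assumptions.
Every name below (sketch_io_wmb, sketch_desc, sketch_post) is hypothetical and
is not a DPDK or mlx5 API. The PPC branch uses sync because, per the IBM text
quoted above, lwsync orders accesses to system memory only. The non-PPC branch
is only a placeholder full fence so that the sketch compiles everywhere; it is
not claimed to be the right instruction for real MMIO on any given architecture.

```c
#include <stdint.h>

/* Hypothetical barrier macro (an assumption for illustration, not DPDK code):
 * order earlier stores to system memory before later stores to device memory. */
#if defined(__powerpc64__) || defined(__powerpc__)
/* Heavyweight sync: the quoted IBM doc says lwsync does not cover device memory. */
#define sketch_io_wmb() __asm__ volatile("sync" ::: "memory")
#else
/* Placeholder for this sketch only: a full fence from the compiler builtins.
 * A real port would pick an architecture-specific barrier instead. */
#define sketch_io_wmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Hypothetical descriptor living in system memory (RAM). */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
};

/* Fill a descriptor in RAM, then ring a doorbell.
 * In a real driver the doorbell would be a mapped device register;
 * here it is just a volatile word so the sketch can run anywhere. */
static inline void
sketch_post(struct sketch_desc *slot, volatile uint32_t *doorbell,
	    uint64_t addr, uint32_t len, uint32_t tail)
{
	slot->addr = addr;      /* #1: write to system memory */
	slot->len = len;
	sketch_io_wmb();        /* #1 must be visible before #2 is issued */
	*doorbell = tail;       /* #2: write to (simulated) MMIO */
}
```

The point of the sketch is that the caller uses a single barrier name and the
per-architecture choice stays inside the macro, which is the property asked for
in the message above.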