From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <thomas@monjalon.net>
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by dpdk.org (Postfix) with ESMTP id CD73E37B0
 for <dev@dpdk.org>; Fri, 22 Mar 2019 09:49:08 +0100 (CET)
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
 by mailout.nyi.internal (Postfix) with ESMTP id 27A0C236CA;
 Fri, 22 Mar 2019 04:49:08 -0400 (EDT)
Received: from mailfrontend1 ([10.202.2.162])
 by compute1.internal (MEProxy); Fri, 22 Mar 2019 04:49:08 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding:content-type; s=mesmtp;
 bh=pguyjRnaLB1k2QUcM5EnvEc+OkVrNpDYNMTorH8v27Q=; b=S67bG1RroWMa
 /1wZyYw/tf2uJNi3sDLjEpAaI8lUmMQqyRI6YikEortQC0dipWinWdB6m4Boa8Kp
 uEUzh7eeVCxziTH7yiSUlpPOcg/aKbvOGQmObUMiVzk/ocGFjZSFCAOV/NfAnb0J
 82Q3mmsnjgeiZyXTYt3Cp85CibBxjlo=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:content-transfer-encoding:content-type
 :date:from:in-reply-to:message-id:mime-version:references
 :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender
 :x-sasl-enc; s=fm2; bh=pguyjRnaLB1k2QUcM5EnvEc+OkVrNpDYNMTorH8v2
 7Q=; b=a8HjFp0ANE4uNBbhGsEhT5g3jy2f3qbP55FdW6S1t57v0ABwM9pB/qPG+
 YMmE9/s/+ExU75rKeNhmLolJOrpUk5awDesAVDLSBmiv8HZKdPEELUrBfBcJGrUI
 m2aPlB1XIMioTS69tf6vPlgjA9NOznlnIfxZM6q4xx0xitxIIcEcEYKetmCHsLsy
 UovzpIWPGhzwzzOtLIGCq2hzQlbfyJs1OKGncAB+c48B0hI6Y/gvl9rnFy4fyGw5
 S7ty94apUAezjx9397HABu8Cp8M9OSBMrYX7nN1I6KUy7ST+7dAn4Ej9QapCHDEa
 bRz0BQiB1fxk76RTkLhA6V+lnz9Sg==
X-ME-Sender: <xms:gqGUXHvoaA-mBLO_UFjeAskM4S8JUyRzbE6xqR7is2FeqI_vGr4ZtA>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedutddrjedtgdduvdehucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen
 uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne
 cujfgurhephffvufffkfgjfhgggfgtsehtqhertddttdejnecuhfhrohhmpefvhhhomhgr
 shcuofhonhhjrghlohhnuceothhhohhmrghssehmohhnjhgrlhhonhdrnhgvtheqnecuff
 homhgrihhnpeguphgukhdrohhrghenucfkphepjeejrddufeegrddvtdefrddukeegnecu
 rfgrrhgrmhepmhgrihhlfhhrohhmpehthhhomhgrshesmhhonhhjrghlohhnrdhnvghtne
 cuvehluhhsthgvrhfuihiivgeptd
X-ME-Proxy: <xmx:g6GUXMtzEm22BQtxium4LSLDNaQT4Od2jgFpbrh0rVWoWuI-iG7WUg>
 <xmx:g6GUXFycso-Qgo2tZCCZU29zmTXVQopW9d0LFHXXzZpm8-Gfiqp5zg>
 <xmx:g6GUXKhEVqMGJNB9Rqutr6mNPq-wrI-clDWh7kQnQc5YqfOWRYkuTA>
 <xmx:hKGUXOoqJoBgQ5OMN0cmwlshDs4mWbY-HrNj9CMYk08NMzaiOMRw0Q>
Received: from xps.localnet (184.203.134.77.rev.sfr.net [77.134.203.184])
 by mail.messagingengine.com (Postfix) with ESMTPA id 4A34BE4176;
 Fri, 22 Mar 2019 04:49:05 -0400 (EDT)
From: Thomas Monjalon <thomas@monjalon.net>
To: Pradeep Satyanarayana <pradeep@us.ibm.com>,
 David Wilder <wilder@us.ibm.com>
Cc: Shahaf Shuler <shahafs@mellanox.com>, Chao Zhu <chaozhu@linux.vnet.ibm.com>,
 Dekel Peled <dekelp@mellanox.com>, dev@dpdk.org,
 David Christensen <drc@ibm.com>, Ori Kam <orika@mellanox.com>,
 Yongseok Koh <yskoh@mellanox.com>, konstantin.ananyev@intel.com,
 ola.liljedahl@arm.com, honnappa.nagarahalli@arm.com,
 bruce.richardson@intel.com
Date: Fri, 22 Mar 2019 09:49:03 +0100
Message-ID: <11283309.AIL3tCH6tf@xps>
In-Reply-To: <OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com>
References: <1552913893-43407-1-git-send-email-dekelp@mellanox.com>
 <AM0PR0502MB3795291036D9019239391AF4C3420@AM0PR0502MB3795.eurprd05.prod.outlook.com>
 <OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [dpdk-dev] [PATCH] eal/ppc: remove fix of memory barrier for
	IBM POWER
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Mar 2019 08:49:09 -0000

We need to agree on the definitions.
Please see below,

22/03/2019 02:40, Pradeep Satyanarayana:
> Shahaf Shuler <shahafs@mellanox.com> wrote on 03/21/2019 01:49:39 AM:
> > Pradeep Satyanarayana <pradeep@us.ibm.com> wrote on Thu 3/21/2019 12:41
> AM:
> > >> > So far, when not running on power, we used the rte_wmb for that.
> > >> On x86 and ARM systems it provided the needed guarantees.
> > >> > It is also mentioned in the barrier doxygen on ARM arch:
> > >> > "
> > >> > Write memory barrier.
> > >> >
> > >> > Guarantees that the STORE operations generated before the barrier
> > >> > occur before the STORE operations generated after.
> > >> > "
> > >> >
> > >> > It doesn't restrict to store to system memory only.
> > >> > w/ power is on somewhat different and in fact rte_mb is required.
> > >> It obviously miss the point of those barrier if we will need to use
> > >> a different barrier based on the system arch.
> > >> >
> > >> > We need to align the definition of the different barriers in DPDK:
> > >> > 1. need a clear documentation of each. this should be global and
> > >> not part of the specific implementation on each arch.
> > >
> > >A single approach may not work for all architectures. Power is differe=
nt
> > >from others, so we need to be able to accommodate that. More comments
> below.
> >
> > it don't get this claim.
> > It is ok to have some differences between the different arch, but
> > here you implement a well-defined barrier - rte_wmb.
> > if you see a need we can discuss to define a **new** barrier which
> > sync STORE only to system memory, and will be able to utilize the
> > lwsync command.
> >
> > >
> > >> The global definition is in lib/librte_eal/common/include/
> > generic/rte_atomic.h
> > >>
> > >> There are some copy/paste in Arm32 and PPC that I will remove.
> > >>
> > >> > 2. either modify ppc rte_wmb to match ARM and x86 ones or to
> > >> define a new type of barrier which will sync between both I/O and
> > >> stores to systems memory.
> > >>
> > >> The basic memory barrier of DPDK does not mention
> > >> a difference between I/O and system memory.
> > >
> > >In the case of Power, sync will cater to both I/O and system
> > memory. However, that
> > >is too big a hammer in all cases.
> >
> > rte_wmb requires such sync. you propose to have the wrong barrier in
> > favor of performance.
> > to mitigate this you can take my suggestion above and define a new,
> > more lightweight one.
> >
> > >
> > >> It is not explicit (yet) but I assume it is protecting both.
> > >> So, in my opinion, we need to make it explicit in the doc,
> > >> and fix the PPC implementation to comply with this definition.
> > >>
> > >> Anyway, I don't see any significant effort from IBM to move from
> > >> the alpha support stage to a real Open Source support.
> > >> PS: sending a mail every two months, to promise improvements, is
> > not enough!
> > >
> >
> > [=E2=80=A6]
> >
> > >
> > >We should retain lwsync, should not be removed as discussed in here:
> > >
> > >http://mails.dpdk.org/archives/dev/2019-March/126746.html
> >
> > i don't agree.
> > it is very clear the rte_wmb implementation in power is broken and
> > we need to fix this right away before other customers will hit the
> > same issue.
>=20
>=20
> In the DPDK source I see a couple of different classes of memory barriers.
> I am
> not clear on the usage of these in the drivers, but I would think the
> guidelines
> to be as shown below (for Power):
>=20
> - rte_[rw]mb (general memory barrier) --> should be lwsync

This is what may be discussed.
The assumption is that the general memory barrier should cover
all cases (CPU caches, SMP and I/O).
That's why we think it should "sync" for Power.

> - rte_smp_[rw]mb (SMP memory barrier) -->should be lwsync
> - rte_io_[rw]mb (I/O memory barrier)  --> should be sync
> - rte_cio_[rw]mb (coherent I/O memory barrier) -->should be sync
>=20
> lwsync is appropriate for cases where CPUs are accessing cacheable memory
> (i.e. Memory Coherence Required) while the sync instruction should be used
> in all other cases.
>=20
> With the patch in:
> http://mails.dpdk.org/archives/dev/2019-March/126746.html
>=20
> It converts even the rte_smp_[rw]mb into sync. That is not what the
> rte_smp*() should
> be implementing as per the guidelines above.
>=20
> static __rte_always_inline void
> mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe
> *wqe,
>                        int cond)
> {
>         uint64_t *dst =3D (uint64_t *)((uintptr_t)txq->bf_reg);
>         volatile uint64_t *src =3D ((volatile uint64_t *)wqe);
>=20
>         rte_cio_wmb(); --> would rte_cio_rmb() be more appropriate?
>         *txq->qp_db =3D rte_cpu_to_be_32(txq->wqe_ci);
>         /* Ensure ordering between DB record and BF copy. */
>         rte_wmb(); --> what has been established is that for Power we need
> "sync" instead of lwsync
> 		We are dealing with device memory -should we be using an
> rte_io_wmb() here?
>=20
>         mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
>         if (cond)
>                 rte_wmb(); --> what about here? Are there conditions when
> we need sync?
> }

From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by dpdk.space (Postfix) with ESMTP id F1668A00E6
	for <public@inbox.dpdk.org>; Fri, 22 Mar 2019 09:49:10 +0100 (CET)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 5D98A1B55C;
	Fri, 22 Mar 2019 09:49:10 +0100 (CET)
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by dpdk.org (Postfix) with ESMTP id CD73E37B0
 for <dev@dpdk.org>; Fri, 22 Mar 2019 09:49:08 +0100 (CET)
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
 by mailout.nyi.internal (Postfix) with ESMTP id 27A0C236CA;
 Fri, 22 Mar 2019 04:49:08 -0400 (EDT)
Received: from mailfrontend1 ([10.202.2.162])
 by compute1.internal (MEProxy); Fri, 22 Mar 2019 04:49:08 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding:content-type; s=mesmtp;
 bh=pguyjRnaLB1k2QUcM5EnvEc+OkVrNpDYNMTorH8v27Q=; b=S67bG1RroWMa
 /1wZyYw/tf2uJNi3sDLjEpAaI8lUmMQqyRI6YikEortQC0dipWinWdB6m4Boa8Kp
 uEUzh7eeVCxziTH7yiSUlpPOcg/aKbvOGQmObUMiVzk/ocGFjZSFCAOV/NfAnb0J
 82Q3mmsnjgeiZyXTYt3Cp85CibBxjlo=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:content-transfer-encoding:content-type
 :date:from:in-reply-to:message-id:mime-version:references
 :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender
 :x-sasl-enc; s=fm2; bh=pguyjRnaLB1k2QUcM5EnvEc+OkVrNpDYNMTorH8v2
 7Q=; b=a8HjFp0ANE4uNBbhGsEhT5g3jy2f3qbP55FdW6S1t57v0ABwM9pB/qPG+
 YMmE9/s/+ExU75rKeNhmLolJOrpUk5awDesAVDLSBmiv8HZKdPEELUrBfBcJGrUI
 m2aPlB1XIMioTS69tf6vPlgjA9NOznlnIfxZM6q4xx0xitxIIcEcEYKetmCHsLsy
 UovzpIWPGhzwzzOtLIGCq2hzQlbfyJs1OKGncAB+c48B0hI6Y/gvl9rnFy4fyGw5
 S7ty94apUAezjx9397HABu8Cp8M9OSBMrYX7nN1I6KUy7ST+7dAn4Ej9QapCHDEa
 bRz0BQiB1fxk76RTkLhA6V+lnz9Sg==
X-ME-Sender: <xms:gqGUXHvoaA-mBLO_UFjeAskM4S8JUyRzbE6xqR7is2FeqI_vGr4ZtA>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedutddrjedtgdduvdehucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen
 uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne
 cujfgurhephffvufffkfgjfhgggfgtsehtqhertddttdejnecuhfhrohhmpefvhhhomhgr
 shcuofhonhhjrghlohhnuceothhhohhmrghssehmohhnjhgrlhhonhdrnhgvtheqnecuff
 homhgrihhnpeguphgukhdrohhrghenucfkphepjeejrddufeegrddvtdefrddukeegnecu
 rfgrrhgrmhepmhgrihhlfhhrohhmpehthhhomhgrshesmhhonhhjrghlohhnrdhnvghtne
 cuvehluhhsthgvrhfuihiivgeptd
X-ME-Proxy: <xmx:g6GUXMtzEm22BQtxium4LSLDNaQT4Od2jgFpbrh0rVWoWuI-iG7WUg>
 <xmx:g6GUXFycso-Qgo2tZCCZU29zmTXVQopW9d0LFHXXzZpm8-Gfiqp5zg>
 <xmx:g6GUXKhEVqMGJNB9Rqutr6mNPq-wrI-clDWh7kQnQc5YqfOWRYkuTA>
 <xmx:hKGUXOoqJoBgQ5OMN0cmwlshDs4mWbY-HrNj9CMYk08NMzaiOMRw0Q>
Received: from xps.localnet (184.203.134.77.rev.sfr.net [77.134.203.184])
 by mail.messagingengine.com (Postfix) with ESMTPA id 4A34BE4176;
 Fri, 22 Mar 2019 04:49:05 -0400 (EDT)
From: Thomas Monjalon <thomas@monjalon.net>
To: Pradeep Satyanarayana <pradeep@us.ibm.com>,
 David Wilder <wilder@us.ibm.com>
Cc: Shahaf Shuler <shahafs@mellanox.com>, Chao Zhu <chaozhu@linux.vnet.ibm.com>,
 Dekel Peled <dekelp@mellanox.com>, dev@dpdk.org,
 David Christensen <drc@ibm.com>, Ori Kam <orika@mellanox.com>,
 Yongseok Koh <yskoh@mellanox.com>, konstantin.ananyev@intel.com,
 ola.liljedahl@arm.com, honnappa.nagarahalli@arm.com,
 bruce.richardson@intel.com
Date: Fri, 22 Mar 2019 09:49:03 +0100
Message-ID: <11283309.AIL3tCH6tf@xps>
In-Reply-To: <OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com>
References: <1552913893-43407-1-git-send-email-dekelp@mellanox.com>
 <AM0PR0502MB3795291036D9019239391AF4C3420@AM0PR0502MB3795.eurprd05.prod.outlook.com>
 <OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [dpdk-dev] [PATCH] eal/ppc: remove fix of memory barrier for
	IBM POWER
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>
Message-ID: <20190322084903.WVYzJMDCQKrqYJIpEiJP_yv1gyihQD9LUpDFCQd4dnI@z>

We need to agree on the definitions.
Please see below,

22/03/2019 02:40, Pradeep Satyanarayana:
> Shahaf Shuler <shahafs@mellanox.com> wrote on 03/21/2019 01:49:39 AM:
> > Pradeep Satyanarayana <pradeep@us.ibm.com> wrote on Thu 3/21/2019 12:41
> AM:
> > >> > So far, when not running on power, we used the rte_wmb for that.
> > >> On x86 and ARM systems it provided the needed guarantees.
> > >> > It is also mentioned in the barrier doxygen on ARM arch:
> > >> > "
> > >> > Write memory barrier.
> > >> >
> > >> > Guarantees that the STORE operations generated before the barrier
> > >> > occur before the STORE operations generated after.
> > >> > "
> > >> >
> > >> > It doesn't restrict to store to system memory only.
> > >> > w/ power is on somewhat different and in fact rte_mb is required.
> > >> It obviously miss the point of those barrier if we will need to use
> > >> a different barrier based on the system arch.
> > >> >
> > >> > We need to align the definition of the different barriers in DPDK:
> > >> > 1. need a clear documentation of each. this should be global and
> > >> not part of the specific implementation on each arch.
> > >
> > >A single approach may not work for all architectures. Power is differe=
nt
> > >from others, so we need to be able to accommodate that. More comments
> below.
> >
> > it don't get this claim.
> > It is ok to have some differences between the different arch, but
> > here you implement a well-defined barrier - rte_wmb.
> > if you see a need we can discuss to define a **new** barrier which
> > sync STORE only to system memory, and will be able to utilize the
> > lwsync command.
> >
> > >
> > >> The global definition is in lib/librte_eal/common/include/
> > generic/rte_atomic.h
> > >>
> > >> There are some copy/paste in Arm32 and PPC that I will remove.
> > >>
> > >> > 2. either modify ppc rte_wmb to match ARM and x86 ones or to
> > >> define a new type of barrier which will sync between both I/O and
> > >> stores to systems memory.
> > >>
> > >> The basic memory barrier of DPDK does not mention
> > >> a difference between I/O and system memory.
> > >
> > >In the case of Power, sync will cater to both I/O and system
> > memory. However, that
> > >is too big a hammer in all cases.
> >
> > rte_wmb requires such sync. you propose to have the wrong barrier in
> > favor of performance.
> > to mitigate this you can take my suggestion above and define a new,
> > more lightweight one.
> >
> > >
> > >> It is not explicit (yet) but I assume it is protecting both.
> > >> So, in my opinion, we need to make it explicit in the doc,
> > >> and fix the PPC implementation to comply with this definition.
> > >>
> > >> Anyway, I don't see any significant effort from IBM to move from
> > >> the alpha support stage to a real Open Source support.
> > >> PS: sending a mail every two months, to promise improvements, is
> > not enough!
> > >
> >
> > [=E2=80=A6]
> >
> > >
> > >We should retain lwsync, should not be removed as discussed in here:
> > >
> > >http://mails.dpdk.org/archives/dev/2019-March/126746.html
> >
> > i don't agree.
> > it is very clear the rte_wmb implementation in power is broken and
> > we need to fix this right away before other customers will hit the
> > same issue.
>=20
>=20
> In the DPDK source I see a couple of different classes of memory barriers.
> I am
> not clear on the usage of these in the drivers, but I would think the
> guidelines
> to be as shown below (for Power):
>=20
> - rte_[rw]mb (general memory barrier) --> should be lwsync

This is what may be discussed.
The assumption is that the general memory barrier should cover
all cases (CPU caches, SMP and I/O).
That's why we think it should "sync" for Power.

> - rte_smp_[rw]mb (SMP memory barrier) -->should be lwsync
> - rte_io_[rw]mb (I/O memory barrier)  --> should be sync
> - rte_cio_[rw]mb (coherent I/O memory barrier) -->should be sync
>=20
> lwsync is appropriate for cases where CPUs are accessing cacheable memory
> (i.e. Memory Coherence Required) while the sync instruction should be used
> in all other cases.
>=20
> With the patch in:
> http://mails.dpdk.org/archives/dev/2019-March/126746.html
>=20
> It converts even the rte_smp_[rw]mb into sync. That is not what the
> rte_smp*() should
> be implementing as per the guidelines above.
>=20
> static __rte_always_inline void
> mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe
> *wqe,
>                        int cond)
> {
>         uint64_t *dst =3D (uint64_t *)((uintptr_t)txq->bf_reg);
>         volatile uint64_t *src =3D ((volatile uint64_t *)wqe);
>=20
>         rte_cio_wmb(); --> would rte_cio_rmb() be more appropriate?
>         *txq->qp_db =3D rte_cpu_to_be_32(txq->wqe_ci);
>         /* Ensure ordering between DB record and BF copy. */
>         rte_wmb(); --> what has been established is that for Power we need
> "sync" instead of lwsync
> 		We are dealing with device memory -should we be using an
> rte_io_wmb() here?
>=20
>         mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
>         if (cond)
>                 rte_wmb(); --> what about here? Are there conditions when
> we need sync?
> }