Hi, Honnappa I'm sorry for delay - I had to play with patch, to compile this one with assembly listings and check what code is actually generated on x86/ARM. On x86 there is no difference at all (with and w/o patch), so no objection from my side. On ARM we have: w/o patch: dmb oshld with patch: dmb ishld What is the purpose of the barrier - to not allow the CQE read access reordering. On x86, "compiler barrier" is quite enough (due to strong load/store ordering). On ARM load/store might be reordered, AFAIU. CQE resides in the host memory and can be directly written by the NIC via PCIe. (In my understanding it can cause core cache(s) invalidations). I have a question - should we consider this as outer sharable domain? Is it safe to have a barrier for inner domain only in our case? We have updated the cqe_check() routine, sorry for this. Could you, please, update the patch and send v3? With best regards, Slava > -----Original Message----- > From: Honnappa Nagarahalli > Sent: вторник, 15 ноября 2022 г. 03:46 > To: Slava Ovsiienko ; dev@dpdk.org; Ruifeng Wang > ; Matan Azrad ; Shahaf Shuler > > Cc: nd ; Matan Azrad ; stable@dpdk.org; > nd > Subject: RE: [PATCH v2] net/mlx5: use just sufficient barrier for Arm platforms > > > > > > > > > > > Hi, Honnappa > > Hi Slava, thanks for the feedback. > > > > > > > > We discussed the barrier here: > > > http://patches.dpdk.org/project/dpdk/patch/20210606164948.35997-1- > > > honnappa.nagarahalli@arm.com/ > > Yes, I have changed the patch according to the discussion. i.e. > > barrier is needed, but different (inner sharable domain) barrier is required. > > > > > > > > (BTW, it is good practice to keep the reference to previous patch > > > versions below Commit Message of the next ones). > > > > > > This barrier is not about compiler ordering, it is about external HW > > > agent memory action completions. > > > So, I'm not sure the rte_atomic_thread_fence() is safe for x86 - > > > patch impacts > > > x86 as well. > > The earlier barrier 'rte_io_rmb()', resolves to a compiler barrier on x86 [1]. > > The rte_atomic_thread_fence(__ATOMIC_ACQUIRE) on x86 also acts as a > > compiler barrier. So, there is no change for x86. > > > > > > [1] > > https://github.com/DPDK/dpdk/blob/main/lib/eal/x86/include/rte_atomic. > > h# > > L80 > Hi Slava, any more comments on this? > > > > > > > > > With best regards, > > > Slava > > > > > > > -----Original Message----- > > > > From: Honnappa Nagarahalli > > > > Sent: Tuesday, August 30, 2022 23:01 > > > > To: dev@dpdk.org; honnappa.nagarahalli@arm.com; > > > ruifeng.wang@arm.com; > > > > Matan Azrad ; Shahaf Shuler > > ; > > > > Slava Ovsiienko > > > > Cc: nd@arm.com; Matan Azrad ; stable@dpdk.org > > > > Subject: [PATCH v2] net/mlx5: use just sufficient barrier for Arm > > > > platforms > > > > > > > > cqe->op_own indicates if the CQE is owned by the NIC. The rest of > > > > the fields in CQE should be read only after op_own is read. On Arm > > > > platforms using "dmb ishld" is sufficient to enforce this. > > > > > > > > Fixes: 88c0733535d6 ("net/mlx5: extend Rx completion with error > > > > handling") > > > > Cc: matan@mellanox.com > > > > Cc: stable@dpdk.org > > > > > > > > Signed-off-by: Honnappa Nagarahalli > > > > Reviewed-by: Ruifeng Wang > > > > --- > > > > drivers/common/mlx5/mlx5_common.h | 6 +++++- > > > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/common/mlx5/mlx5_common.h > > > > b/drivers/common/mlx5/mlx5_common.h > > > > index 5028a05b49..ac2e85b15f 100644 > > > > --- a/drivers/common/mlx5/mlx5_common.h > > > > +++ b/drivers/common/mlx5/mlx5_common.h > > > > @@ -195,7 +195,11 @@ check_cqe(volatile struct mlx5_cqe *cqe, > > > > const uint16_t cqes_n, > > > > > > > > if (unlikely((op_owner != (!!(idx))) || (op_code == > > > > MLX5_CQE_INVALID))) > > > > return MLX5_CQE_STATUS_HW_OWN; > > > > - rte_io_rmb(); > > > > + /* Prevent speculative reading of other fields in CQE until > > > > + * CQE is valid. > > > > + */ > > > > + rte_atomic_thread_fence(__ATOMIC_ACQUIRE); > > > > + > > > > if (unlikely(op_code == MLX5_CQE_RESP_ERR || > > > > op_code == MLX5_CQE_REQ_ERR)) > > > > return MLX5_CQE_STATUS_ERR; > > > > -- > > > > 2.17.1