From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id D851F1B1F2 for ; Wed, 9 Jan 2019 09:54:36 +0100 (CET) Received: from Internal Mail-Server by MTLPINE1 (envelope-from yskoh@mellanox.com) with ESMTPS (AES256-SHA encrypted); 9 Jan 2019 10:54:33 +0200 Received: from scfae-sc-2.mti.labs.mlnx (scfae-sc-2.mti.labs.mlnx [10.101.0.96]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id x098sVAm029611; Wed, 9 Jan 2019 10:54:32 +0200 From: Yongseok Koh To: shahafs@mellanox.com Cc: dev@dpdk.org, stable@dpdk.org Date: Wed, 9 Jan 2019 00:54:26 -0800 Message-Id: <20190109085426.39965-1-yskoh@mellanox.com> X-Mailer: git-send-email 2.11.0 Subject: [dpdk-dev] [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jan 2019 08:54:37 -0000 On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed to be accessed as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes unnecessary load stall and it is worsened on ARM. Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh --- drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h index fda7004e2d..ced5547307 100644 --- a/drivers/net/mlx5/mlx5_rxtx_vec.h +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data *rxq, uint16_t n) return; } for (i = 0; i < n; ++i) { - wq[i].addr = rte_cpu_to_be_64((uintptr_t)elts[i]->buf_addr + - RTE_PKTMBUF_HEADROOM); + uintptr_t buf_addr = + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + + rte_pktmbuf_priv_size(rxq->mp) + RTE_PKTMBUF_HEADROOM; + + assert(buf_addr == (uintptr_t)elts[i]->buf_addr); + wq[i].addr = rte_cpu_to_be_64(buf_addr); /* If there's only one MR, no need to replace LKey in WQE. */ if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > 1)) wq[i].lkey = mlx5_rx_mb2mr(rxq, elts[i]); -- 2.11.0