From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 81987A0524;
	Fri, 31 Jan 2020 13:51:07 +0100 (CET)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id A067F1C0C3;
	Fri, 31 Jan 2020 13:51:06 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com
 [67.231.156.173]) by dpdk.org (Postfix) with ESMTP id E72D11C0C2
 for <dev@dpdk.org>; Fri, 31 Jan 2020 13:51:04 +0100 (CET)
Received: from pps.filterd (m0045851.ppops.net [127.0.0.1])
 by mx0b-0016f401.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id
 00VClkTB027044 for <dev@dpdk.org>; Fri, 31 Jan 2020 04:51:04 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com;
 h=from : to : cc :
 subject : date : message-id : mime-version : content-transfer-encoding :
 content-type; s=pfpt0818; bh=BIV8CcAy98XaXDbrW4gy0Crk/i5fevFclxJUKH0MyyM=;
 b=RxRBsI8NzbEXiYhszuTQOlcCkNfwEfO/WoGMUot0Ezpz5ZiJZDmJ2xh1YhVmlV79Yf4Z
 3F3RE0W+jvuIBovkDvrhOLNfr5o+kFJk5/a0PRrX+VoPx3pt5do55vVyAUTWzIPu795h
 cI+4ekm3PKYtbo+n1XbEsG6GNZT6XQA9saVP6aYpCtD8TXi+We3L6DECHdk+9sEDuiV1
 4PSuD+45IynghCHP9skqDmz8DU4SakSI+U1PuPAYuAlyXdYPerrhKTDe757A+N9KfJXh
 6JEcR8MvNU2VkR48SOjumwfLDRJ+ZhlCYhdGxPwSnMGh+CDf973IA5jEwjMthLDFowTm rg== 
Received: from sc-exch04.marvell.com ([199.233.58.184])
 by mx0b-0016f401.pphosted.com with ESMTP id 2xrp2tkhuk-2
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT)
 for <dev@dpdk.org>; Fri, 31 Jan 2020 04:51:04 -0800
Received: from SC-EXCH03.marvell.com (10.93.176.83) by SC-EXCH04.marvell.com
 (10.93.176.84) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 31 Jan
 2020 04:51:02 -0800
Received: from maili.marvell.com (10.93.176.43) by SC-EXCH03.marvell.com
 (10.93.176.83) with Microsoft SMTP Server id 15.0.1497.2 via Frontend
 Transport; Fri, 31 Jan 2020 04:51:02 -0800
Received: from BG-LT7430.marvell.com (unknown [10.28.17.67])
 by maili.marvell.com (Postfix) with ESMTP id 22CF73F703F;
 Fri, 31 Jan 2020 04:51:00 -0800 (PST)
From: <pbhagavatula@marvell.com>
To: Jerin Jacob <jerinj@marvell.com>, Nithin Dabilpuram
 <ndabilpuram@marvell.com>, Vamsi Attunuru <vattunuru@marvell.com>
CC: <dev@dpdk.org>, Pavan Nikhilesh <pbhagavatula@marvell.com>
Date: Fri, 31 Jan 2020 18:20:58 +0530
Message-ID: <20200131125059.1556-1-pbhagavatula@marvell.com>
X-Mailer: git-send-email 2.17.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.572
 definitions=2020-01-31_03:2020-01-31,
 2020-01-31 signatures=0
Subject: [dpdk-dev] [PATCH] mempool/octeontx2: optimize for L1D cache
	architecture
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

From: Pavan Nikhilesh <pbhagavatula@marvell.com>

OCTEON TX2 has 8 sets, 41 ways L1D cache, VA<9:7> bits dictate
the set selection.
Add additional padding to ensure that the element size always
occupies odd number of cachelines to ensure even distribution
of elements among L1D cache sets.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 drivers/mempool/octeontx2/otx2_mempool_ops.c | 41 ++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/drivers/mempool/octeontx2/otx2_mempool_ops.c b/drivers/mempool/octeontx2/otx2_mempool_ops.c
index ea4b1c45d..bd71babdb 100644
--- a/drivers/mempool/octeontx2/otx2_mempool_ops.c
+++ b/drivers/mempool/octeontx2/otx2_mempool_ops.c
@@ -641,6 +641,7 @@ otx2_npa_alloc(struct rte_mempool *mp)
 	struct npa_aura_s aura;
 	struct npa_pool_s pool;
 	uint64_t aura_handle;
+	size_t padding;
 	int rc;
 
 	lf = otx2_npa_lf_obj_get();
@@ -650,6 +651,18 @@ otx2_npa_alloc(struct rte_mempool *mp)
 	}
 
 	block_size = mp->elt_size + mp->header_size + mp->trailer_size;
+	/*
+	 * OCTEON TX2 has 8 sets, 41 ways L1D cache, VA<9:7> bits dictate
+	 * the set selection.
+	 * Add additional padding to ensure that the element size always
+	 * occupies odd number of cachelines to ensure even distribution
+	 * of elements among L1D cache sets.
+	 */
+	padding = ((block_size / RTE_CACHE_LINE_SIZE) % 2) ? 0 :
+				RTE_CACHE_LINE_SIZE;
+	mp->trailer_size += padding;
+	block_size += padding;
+
 	block_count = mp->size;
 
 	if (block_size % OTX2_ALIGN != 0) {
@@ -724,12 +737,22 @@ otx2_npa_calc_mem_size(const struct rte_mempool *mp, uint32_t obj_num,
 						align);
 }
 
+static uint8_t
+otx2_npa_l1d_way_set_get(uint64_t iova)
+{
+	return (iova >> rte_log2_u32(RTE_CACHE_LINE_SIZE)) & 0x7;
+}
+
 static int
 otx2_npa_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr,
 		  rte_iova_t iova, size_t len,
 		  rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg)
 {
+#define OTX2_L1D_NB_SETS	8
+	uint64_t distribution[OTX2_L1D_NB_SETS];
+	rte_iova_t start_iova;
 	size_t total_elt_sz;
+	uint8_t set;
 	size_t off;
 
 	if (iova == RTE_BAD_IOVA)
@@ -743,10 +766,28 @@ otx2_npa_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr,
 	if (len < off)
 		return -EINVAL;
 
+
 	vaddr = (char *)vaddr + off;
 	iova += off;
 	len -= off;
 
+	memset(distribution, 0, sizeof(uint64_t) * OTX2_L1D_NB_SETS);
+	start_iova = iova;
+	while (start_iova < iova + len) {
+		set = otx2_npa_l1d_way_set_get(start_iova + mp->header_size);
+		distribution[set]++;
+		start_iova += total_elt_sz;
+	}
+
+	otx2_npa_dbg("iova %lx, aligned iova %lx", iova - off, iova);
+	otx2_npa_dbg("length %ld, aligned length %ld", len + off, len);
+	otx2_npa_dbg("element size %ld", total_elt_sz);
+	otx2_npa_dbg("requested objects %d, possible objects %ld", max_objs,
+			len / total_elt_sz);
+	otx2_npa_dbg("L1D set distribution :");
+	for (int i = 0; i < OTX2_L1D_NB_SETS; i++)
+		otx2_npa_dbg("set[%d] : objects : %ld", i, distribution[i]);
+
 	npa_lf_aura_op_range_set(mp->pool_id, iova, iova + len);
 
 	if (npa_lf_aura_range_update_check(mp->pool_id) < 0)
-- 
2.17.1