From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <aburakov@ecsmtp.ir.intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 1B6121B879
 for <dev@dpdk.org>; Mon,  9 Apr 2018 20:01:36 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga007.fm.intel.com ([10.253.24.52])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 09 Apr 2018 11:01:36 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.48,427,1517904000"; d="scan'208";a="30772076"
Received: from irvmail001.ir.intel.com ([163.33.26.43])
 by fmsmga007.fm.intel.com with ESMTP; 09 Apr 2018 11:01:32 -0700
Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com
 [10.237.217.45])
 by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id
 w39I1Vkj031136; Mon, 9 Apr 2018 19:01:31 +0100
Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1])
 by sivswdev01.ir.intel.com with ESMTP id w39I1V88027933;
 Mon, 9 Apr 2018 19:01:31 +0100
Received: (from aburakov@localhost)
 by sivswdev01.ir.intel.com with LOCAL id w39I1VKJ027929;
 Mon, 9 Apr 2018 19:01:31 +0100
From: Anatoly Burakov <anatoly.burakov@intel.com>
To: dev@dpdk.org
Cc: keith.wiles@intel.com, jianfeng.tan@intel.com, andras.kovacs@ericsson.com, 
 laszlo.vadkeri@ericsson.com, benjamin.walker@intel.com,
 bruce.richardson@intel.com, thomas@monjalon.net,
 konstantin.ananyev@intel.com, kuralamudhan.ramakrishnan@intel.com,
 louise.m.daly@intel.com, nelio.laranjeiro@6wind.com,
 yskoh@mellanox.com, pepperjo@japf.ch, jerin.jacob@caviumnetworks.com,
 hemant.agrawal@nxp.com, olivier.matz@6wind.com, shreyansh.jain@nxp.com,
 gowrishankar.m@linux.vnet.ibm.com
Date: Mon,  9 Apr 2018 19:00:36 +0100
Message-Id: <b0f2c9478814fd551b584cc19b88e61587729913.1523296700.git.anatoly.burakov@intel.com>
X-Mailer: git-send-email 1.7.0.7
In-Reply-To: <cover.1523296700.git.anatoly.burakov@intel.com>
References: <cover.1523218215.git.anatoly.burakov@intel.com>
 <cover.1523296700.git.anatoly.burakov@intel.com>
Subject: [dpdk-dev] [PATCH v5 33/70] vfio/spapr: use memseg walk instead of
	iteration
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Apr 2018 18:01:38 -0000

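Use rte_memseg_walk() callbacks instead of iterating over the memseg
array directly, both when computing the sPAPR DMA window size and when
mapping segments for DMA. This reduces dependency on internal details
of the EAL memory subsystem and simplifies the code.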

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Add missing window creation
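
    For reviewers without a DPDK tree at hand, the window-sizing step can
    be sketched in isolation as below. This is an illustrative sketch only,
    not the patch itself: struct mock_memseg, mock_memseg_walk(),
    mock_align64pow2() and mock_compute_window() are hypothetical stand-ins
    for struct rte_memseg, rte_memseg_walk(), rte_align64pow2() and the
    relevant part of vfio_spapr_dma_map(). The mock walk simply propagates
    any nonzero callback return, which is an assumption and may differ from
    the real return convention. __builtin_ctzll() is a GCC/Clang builtin,
    as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_memseg; only the fields used here. */
struct mock_memseg {
	uint64_t iova;
	uint64_t len;
	uint64_t hugepage_sz;
};

/* Same shape as the walk parameter introduced by the patch. */
struct spapr_walk_param {
	uint64_t window_size;
	uint64_t hugepage_sz;
};

/* Hypothetical result holder for the two values fed to the create ioctl. */
struct mock_window {
	uint64_t window_size;
	uint32_t page_shift;
};

typedef int (*mock_walk_t)(const struct mock_memseg *ms, void *arg);

/* Hypothetical stand-in for rte_memseg_walk(): stop on nonzero return. */
static int
mock_memseg_walk(const struct mock_memseg *segs, size_t n,
		mock_walk_t fn, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = fn(&segs[i], arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Callback: track the highest IOVA end address and that segment's page size. */
static int
spapr_window_size(const struct mock_memseg *ms, void *arg)
{
	struct spapr_walk_param *param = arg;
	uint64_t max = ms->iova + ms->len;

	if (max > param->window_size) {
		param->hugepage_sz = ms->hugepage_sz;
		param->window_size = max;
	}
	return 0;
}

/* Hypothetical stand-in for rte_align64pow2(): round up to a power of 2. */
static uint64_t
mock_align64pow2(uint64_t v)
{
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	v |= v >> 32;
	return v + 1;
}

/* Walk all segments, then derive the window size and page shift. */
static struct mock_window
mock_compute_window(const struct mock_memseg *segs, size_t n)
{
	struct spapr_walk_param param = { 0, 0 };
	struct mock_window w;

	mock_memseg_walk(segs, n, spapr_window_size, &param);

	/* __builtin_ctzll(0) is undefined, so at least one segment is needed */
	assert(param.hugepage_sz != 0);

	w.window_size = mock_align64pow2(param.window_size);
	w.page_shift = (uint32_t)__builtin_ctzll(param.hugepage_sz);
	return w;
}
```

    With two 1 GiB segments backed by 16 MiB pages and ending at IOVA
    0x180000000, the sketch rounds the window up to 0x200000000 and yields
    a page shift of 24.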

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 113 ++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 45 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2a34ae9..e18e413 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -694,16 +694,69 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+struct spapr_walk_param {
+	uint64_t window_size;
+	uint64_t hugepage_sz;
+};
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+spapr_window_size(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	struct spapr_walk_param *param = arg;
+	uint64_t max = ms->iova + ms->len;
+
+	if (max > param->window_size) {
+		param->hugepage_sz = ms->hugepage_sz;
+		param->window_size = max;
+	}
 
+	return 0;
+}
+
+static int
+spapr_map(const struct rte_memseg *ms, void *arg)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
 	};
+	int *vfio_container_fd = arg;
+	int ret;
+
+	reg.vaddr = (uintptr_t) ms->addr;
+	reg.size = ms->len;
+	ret = ioctl(*vfio_container_fd,
+		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+			 VFIO_DMA_MAP_FLAG_WRITE;
+
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct spapr_walk_param param;
+	int ret;
 	struct vfio_iommu_spapr_tce_info info = {
 		.argsz = sizeof(info),
 	};
@@ -714,6 +767,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		.argsz = sizeof(remove),
 	};
 
+	memset(&param, 0, sizeof(param));
+
 	/* query spapr iommu info */
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
@@ -732,17 +787,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
-	}
+	rte_memseg_walk(spapr_window_size, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
-	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -757,41 +806,15 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		return -1;
 	}
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
-
-		if (ms[i].addr == NULL)
-			break;
-
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
-
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
+	if (vfio_spapr_create_new_dma_window(vfio_container_fd, &create) < 0) {
+		RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+		return -1;
 	}
 
+	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
+	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+		return -1;
+
 	return 0;
 }
 
-- 
2.7.4