From: Nipun Gupta <nipun.gupta@amd.com>
To: <dev@dpdk.org>, <thomas@monjalon.net>,
	<anatoly.burakov@intel.com>, <ferruh.yigit@amd.com>
Cc: <nikhil.agarwal@amd.com>, Nipun Gupta <nipun.gupta@amd.com>
Subject: [PATCH] vfio: do not coalesce DMA mappings
Date: Fri, 30 Dec 2022 15:28:53 +0530
Message-ID: <20221230095853.1323616-1-nipun.gupta@amd.com>

At cleanup time, when the DMA unmap is done, the Linux kernel does
not allow unmapping of individual segments which were coalesced
together while creating the DMA map for type1 IOMMU mappings. So,
this change maps the memory segments (hugepages) on a per-page
basis.

Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
---

When hotplug of devices is used, multiple pages get coalesced and a
single mapping is created for these pages (using the APIs
rte_memseg_contig_walk() and type1_map_contig()). At cleanup time,
when the memory is released, VFIO does not clean up that memory, and
the following error is observed in EAL for 2MB hugepages:
EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152
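
For illustration, here is a minimal sketch (hypothetical helper name
and parameters, error handling omitted) of what the coalesced path
effectively does with the VFIO type1 ioctls, and why the later
per-page cleanup cannot succeed:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static void
map_coalesced_then_unmap_one_page(int container_fd,
		uint64_t vaddr, uint64_t iova)
{
	/* one VFIO_IOMMU_MAP_DMA covering two adjacent 2MB hugepages */
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = vaddr,
		.iova = iova,
		.size = 2 * 0x200000,	/* 4MB, two pages coalesced */
	};
	ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);

	/* cleanup later tries to unmap only the first 2MB page; type1
	 * will not split the 4MB mapping, so the request fails or
	 * reports 0 bytes cleared, which EAL logs as the error above.
	 */
	struct vfio_iommu_type1_dma_unmap unmap = {
		.argsz = sizeof(unmap),
		.iova = iova,
		.size = 0x200000,
	};
	ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}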

This is because VFIO does not clear the DMA mapping (refer to
vfio_dma_do_unmap() -
https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1330),
where it looks up the mapping for the IOVA to free and rejects any
unmap range that does not line up exactly with an existing mapping:
https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1418.
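
The boundary rule enforced there can be modelled roughly as follows
(a simplified sketch, not the actual kernel code; the struct and
function names are made up for illustration):

#include <stdbool.h>
#include <stdint.h>

struct dma_mapping {
	uint64_t iova;	/* start of an existing VFIO mapping */
	uint64_t size;	/* its length, e.g. 4MB for two coalesced pages */
};

/* true if unmapping [iova, iova + size) would cut the mapping in
 * two; the type1 driver rejects such a request instead of splitting.
 */
static bool
unmap_splits_mapping(const struct dma_mapping *dma,
		uint64_t iova, uint64_t size)
{
	/* range starts inside the mapping, past its first byte */
	if (iova > dma->iova && iova < dma->iova + dma->size)
		return true;
	/* range ends inside the mapping, before its last byte */
	if (iova + size > dma->iova && iova + size < dma->iova + dma->size)
		return true;
	return false;
}

With a single coalesced 4MB mapping, unmapping either 2MB half
splits it and is refused; with two separate 2MB mappings, each half
can be removed cleanly.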

Thus this change creates the mappings individually instead of
coalescing them, so that each page can later be unmapped on its own.
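
For two adjacent 2MB hugepages the mappings therefore change from:

before: [ 2MB + 2MB ] -> one 4MB mapping, no per-page unmap possible
after:  [ 2MB ][ 2MB ] -> two 2MB mappings, each removable on its own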

 lib/eal/linux/eal_vfio.c | 29 -----------------------------
 1 file changed, 29 deletions(-)

diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
index 549b86ae1d..56edccb0db 100644
--- a/lib/eal/linux/eal_vfio.c
+++ b/lib/eal/linux/eal_vfio.c
@@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base,
 	return 1;
 }
 
-static int
-type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
-		size_t len, void *arg)
-{
-	int *vfio_container_fd = arg;
-
-	if (msl->external)
-		return 0;
-
-	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
-			len, 1);
-}
-
 static int
 type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 		void *arg)
@@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	if (ms->iova == RTE_BAD_IOVA)
 		return 0;
 
-	/* if IOVA mode is VA, we've already mapped the internal segments */
-	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
-		return 0;
-
 	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
 			ms->len, 1);
 }
@@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
-		/* with IOVA as VA mode, we can get away with mapping contiguous
-		 * chunks rather than going page-by-page.
-		 */
-		int ret = rte_memseg_contig_walk(type1_map_contig,
-				&vfio_container_fd);
-		if (ret)
-			return ret;
-		/* we have to continue the walk because we've skipped the
-		 * external segments during the config walk.
-		 */
-	}
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
-- 
2.25.1

