DPDK patches and discussions
From: "Ding, Xuan" <xuan.ding@intel.com>
To: Nipun Gupta <nipun.gupta@amd.com>, "dev@dpdk.org" <dev@dpdk.org>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>
Cc: "nikhil.agarwal@amd.com" <nikhil.agarwal@amd.com>,
	"He, Xingguang" <xingguang.he@intel.com>,
	"Ling, WeiX" <weix.ling@intel.com>
Subject: RE: [PATCH] vfio: do not coalesce DMA mappings
Date: Thu, 29 Jun 2023 08:21:14 +0000	[thread overview]
Message-ID: <BN9PR11MB55137CBE3281E8EAC1CB8BB5E725A@BN9PR11MB5513.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20221230095853.1323616-1-nipun.gupta@amd.com>

Hi Nipun,

Thank you for taking the time to read this email.

Our QA team found that since commit a399d7b5a994 ("vfio: do not coalesce DMA mappings") was introduced,
starting dpdk-testpmd with the "--no-huge" parameter fails with "EAL: Cannot set up DMA remapping, error 28 (No space left on device)".
They have reported it on the DPDK Bugzilla: https://bugs.dpdk.org/show_bug.cgi?id=1235.

I understand this change keeps the mappings consistent with the kernel and no longer allows memory segments to be merged.
The side effect is that testpmd with the "--no-huge" parameter can no longer start, because the large number of 4K pages exceeds the number of DMA mappings the IOMMU container can hold.
Is this expected? Should we remove "--no-huge" from our test case?

Regards,
Xuan

> -----Original Message-----
> From: Nipun Gupta <nipun.gupta@amd.com>
> Sent: Friday, December 30, 2022 5:59 PM
> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly
> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com
> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com>
> Subject: [PATCH] vfio: do not coalesce DMA mappings
> 
> At cleanup time, when the DMA unmap is done, the Linux kernel does not
> allow unmapping of individual segments that were coalesced together
> while creating the DMA map for type1 IOMMU mappings. So, this change
> updates the mapping of the memory segments (hugepages) to a per-page
> basis.
> 
> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
> ---
> 
> When hotplug of devices is used, multiple pages get coalesced and a
> single mapping is created for these pages (using the APIs
> rte_memseg_contig_walk() and type1_map_contig()). At cleanup time,
> when the memory is released, VFIO does not clean up that memory, and
> the following error is observed in the EAL for 2MB hugepages:
> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152
> 
> This is because VFIO does not clear the DMA mapping (refer to the API
> vfio_dma_do_unmap() -
> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1330),
> where it checks for the IOVA to free:
> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1418.
> 
> Thus, this change updates the mappings to be created individually
> instead of coalescing them.
> 
>  lib/eal/linux/eal_vfio.c | 29 -----------------------------
>  1 file changed, 29 deletions(-)
> 
> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
> index 549b86ae1d..56edccb0db 100644
> --- a/lib/eal/linux/eal_vfio.c
> +++ b/lib/eal/linux/eal_vfio.c
> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base,
>  	return 1;
>  }
> 
> -static int
> -type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
> -		size_t len, void *arg)
> -{
> -	int *vfio_container_fd = arg;
> -
> -	if (msl->external)
> -		return 0;
> -
> -	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
> -			len, 1);
> -}
> -
>  static int
>  type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
>  		void *arg)
> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
>  	if (ms->iova == RTE_BAD_IOVA)
>  		return 0;
> 
> -	/* if IOVA mode is VA, we've already mapped the internal segments */
> -	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
> -		return 0;
> -
>  	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
>  			ms->len, 1);
>  }
> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
>  static int
>  vfio_type1_dma_map(int vfio_container_fd)
>  {
> -	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> -		/* with IOVA as VA mode, we can get away with mapping contiguous
> -		 * chunks rather than going page-by-page.
> -		 */
> -		int ret = rte_memseg_contig_walk(type1_map_contig,
> -				&vfio_container_fd);
> -		if (ret)
> -			return ret;
> -		/* we have to continue the walk because we've skipped the
> -		 * external segments during the config walk.
> -		 */
> -	}
>  	return rte_memseg_walk(type1_map, &vfio_container_fd);
>  }
> 
> --
> 2.25.1


Thread overview: 30+ messages
2022-12-30  9:58 Nipun Gupta
2023-01-04  5:19 ` [PATCH v2] " Nipun Gupta
2023-02-02 10:48   ` David Marchand
2023-02-07  8:56     ` Gupta, Nipun
2023-04-04 14:57       ` Gupta, Nipun
2023-04-04 15:13       ` Burakov, Anatoly
2023-04-04 16:32         ` Nipun Gupta
2023-04-05 14:16           ` Burakov, Anatoly
2023-04-05 15:06             ` Gupta, Nipun
2023-04-24 15:22             ` David Marchand
2023-04-24 16:10               ` Stephen Hemminger
2023-04-24 16:16                 ` Gupta, Nipun
2023-05-10 12:58               ` Nipun Gupta
2023-05-11 14:08                 ` Burakov, Anatoly
2023-04-11 15:13           ` Thomas Monjalon
2023-04-11 16:51             ` Gupta, Nipun
2023-04-07  6:13         ` Nipun Gupta
2023-04-07  7:26           ` David Marchand
2023-05-15 11:16   ` David Marchand
2023-06-29  8:21 ` Ding, Xuan [this message]
2023-06-30  1:45   ` [PATCH] " Nipun Gupta
2023-06-30  5:58     ` Ding, Xuan
2023-07-04  5:13       ` Ding, Xuan
2023-07-04  6:53         ` Gupta, Nipun
2023-07-04  8:06           ` Ding, Xuan
2023-07-04  9:23             ` Gupta, Nipun
2023-07-04 14:09               ` Thomas Monjalon
2023-07-04 16:39                 ` Gupta, Nipun
2023-07-04 15:11               ` Ding, Xuan
2023-07-04 16:42                 ` Gupta, Nipun
