* [PATCH] vfio: do not coalesce DMA mappings @ 2022-12-30 9:58 Nipun Gupta 2023-01-04 5:19 ` [PATCH v2] " Nipun Gupta 2023-06-29 8:21 ` [PATCH] " Ding, Xuan 0 siblings, 2 replies; 30+ messages in thread From: Nipun Gupta @ 2022-12-30 9:58 UTC (permalink / raw) To: dev, thomas, anatoly.burakov, ferruh.yigit; +Cc: nikhil.agarwal, Nipun Gupta At the cleanup time when dma unmap is done, linux kernel does not allow unmap of individual segments which were coalesced together while creating the DMA map for type1 IOMMU mappings. So, this change updates the mapping of the memory segments(hugepages) on a per-page basis. Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> --- When hotplug of devices is used, multiple pages gets colaeced and a single mapping gets created for these pages (using APIs rte_memseg_contig_walk() and type1_map_contig(). On the cleanup time when the memory is released, the VFIO does not cleans up that memory and following error is observed in the eal for 2MB hugepages: EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 This is because VFIO does not clear the DMA (refer API vfio_dma_do_unmap() - https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1330), where it checks the dma mapping where it checks for IOVA to free: https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1418. Thus this change updates the mapping to be created individually instead of colaecing them. lib/eal/linux/eal_vfio.c | 29 ----------------------------- 1 file changed, 29 deletions(-) diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c index 549b86ae1d..56edccb0db 100644 --- a/lib/eal/linux/eal_vfio.c +++ b/lib/eal/linux/eal_vfio.c @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base, return 1; } -static int -type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms, - size_t len, void *arg) -{ - int *vfio_container_fd = arg; - - if (msl->external) - return 0; - - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, - len, 1); -} - static int type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg) @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, if (ms->iova == RTE_BAD_IOVA) return 0; - /* if IOVA mode is VA, we've already mapped the internal segments */ - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) - return 0; - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, ms->len, 1); } @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int vfio_container_fd) { - if (rte_eal_iova_mode() == RTE_IOVA_VA) { - /* with IOVA as VA mode, we can get away with mapping contiguous - * chunks rather than going page-by-page. - */ - int ret = rte_memseg_contig_walk(type1_map_contig, - &vfio_container_fd); - if (ret) - return ret; - /* we have to continue the walk because we've skipped the - * external segments during the config walk. - */ - } return rte_memseg_walk(type1_map, &vfio_container_fd); } -- 2.25.1 ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2] vfio: do not coalesce DMA mappings 2022-12-30 9:58 [PATCH] vfio: do not coalesce DMA mappings Nipun Gupta @ 2023-01-04 5:19 ` Nipun Gupta 2023-02-02 10:48 ` David Marchand 2023-05-15 11:16 ` David Marchand 2023-06-29 8:21 ` [PATCH] " Ding, Xuan 1 sibling, 2 replies; 30+ messages in thread From: Nipun Gupta @ 2023-01-04 5:19 UTC (permalink / raw) To: dev, thomas, anatoly.burakov, ferruh.yigit; +Cc: nikhil.agarwal, Nipun Gupta At the cleanup time when dma unmap is done, linux kernel does not allow unmap of individual segments which were coalesced together while creating the DMA map for type1 IOMMU mappings. So, this change updates the mapping of the memory segments(hugepages) on a per-page basis. Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com> --- Changes in v2: - Fix checkpatch errors by updated mailmap .mailmap | 4 ++-- lib/eal/linux/eal_vfio.c | 29 ----------------------------- 2 files changed, 2 insertions(+), 31 deletions(-) diff --git a/.mailmap b/.mailmap index 75884b6fe2..a234c9b3de 100644 --- a/.mailmap +++ b/.mailmap @@ -954,7 +954,7 @@ Nicolas Chautru <nicolas.chautru@intel.com> Nicolas Dichtel <nicolas.dichtel@6wind.com> Nicolas Harnois <nicolas.harnois@6wind.com> Nicolás Pernas Maradei <nico@emutex.com> <nicolas.pernas.maradei@emutex.com> -Nikhil Agarwal <nikhil.agarwal@linaro.org> +Nikhil Agarwal <nikhil.agarwal@amd.com> <nikhil.agarwal@xilinx.com> <nikhil.agarwal@linaro.org> Nikhil Jagtap <nikhil.jagtap@gmail.com> Nikhil Rao <nikhil.rao@intel.com> Nikhil Vasoya <nikhil.vasoya@chelsio.com> @@ -962,7 +962,7 @@ Nikita Kozlov <nikita@elyzion.net> Niklas Söderlund <niklas.soderlund@corigine.com> Nikolay Nikolaev <nicknickolaev@gmail.com> Ning Li <muziding001@163.com> <lining18@jd.com> -Nipun Gupta <nipun.gupta@nxp.com> +Nipun Gupta <nipun.gupta@amd.com> <nipun.gupta@xilinx.com> <nipun.gupta@nxp.com> Nir Efrati <nir.efrati@intel.com> Nirmoy Das <ndas@suse.de> Nithin Dabilpuram <ndabilpuram@marvell.com> <nithin.dabilpuram@caviumnetworks.com> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c index 549b86ae1d..56edccb0db 100644 --- a/lib/eal/linux/eal_vfio.c +++ b/lib/eal/linux/eal_vfio.c @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base, return 1; } -static int -type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms, - size_t len, void *arg) -{ - int *vfio_container_fd = arg; - - if (msl->external) - return 0; - - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, - len, 1); -} - static int type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, void *arg) @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, if (ms->iova == RTE_BAD_IOVA) return 0; - /* if IOVA mode is VA, we've already mapped the internal segments */ - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) - return 0; - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova, ms->len, 1); } @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int vfio_container_fd) { - if (rte_eal_iova_mode() == RTE_IOVA_VA) { - /* with IOVA as VA mode, we can get away with mapping contiguous - * chunks rather than going page-by-page. - */ - int ret = rte_memseg_contig_walk(type1_map_contig, - &vfio_container_fd); - if (ret) - return ret; - /* we have to continue the walk because we've skipped the - * external segments during the config walk. - */ - } return rte_memseg_walk(type1_map, &vfio_container_fd); } -- 2.25.1 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-01-04 5:19 ` [PATCH v2] " Nipun Gupta @ 2023-02-02 10:48 ` David Marchand 2023-02-07 8:56 ` Gupta, Nipun 2023-05-15 11:16 ` David Marchand 1 sibling, 1 reply; 30+ messages in thread From: David Marchand @ 2023-02-02 10:48 UTC (permalink / raw) To: anatoly.burakov, Nipun Gupta; +Cc: dev, thomas, ferruh.yigit, nikhil.agarwal Hi, On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote: > > At the cleanup time when dma unmap is done, linux kernel > does not allow unmap of individual segments which were > coalesced together while creating the DMA map for type1 IOMMU > mappings. So, this change updates the mapping of the memory > segments(hugepages) on a per-page basis. > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com> This change is scary. I won't take it without a review from the maintainer. Anatoly, can you have a look? Thanks. -- David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v2] vfio: do not coalesce DMA mappings 2023-02-02 10:48 ` David Marchand @ 2023-02-07 8:56 ` Gupta, Nipun 2023-04-04 14:57 ` Gupta, Nipun 2023-04-04 15:13 ` Burakov, Anatoly 0 siblings, 2 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-02-07 8:56 UTC (permalink / raw) To: David Marchand, anatoly.burakov Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil [AMD Official Use Only - General] Hi David, I agree that change is not straightforward to review, but it should not cause any functional issue as we are still creating all the memory mappings, but one by one for each segment. For hot plug case this causes issue as mentioned, that VFIO does not allow unmap of the individual segments in case mapping was created of a single coalesced segment. But yes, I am not sure why this code was added, which Anatoly may have more understanding on. Anatoly, Can you please provide your feedback on this change? Thanks, Nipun > -----Original Message----- > From: David Marchand <david.marchand@redhat.com> > Sent: Thursday, February 2, 2023 4:19 PM > To: anatoly.burakov@intel.com; Gupta, Nipun <Nipun.Gupta@amd.com> > Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com> > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings > > Caution: This message originated from an External Source. Use proper caution > when opening attachments, clicking links, or responding. > > > Hi, > > On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote: > > > > At the cleanup time when dma unmap is done, linux kernel > > does not allow unmap of individual segments which were > > coalesced together while creating the DMA map for type1 IOMMU > > mappings. So, this change updates the mapping of the memory > > segments(hugepages) on a per-page basis. > > > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com> > > This change is scary. > > I won't take it without a review from the maintainer. > Anatoly, can you have a look? > > > Thanks. > -- > David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v2] vfio: do not coalesce DMA mappings 2023-02-07 8:56 ` Gupta, Nipun @ 2023-04-04 14:57 ` Gupta, Nipun 2023-04-04 15:13 ` Burakov, Anatoly 1 sibling, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-04-04 14:57 UTC (permalink / raw) To: David Marchand, anatoly.burakov Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil, Gupta, Nipun Hi David, As the DPDK 23.03 release is out can we have a relook at this change? Thanks, Nipun > -----Original Message----- > From: Gupta, Nipun > Sent: Tuesday, February 7, 2023 2:27 PM > To: David Marchand <david.marchand@redhat.com>; > anatoly.burakov@intel.com > Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com> > Subject: RE: [PATCH v2] vfio: do not coalesce DMA mappings > > [AMD Official Use Only - General] > > Hi David, > > I agree that change is not straightforward to review, but it should not cause any > functional issue as we are still creating all the memory mappings, but one by one > for each segment. > For hot plug case this causes issue as mentioned, that VFIO does not allow > unmap of the individual segments in case mapping was created of a single > coalesced segment. > > But yes, I am not sure why this code was added, which Anatoly may have more > understanding on. > > Anatoly, > > Can you please provide your feedback on this change? > > Thanks, > Nipun > > > -----Original Message----- > > From: David Marchand <david.marchand@redhat.com> > > Sent: Thursday, February 2, 2023 4:19 PM > > To: anatoly.burakov@intel.com; Gupta, Nipun <Nipun.Gupta@amd.com> > > Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh > > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com> > > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings > > > > Caution: This message originated from an External Source. Use proper caution > > when opening attachments, clicking links, or responding. > > > > > > Hi, > > > > On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> > wrote: > > > > > > At the cleanup time when dma unmap is done, linux kernel > > > does not allow unmap of individual segments which were > > > coalesced together while creating the DMA map for type1 IOMMU > > > mappings. So, this change updates the mapping of the memory > > > segments(hugepages) on a per-page basis. > > > > > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > > > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com> > > > > This change is scary. > > > > I won't take it without a review from the maintainer. > > Anatoly, can you have a look? > > > > > > Thanks. > > -- > > David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-02-07 8:56 ` Gupta, Nipun 2023-04-04 14:57 ` Gupta, Nipun @ 2023-04-04 15:13 ` Burakov, Anatoly 2023-04-04 16:32 ` Nipun Gupta 2023-04-07 6:13 ` Nipun Gupta 1 sibling, 2 replies; 30+ messages in thread From: Burakov, Anatoly @ 2023-04-04 15:13 UTC (permalink / raw) To: Gupta, Nipun, David Marchand; +Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 2/7/2023 8:56 AM, Gupta, Nipun wrote: > [AMD Official Use Only - General] > > Hi David, > > I agree that change is not straightforward to review, but it should not cause any functional issue as we are still creating all the memory mappings, but one by one for each segment. > For hot plug case this causes issue as mentioned, that VFIO does not allow unmap of the individual segments in case mapping was created of a single coalesced segment. > > But yes, I am not sure why this code was added, which Anatoly may have more understanding on. The motivation behind this code was that Linux allows limited amount of page mappings, so we were trying to save on those. However, since then there have been a few changes related to partial unmaps that may make it so that this code is not only no longer necessary, but is in fact actively harmful. I agree that this at least warrants a second look. > > Anatoly, > > Can you please provide your feedback on this change? The patch probably shouldn't include the mailmap changes :) Could you please provide some steps to reproduce the hotplug issue you're having? It would be great to have a test case for this patchset to put it in context. -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-04 15:13 ` Burakov, Anatoly @ 2023-04-04 16:32 ` Nipun Gupta 2023-04-05 14:16 ` Burakov, Anatoly 2023-04-11 15:13 ` Thomas Monjalon 2023-04-07 6:13 ` Nipun Gupta 1 sibling, 2 replies; 30+ messages in thread From: Nipun Gupta @ 2023-04-04 16:32 UTC (permalink / raw) To: Burakov, Anatoly, David Marchand Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: > Caution: This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > > On 2/7/2023 8:56 AM, Gupta, Nipun wrote: >> [AMD Official Use Only - General] >> >> Hi David, >> >> I agree that change is not straightforward to review, but it should >> not cause any functional issue as we are still creating all the memory >> mappings, but one by one for each segment. >> For hot plug case this causes issue as mentioned, that VFIO does not >> allow unmap of the individual segments in case mapping was created of >> a single coalesced segment. >> >> But yes, I am not sure why this code was added, which Anatoly may have >> more understanding on. > > The motivation behind this code was that Linux allows limited amount of > page mappings, so we were trying to save on those. However, since then > there have been a few changes related to partial unmaps that may make it > so that this code is not only no longer necessary, but is in fact > actively harmful. I agree that this at least warrants a second look. > >> >> Anatoly, >> >> Can you please provide your feedback on this change? > > The patch probably shouldn't include the mailmap changes :) Sure, will send a separate patch for it. > > Could you please provide some steps to reproduce the hotplug issue > you're having? It would be great to have a test case for this patchset > to put it in context. I am working on CDX bus (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug. The test is as follows: # Run testpmd application ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 # Bind to VFIO echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override echo "cdx-00:00" > /sys/bus/cdx/drivers_probe # Plug a device testpmd> port attach cdx:cdx-00:00 #quit testpmd testpmd> quit This gave error at testpmd exit that memory cannot be freed. On debugging I updated this code and seems it should be seen with any of the device. I see similar test case (without quit) mentioned https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the difference is that it is with igb_uio and issue is being observed with VFIO. Please note the device/bus mentioned in the commands is not yet upstreamed in DPDK, but patches would be sent out soon. Thanks, Nipun > > -- > Thanks, > Anatoly > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-04 16:32 ` Nipun Gupta @ 2023-04-05 14:16 ` Burakov, Anatoly 2023-04-05 15:06 ` Gupta, Nipun 2023-04-24 15:22 ` David Marchand 2023-04-11 15:13 ` Thomas Monjalon 1 sibling, 2 replies; 30+ messages in thread From: Burakov, Anatoly @ 2023-04-05 14:16 UTC (permalink / raw) To: Nipun Gupta, David Marchand; +Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 4/4/2023 5:32 PM, Nipun Gupta wrote: > > > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: >> Caution: This message originated from an External Source. Use proper >> caution when opening attachments, clicking links, or responding. >> >> >> On 2/7/2023 8:56 AM, Gupta, Nipun wrote: >>> [AMD Official Use Only - General] >>> >>> Hi David, >>> >>> I agree that change is not straightforward to review, but it should >>> not cause any functional issue as we are still creating all the >>> memory mappings, but one by one for each segment. >>> For hot plug case this causes issue as mentioned, that VFIO does not >>> allow unmap of the individual segments in case mapping was created of >>> a single coalesced segment. >>> >>> But yes, I am not sure why this code was added, which Anatoly may >>> have more understanding on. >> >> The motivation behind this code was that Linux allows limited amount of >> page mappings, so we were trying to save on those. However, since then >> there have been a few changes related to partial unmaps that may make it >> so that this code is not only no longer necessary, but is in fact >> actively harmful. I agree that this at least warrants a second look. >> >>> >>> Anatoly, >>> >>> Can you please provide your feedback on this change? >> >> The patch probably shouldn't include the mailmap changes :) > > Sure, will send a separate patch for it. > >> >> Could you please provide some steps to reproduce the hotplug issue >> you're having? It would be great to have a test case for this patchset >> to put it in context. > > I am working on CDX bus > (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug. > > The test is as follows: > # Run testpmd application > ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 > > # Bind to VFIO > echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override > echo "cdx-00:00" > /sys/bus/cdx/drivers_probe > > # Plug a device > testpmd> port attach cdx:cdx-00:00 > > #quit testpmd > testpmd> quit > > This gave error at testpmd exit that memory cannot be freed. On > debugging I updated this code and seems it should be seen with any of > the device. > > I see similar test case (without quit) mentioned > https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the > difference is that it is with igb_uio and issue is being observed with > VFIO. > > Please note the device/bus mentioned in the commands is not yet > upstreamed in DPDK, but patches would be sent out soon. > > Thanks, > Nipun > Thanks, I can reproduce this issue with regular devices too (run testpmd with no devices, bind a NIC to VFIO, attach it, then quit). You're correct in that since the initial mapping was done with mapping large contiguous zones (such as when mempools are created before attach), any subsequent freeing of memory will cause these errors to happen. I don't think this can be fixed by anything other than not doing the contiguous mapping thing, so provisionally, I think this patch should be accepted. I'll play around with it some more and get back to you :) >> >> -- >> Thanks, >> Anatoly >> -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-05 14:16 ` Burakov, Anatoly @ 2023-04-05 15:06 ` Gupta, Nipun 2023-04-24 15:22 ` David Marchand 1 sibling, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-04-05 15:06 UTC (permalink / raw) To: Burakov, Anatoly, David Marchand Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil [AMD Official Use Only - General] > -----Original Message----- > From: Burakov, Anatoly <anatoly.burakov@intel.com> > Sent: Wednesday, April 5, 2023 7:46 PM > To: Gupta, Nipun <Nipun.Gupta@amd.com>; David Marchand > <david.marchand@redhat.com> > Cc: dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com> > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings > > Caution: This message originated from an External Source. Use proper caution > when opening attachments, clicking links, or responding. > > > On 4/4/2023 5:32 PM, Nipun Gupta wrote: > > > > > > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: > >> Caution: This message originated from an External Source. Use proper > >> caution when opening attachments, clicking links, or responding. > >> > >> > >> On 2/7/2023 8:56 AM, Gupta, Nipun wrote: > >>> [AMD Official Use Only - General] > >>> > >>> Hi David, > >>> > >>> I agree that change is not straightforward to review, but it should > >>> not cause any functional issue as we are still creating all the > >>> memory mappings, but one by one for each segment. > >>> For hot plug case this causes issue as mentioned, that VFIO does not > >>> allow unmap of the individual segments in case mapping was created of > >>> a single coalesced segment. > >>> > >>> But yes, I am not sure why this code was added, which Anatoly may > >>> have more understanding on. > >> > >> The motivation behind this code was that Linux allows limited amount of > >> page mappings, so we were trying to save on those. However, since then > >> there have been a few changes related to partial unmaps that may make it > >> so that this code is not only no longer necessary, but is in fact > >> actively harmful. I agree that this at least warrants a second look. > >> > >>> > >>> Anatoly, > >>> > >>> Can you please provide your feedback on this change? > >> > >> The patch probably shouldn't include the mailmap changes :) > > > > Sure, will send a separate patch for it. > > > >> > >> Could you please provide some steps to reproduce the hotplug issue > >> you're having? It would be great to have a test case for this patchset > >> to put it in context. > > > > I am working on CDX bus > > (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2- > nipun.gupta@amd.com/) and trying out some cases for plug/unplug. > > > > The test is as follows: > > # Run testpmd application > > ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 > > > > # Bind to VFIO > > echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override > > echo "cdx-00:00" > /sys/bus/cdx/drivers_probe > > > > # Plug a device > > testpmd> port attach cdx:cdx-00:00 > > > > #quit testpmd > > testpmd> quit > > > > This gave error at testpmd exit that memory cannot be freed. On > > debugging I updated this code and seems it should be seen with any of > > the device. > > > > I see similar test case (without quit) mentioned > > https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the > > difference is that it is with igb_uio and issue is being observed with > > VFIO. > > > > Please note the device/bus mentioned in the commands is not yet > > upstreamed in DPDK, but patches would be sent out soon. > > > > Thanks, > > Nipun > > > > Thanks, I can reproduce this issue with regular devices too (run testpmd > with no devices, bind a NIC to VFIO, attach it, then quit). You're > correct in that since the initial mapping was done with mapping large > contiguous zones (such as when mempools are created before attach), any > subsequent freeing of memory will cause these errors to happen. Thanks for trying on other devices too, and good to know that this is also reproduced with them :) Regards, Nipun > > I don't think this can be fixed by anything other than not doing the > contiguous mapping thing, so provisionally, I think this patch should be > accepted. I'll play around with it some more and get back to you :) > > >> > >> -- > >> Thanks, > >> Anatoly > >> > > -- > Thanks, > Anatoly ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-05 14:16 ` Burakov, Anatoly 2023-04-05 15:06 ` Gupta, Nipun @ 2023-04-24 15:22 ` David Marchand 2023-04-24 16:10 ` Stephen Hemminger 2023-05-10 12:58 ` Nipun Gupta 1 sibling, 2 replies; 30+ messages in thread From: David Marchand @ 2023-04-24 15:22 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: Nipun Gupta, dev, thomas, Yigit, Ferruh, Agarwal, Nikhil Hello Anatoly, On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly <anatoly.burakov@intel.com> wrote: > >> Could you please provide some steps to reproduce the hotplug issue > >> you're having? It would be great to have a test case for this patchset > >> to put it in context. > > > > I am working on CDX bus > > (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug. > > > > The test is as follows: > > # Run testpmd application > > ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 > > > > # Bind to VFIO > > echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override > > echo "cdx-00:00" > /sys/bus/cdx/drivers_probe > > > > # Plug a device > > testpmd> port attach cdx:cdx-00:00 > > > > #quit testpmd > > testpmd> quit > > > > This gave error at testpmd exit that memory cannot be freed. On > > debugging I updated this code and seems it should be seen with any of > > the device. > > > > I see similar test case (without quit) mentioned > > https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the > > difference is that it is with igb_uio and issue is being observed with > > VFIO. > > > > Please note the device/bus mentioned in the commands is not yet > > upstreamed in DPDK, but patches would be sent out soon. > > > > Thanks, > > Nipun > > > > Thanks, I can reproduce this issue with regular devices too (run testpmd > with no devices, bind a NIC to VFIO, attach it, then quit). You're > correct in that since the initial mapping was done with mapping large > contiguous zones (such as when mempools are created before attach), any > subsequent freeing of memory will cause these errors to happen. > > I don't think this can be fixed by anything other than not doing the > contiguous mapping thing, so provisionally, I think this patch should be > accepted. I'll play around with it some more and get back to you :) Can we conclude on this topic? It is best we merge this kind of change the sooner possible for a release. Thanks. -- David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-24 15:22 ` David Marchand @ 2023-04-24 16:10 ` Stephen Hemminger 2023-04-24 16:16 ` Gupta, Nipun 2023-05-10 12:58 ` Nipun Gupta 1 sibling, 1 reply; 30+ messages in thread From: Stephen Hemminger @ 2023-04-24 16:10 UTC (permalink / raw) To: David Marchand Cc: Burakov, Anatoly, Nipun Gupta, dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On Mon, 24 Apr 2023 17:22:46 +0200 David Marchand <david.marchand@redhat.com> wrote: > > > > > > > Thanks, I can reproduce this issue with regular devices too (run testpmd > > with no devices, bind a NIC to VFIO, attach it, then quit). You're > > correct in that since the initial mapping was done with mapping large > > contiguous zones (such as when mempools are created before attach), any > > subsequent freeing of memory will cause these errors to happen. > > > > I don't think this can be fixed by anything other than not doing the > > contiguous mapping thing, so provisionally, I think this patch should be > > accepted. I'll play around with it some more and get back to you :) > > Can we conclude on this topic? > It is best we merge this kind of change the sooner possible for a release. > > Thanks. Shouldn't the coalesced mappings be able to have correct datastructure (accounting) so that on shutdown the unmap's are done for the right size? ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-24 16:10 ` Stephen Hemminger @ 2023-04-24 16:16 ` Gupta, Nipun 0 siblings, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-04-24 16:16 UTC (permalink / raw) To: Stephen Hemminger, David Marchand Cc: Burakov, Anatoly, dev, thomas, Yigit, Ferruh, Agarwal, Nikhil > -----Original Message----- > From: Stephen Hemminger <stephen@networkplumber.org> > Sent: Monday, April 24, 2023 9:41 PM > To: David Marchand <david.marchand@redhat.com> > Cc: Burakov, Anatoly <anatoly.burakov@intel.com>; Gupta, Nipun > <Nipun.Gupta@amd.com>; dev@dpdk.org; thomas@monjalon.net; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; Agarwal, Nikhil <nikhil.agarwal@amd.com> > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings > > > On Mon, 24 Apr 2023 17:22:46 +0200 > David Marchand <david.marchand@redhat.com> wrote: > > > > > > > > > > > Thanks, I can reproduce this issue with regular devices too (run testpmd > > > with no devices, bind a NIC to VFIO, attach it, then quit). You're > > > correct in that since the initial mapping was done with mapping large > > > contiguous zones (such as when mempools are created before attach), any > > > subsequent freeing of memory will cause these errors to happen. > > > > > > I don't think this can be fixed by anything other than not doing the > > > contiguous mapping thing, so provisionally, I think this patch should be > > > accepted. I'll play around with it some more and get back to you :) > > > > Can we conclude on this topic? > > It is best we merge this kind of change the sooner possible for a release. > > > > Thanks. > > Shouldn't the coalesced mappings be able to have correct datastructure > (accounting) so that on shutdown the unmap's are done for the right size? This issue occurs only on the hotplug case. Other devices which are not hot plugged and are existing from the start of the application need to have individual (non- coalesced) mappings. So individual (non-coalesced) mappings are definitely required. IMO we should not maintain separate mapping for each hot-plugged device as it would be unrequired overhead. Regards, Nipun ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-24 15:22 ` David Marchand 2023-04-24 16:10 ` Stephen Hemminger @ 2023-05-10 12:58 ` Nipun Gupta 2023-05-11 14:08 ` Burakov, Anatoly 1 sibling, 1 reply; 30+ messages in thread From: Nipun Gupta @ 2023-05-10 12:58 UTC (permalink / raw) To: David Marchand, Burakov, Anatoly Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 4/24/2023 8:52 PM, David Marchand wrote: > > Hello Anatoly, > > On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly > <anatoly.burakov@intel.com> wrote: >>>> Could you please provide some steps to reproduce the hotplug issue >>>> you're having? It would be great to have a test case for this patchset >>>> to put it in context. >>> >>> I am working on CDX bus >>> (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug. >>> >>> The test is as follows: >>> # Run testpmd application >>> ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 >>> >>> # Bind to VFIO >>> echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override >>> echo "cdx-00:00" > /sys/bus/cdx/drivers_probe >>> >>> # Plug a device >>> testpmd> port attach cdx:cdx-00:00 >>> >>> #quit testpmd >>> testpmd> quit >>> >>> This gave error at testpmd exit that memory cannot be freed. On >>> debugging I updated this code and seems it should be seen with any of >>> the device. >>> >>> I see similar test case (without quit) mentioned >>> https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the >>> difference is that it is with igb_uio and issue is being observed with >>> VFIO. >>> >>> Please note the device/bus mentioned in the commands is not yet >>> upstreamed in DPDK, but patches would be sent out soon. >>> >>> Thanks, >>> Nipun >>> >> >> Thanks, I can reproduce this issue with regular devices too (run testpmd >> with no devices, bind a NIC to VFIO, attach it, then quit). You're >> correct in that since the initial mapping was done with mapping large >> contiguous zones (such as when mempools are created before attach), any >> subsequent freeing of memory will cause these errors to happen. >> >> I don't think this can be fixed by anything other than not doing the >> contiguous mapping thing, so provisionally, I think this patch should be >> accepted. I'll play around with it some more and get back to you :) > > Can we conclude on this topic? > It is best we merge this kind of change the sooner possible for a release. Hi Anatoly, Can you kindly update on this? Thanks, Nipun > > Thanks. > > > -- > David Marchand > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-05-10 12:58 ` Nipun Gupta @ 2023-05-11 14:08 ` Burakov, Anatoly 0 siblings, 0 replies; 30+ messages in thread From: Burakov, Anatoly @ 2023-05-11 14:08 UTC (permalink / raw) To: Nipun Gupta, David Marchand; +Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 5/10/2023 1:58 PM, Nipun Gupta wrote: > > > On 4/24/2023 8:52 PM, David Marchand wrote: >> >> Hello Anatoly, >> >> On Wed, Apr 5, 2023 at 4:17 PM Burakov, Anatoly >> <anatoly.burakov@intel.com> wrote: >>>>> Could you please provide some steps to reproduce the hotplug issue >>>>> you're having? It would be great to have a test case for this patchset >>>>> to put it in context. >>>> >>>> I am working on CDX bus >>>> (http://patchwork.dpdk.org/project/dpdk/patch/20230124140746.594066-2-nipun.gupta@amd.com/) and trying out some cases for plug/unplug. >>>> >>>> The test is as follows: >>>> # Run testpmd application >>>> ./dpdk-testpmd -c 0x3 -- -i --nb-cores=1 >>>> >>>> # Bind to VFIO >>>> echo "vfio-cdx" > /sys/bus/cdx/devices/cdx-00\:00/driver_override >>>> echo "cdx-00:00" > /sys/bus/cdx/drivers_probe >>>> >>>> # Plug a device >>>> testpmd> port attach cdx:cdx-00:00 >>>> >>>> #quit testpmd >>>> testpmd> quit >>>> >>>> This gave error at testpmd exit that memory cannot be freed. On >>>> debugging I updated this code and seems it should be seen with any of >>>> the device. >>>> >>>> I see similar test case (without quit) mentioned >>>> https://doc.dpdk.org/dts/test_plans/hotplug_test_plan.html, but the >>>> difference is that it is with igb_uio and issue is being observed with >>>> VFIO. >>>> >>>> Please note the device/bus mentioned in the commands is not yet >>>> upstreamed in DPDK, but patches would be sent out soon. >>>> >>>> Thanks, >>>> Nipun >>>> >>> >>> Thanks, I can reproduce this issue with regular devices too (run testpmd >>> with no devices, bind a NIC to VFIO, attach it, then quit). You're >>> correct in that since the initial mapping was done with mapping large >>> contiguous zones (such as when mempools are created before attach), any >>> subsequent freeing of memory will cause these errors to happen. >>> >>> I don't think this can be fixed by anything other than not doing the >>> contiguous mapping thing, so provisionally, I think this patch should be >>> accepted. I'll play around with it some more and get back to you :) >> >> Can we conclude on this topic? >> It is best we merge this kind of change the sooner possible for a >> release. > > Hi Anatoly, > Can you kindly update on this? > Hi all, apologies for late reply. Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-04 16:32 ` Nipun Gupta 2023-04-05 14:16 ` Burakov, Anatoly @ 2023-04-11 15:13 ` Thomas Monjalon 2023-04-11 16:51 ` Gupta, Nipun 1 sibling, 1 reply; 30+ messages in thread From: Thomas Monjalon @ 2023-04-11 15:13 UTC (permalink / raw) To: Burakov, Anatoly, David Marchand, Nipun Gupta Cc: dev, Yigit, Ferruh, Agarwal, Nikhil 04/04/2023 18:32, Nipun Gupta: > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: > > The patch probably shouldn't include the mailmap changes :) > > Sure, will send a separate patch for it. No please, we squash mailmap changes with 1st patch. ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-11 15:13 ` Thomas Monjalon @ 2023-04-11 16:51 ` Gupta, Nipun 0 siblings, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-04-11 16:51 UTC (permalink / raw) To: Thomas Monjalon, Burakov, Anatoly, David Marchand Cc: dev, Yigit, Ferruh, Agarwal, Nikhil [AMD Official Use Only - General] > -----Original Message----- > From: Thomas Monjalon <thomas@monjalon.net> > Sent: Tuesday, April 11, 2023 8:43 PM > To: Burakov, Anatoly <anatoly.burakov@intel.com>; David Marchand > <david.marchand@redhat.com>; Gupta, Nipun <Nipun.Gupta@amd.com> > Cc: dev@dpdk.org; Yigit, Ferruh <Ferruh.Yigit@amd.com>; Agarwal, Nikhil > <nikhil.agarwal@amd.com> > Subject: Re: [PATCH v2] vfio: do not coalesce DMA mappings > > Caution: This message originated from an External Source. Use proper caution > when opening attachments, clicking links, or responding. > > > 04/04/2023 18:32, Nipun Gupta: > > On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: > > > The patch probably shouldn't include the mailmap changes :) > > > > Sure, will send a separate patch for it. > > No please, we squash mailmap changes with 1st patch. Agree, I have replied same to this later in this mail thread. So, there is no change/resubmission required in this patch as of now. > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-04 15:13 ` Burakov, Anatoly 2023-04-04 16:32 ` Nipun Gupta @ 2023-04-07 6:13 ` Nipun Gupta 2023-04-07 7:26 ` David Marchand 1 sibling, 1 reply; 30+ messages in thread From: Nipun Gupta @ 2023-04-07 6:13 UTC (permalink / raw) To: Burakov, Anatoly, David Marchand Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil On 4/4/2023 8:43 PM, Burakov, Anatoly wrote: > Caution: This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > > On 2/7/2023 8:56 AM, Gupta, Nipun wrote: >> [AMD Official Use Only - General] >> >> Hi David, >> >> I agree that change is not straightforward to review, but it should >> not cause any functional issue as we are still creating all the memory >> mappings, but one by one for each segment. >> For hot plug case this causes issue as mentioned, that VFIO does not >> allow unmap of the individual segments in case mapping was created of >> a single coalesced segment. >> >> But yes, I am not sure why this code was added, which Anatoly may have >> more understanding on. > > The motivation behind this code was that Linux allows limited amount of > page mappings, so we were trying to save on those. However, since then > there have been a few changes related to partial unmaps that may make it > so that this code is not only no longer necessary, but is in fact > actively harmful. I agree that this at least warrants a second look. > >> >> Anatoly, >> >> Can you please provide your feedback on this change? > > The patch probably shouldn't include the mailmap changes :) I see in "git log" that all the mailmap changes are with the patch submitted, probably as it shows checkpatch warning, so it seems this should be fine? Thanks, Nipun > > Could you please provide some steps to reproduce the hotplug issue > you're having? It would be great to have a test case for this patchset > to put it in context. > > -- > Thanks, > Anatoly > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-04-07 6:13 ` Nipun Gupta @ 2023-04-07 7:26 ` David Marchand 0 siblings, 0 replies; 30+ messages in thread From: David Marchand @ 2023-04-07 7:26 UTC (permalink / raw) To: Nipun Gupta, Burakov, Anatoly; +Cc: dev, thomas, Yigit, Ferruh, Agarwal, Nikhil Hello Nipun, Anatoly, On Fri, Apr 7, 2023 at 8:13 AM Nipun Gupta <nipun.gupta@amd.com> wrote: > > The patch probably shouldn't include the mailmap changes :) Mailmap changes can be done by the submitter. So Nipun did nothing wrong. > > I see in "git log" that all the mailmap changes are with the patch > submitted, probably as it shows checkpatch warning, so it seems this > should be fine? If a submitter includes the mailmap change in his patches, it is fine. Otherwise, the maintainer that merges the first patch of a new contributor will fix the mailmap at the same time. -- David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] vfio: do not coalesce DMA mappings 2023-01-04 5:19 ` [PATCH v2] " Nipun Gupta 2023-02-02 10:48 ` David Marchand @ 2023-05-15 11:16 ` David Marchand 1 sibling, 0 replies; 30+ messages in thread From: David Marchand @ 2023-05-15 11:16 UTC (permalink / raw) To: Nipun Gupta; +Cc: dev, thomas, anatoly.burakov, ferruh.yigit, nikhil.agarwal On Wed, Jan 4, 2023 at 6:19 AM Nipun Gupta <nipun.gupta@amd.com> wrote: > At the cleanup time when dma unmap is done, linux kernel > does not allow unmap of individual segments which were > coalesced together while creating the DMA map for type1 IOMMU > mappings. So, this change updates the mapping of the memory > segments(hugepages) on a per-page basis. > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > Signed-off-by: Nikhil Agarwal <nikhil.agarwal@amd.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Applied, thanks. -- David Marchand ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2022-12-30 9:58 [PATCH] vfio: do not coalesce DMA mappings Nipun Gupta 2023-01-04 5:19 ` [PATCH v2] " Nipun Gupta @ 2023-06-29 8:21 ` Ding, Xuan 2023-06-30 1:45 ` Nipun Gupta 1 sibling, 1 reply; 30+ messages in thread From: Ding, Xuan @ 2023-06-29 8:21 UTC (permalink / raw) To: Nipun Gupta, dev, thomas, Burakov, Anatoly, ferruh.yigit Cc: nikhil.agarwal, He, Xingguang, Ling, WeiX Hi Nipun, I'd like to appreciate your time reading this email. Our QA team found that since this commit "a399d7b5a994: do not coalesce DMA mappings" is introduced, the dpdk testpmd start with "--no-huge" parameters will failed, and shows "EAL: Cannot set up DMA remapping, error 28 (No space left on device)". So they reported it on dpdk Bugzilla: https://bugs.dpdk.org/show_bug.cgi?id=1235. I understand this feature is to keep consistent with the kernel and not allow memory segments be merged. The side effect is the testpmd with "--no-huge" parameters will not be able to start because the too many pages will exceed the capability of IOMMU. Is it expected? Should we remove the --no-huge" in our testcase? Regards, Xuan > -----Original Message----- > From: Nipun Gupta <nipun.gupta@amd.com> > Sent: Friday, December 30, 2022 5:59 PM > To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly > <anatoly.burakov@intel.com>; ferruh.yigit@amd.com > Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> > Subject: [PATCH] vfio: do not coalesce DMA mappings > > At the cleanup time when dma unmap is done, linux kernel does not allow > unmap of individual segments which were coalesced together while creating > the DMA map for type1 IOMMU mappings. So, this change updates the > mapping of the memory > segments(hugepages) on a per-page basis. > > Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > --- > > When hotplug of devices is used, multiple pages gets colaeced and a single > mapping gets created for these pages (using APIs > rte_memseg_contig_walk() and type1_map_contig(). On the cleanup time > when the memory is released, the VFIO does not cleans up that memory and > following error is observed in the eal for 2MB > hugepages: > EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 > > This is because VFIO does not clear the DMA (refer API > vfio_dma_do_unmap() - > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1. > c#L1330), > where it checks the dma mapping where it checks for IOVA to free: > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1. > c#L1418. > > Thus this change updates the mapping to be created individually instead of > colaecing them. > > lib/eal/linux/eal_vfio.c | 29 ----------------------------- > 1 file changed, 29 deletions(-) > > diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c index > 549b86ae1d..56edccb0db 100644 > --- a/lib/eal/linux/eal_vfio.c > +++ b/lib/eal/linux/eal_vfio.c > @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base, > return 1; > } > > -static int > -type1_map_contig(const struct rte_memseg_list *msl, const struct > rte_memseg *ms, > - size_t len, void *arg) > -{ > - int *vfio_container_fd = arg; > - > - if (msl->external) > - return 0; > - > - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > ms->iova, > - len, 1); > -} > - > static int > type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, > void *arg) > @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, > const struct rte_memseg *ms, > if (ms->iova == RTE_BAD_IOVA) > return 0; > > - /* if IOVA mode is VA, we've already mapped the internal segments */ > - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) > - return 0; > - > return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > ms->iova, > ms->len, 1); > } > @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, > uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int > vfio_container_fd) { > - if (rte_eal_iova_mode() == RTE_IOVA_VA) { > - /* with IOVA as VA mode, we can get away with mapping > contiguous > - * chunks rather than going page-by-page. > - */ > - int ret = rte_memseg_contig_walk(type1_map_contig, > - &vfio_container_fd); > - if (ret) > - return ret; > - /* we have to continue the walk because we've skipped the > - * external segments during the config walk. > - */ > - } > return rte_memseg_walk(type1_map, &vfio_container_fd); } > > -- > 2.25.1 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] vfio: do not coalesce DMA mappings 2023-06-29 8:21 ` [PATCH] " Ding, Xuan @ 2023-06-30 1:45 ` Nipun Gupta 2023-06-30 5:58 ` Ding, Xuan 0 siblings, 1 reply; 30+ messages in thread From: Nipun Gupta @ 2023-06-30 1:45 UTC (permalink / raw) To: Ding, Xuan, dev, thomas, Burakov, Anatoly, ferruh.yigit Cc: nikhil.agarwal, He, Xingguang, Ling, WeiX Hi Xuan, Thanks for pointing out the issue and figuring out the patch which introduced this. If you have answers to below queries, please let me know: Is there any other test cases which tests "--no-huge" which pass? Also, if we change the "-m" option to provide lower memory, does the test pass? When you mention too many pages exceed the capability of IOMMU, you are referring to HW capability to create multiple pages? Here it seems in case of 4K page size we need 256K pages which is limiting the capacity? Regards, Nipun On 6/29/2023 1:51 PM, Ding, Xuan wrote: > Hi Nipun, > > I'd like to appreciate your time reading this email. > > Our QA team found that since this commit "a399d7b5a994: do not coalesce DMA mappings" is introduced, > the dpdk testpmd start with "--no-huge" parameters will failed, and shows "EAL: Cannot set up DMA remapping, error 28 (No space left on device)". > So they reported it on dpdk Bugzilla: https://bugs.dpdk.org/show_bug.cgi?id=1235. > > I understand this feature is to keep consistent with the kernel and not allow memory segments be merged. > The side effect is the testpmd with "--no-huge" parameters will not be able to start because the too many pages will exceed the capability of IOMMU. > Is it expected? Should we remove the --no-huge" in our testcase? > > Regards, > Xuan > >> -----Original Message----- >> From: Nipun Gupta <nipun.gupta@amd.com> >> Sent: Friday, December 30, 2022 5:59 PM >> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly >> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com >> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> >> Subject: [PATCH] vfio: do not coalesce DMA mappings >> >> At the cleanup time when dma unmap is done, linux kernel does not allow >> unmap of individual segments which were coalesced together while creating >> the DMA map for type1 IOMMU mappings. So, this change updates the >> mapping of the memory >> segments(hugepages) on a per-page basis. >> >> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> >> --- >> >> When hotplug of devices is used, multiple pages gets colaeced and a single >> mapping gets created for these pages (using APIs >> rte_memseg_contig_walk() and type1_map_contig(). On the cleanup time >> when the memory is released, the VFIO does not cleans up that memory and >> following error is observed in the eal for 2MB >> hugepages: >> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 >> >> This is because VFIO does not clear the DMA (refer API >> vfio_dma_do_unmap() - >> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1. >> c#L1330), >> where it checks the dma mapping where it checks for IOVA to free: >> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1. >> c#L1418. >> >> Thus this change updates the mapping to be created individually instead of >> colaecing them. >> >> lib/eal/linux/eal_vfio.c | 29 ----------------------------- >> 1 file changed, 29 deletions(-) >> >> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c index >> 549b86ae1d..56edccb0db 100644 >> --- a/lib/eal/linux/eal_vfio.c >> +++ b/lib/eal/linux/eal_vfio.c >> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base, >> return 1; >> } >> >> -static int >> -type1_map_contig(const struct rte_memseg_list *msl, const struct >> rte_memseg *ms, >> - size_t len, void *arg) >> -{ >> - int *vfio_container_fd = arg; >> - >> - if (msl->external) >> - return 0; >> - >> - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, >> ms->iova, >> - len, 1); >> -} >> - >> static int >> type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms, >> void *arg) >> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, >> const struct rte_memseg *ms, >> if (ms->iova == RTE_BAD_IOVA) >> return 0; >> >> - /* if IOVA mode is VA, we've already mapped the internal segments */ >> - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) >> - return 0; >> - >> return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, >> ms->iova, >> ms->len, 1); >> } >> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, >> uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int >> vfio_container_fd) { >> - if (rte_eal_iova_mode() == RTE_IOVA_VA) { >> - /* with IOVA as VA mode, we can get away with mapping >> contiguous >> - * chunks rather than going page-by-page. >> - */ >> - int ret = rte_memseg_contig_walk(type1_map_contig, >> - &vfio_container_fd); >> - if (ret) >> - return ret; >> - /* we have to continue the walk because we've skipped the >> - * external segments during the config walk. >> - */ >> - } >> return rte_memseg_walk(type1_map, &vfio_container_fd); } >> >> -- >> 2.25.1 > ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2023-06-30 1:45 ` Nipun Gupta @ 2023-06-30 5:58 ` Ding, Xuan 2023-07-04 5:13 ` Ding, Xuan 0 siblings, 1 reply; 30+ messages in thread From: Ding, Xuan @ 2023-06-30 5:58 UTC (permalink / raw) To: Nipun Gupta, dev, thomas, Burakov, Anatoly, ferruh.yigit Cc: nikhil.agarwal, He, Xingguang, Ling, WeiX Hi Nipun, Replies are inline. > -----Original Message----- > From: Nipun Gupta <nipun.gupta@amd.com> > Sent: Friday, June 30, 2023 9:46 AM > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > ferruh.yigit@amd.com > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; Ling, > WeiX <weix.ling@intel.com> > Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > > Hi Xuan, > > Thanks for pointing out the issue and figuring out the patch which introduced > this. If you have answers to below queries, please let me know: > > Is there any other test cases which tests "--no-huge" which pass? Yes, there are test cases adding "--no-huge" option to validate 4k page size in async vhost. Actually, the page size is decided by front-end, so I think this case can be removed. Previously, testpmd can start with "--no-huge" options (not sure if there are test cases). Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > > Also, if we change the "-m" option to provide lower memory, does the test > pass? "-m" option is also added and does not work. > > When you mention too many pages exceed the capability of IOMMU, you are > referring to HW capability to create multiple pages? Here it seems in case of > 4K page size we need 256K pages which is limiting the capacity? Yes, this is the result of my initial debugging. The direct impact is that this kind of testpmd cases cannot start now. If this is expected, I think we can close this defect and ignore the "--no-huge" option when start. Regards, Xuan > > Regards, > Nipun > > On 6/29/2023 1:51 PM, Ding, Xuan wrote: > > Hi Nipun, > > > > I'd like to appreciate your time reading this email. > > > > Our QA team found that since this commit "a399d7b5a994: do not > > coalesce DMA mappings" is introduced, the dpdk testpmd start with "--no- > huge" parameters will failed, and shows "EAL: Cannot set up DMA > remapping, error 28 (No space left on device)". > > So they reported it on dpdk Bugzilla: > https://bugs.dpdk.org/show_bug.cgi?id=1235. > > > > I understand this feature is to keep consistent with the kernel and not allow > memory segments be merged. > > The side effect is the testpmd with "--no-huge" parameters will not be able > to start because the too many pages will exceed the capability of IOMMU. > > Is it expected? Should we remove the --no-huge" in our testcase? > > > > Regards, > > Xuan > > > >> -----Original Message----- > >> From: Nipun Gupta <nipun.gupta@amd.com> > >> Sent: Friday, December 30, 2022 5:59 PM > >> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly > >> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com > >> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> > >> Subject: [PATCH] vfio: do not coalesce DMA mappings > >> > >> At the cleanup time when dma unmap is done, linux kernel does not > >> allow unmap of individual segments which were coalesced together > >> while creating the DMA map for type1 IOMMU mappings. So, this change > >> updates the mapping of the memory > >> segments(hugepages) on a per-page basis. > >> > >> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > >> --- > >> > >> When hotplug of devices is used, multiple pages gets colaeced and a > >> single mapping gets created for these pages (using APIs > >> rte_memseg_contig_walk() and type1_map_contig(). On the cleanup time > >> when the memory is released, the VFIO does not cleans up that memory > >> and following error is observed in the eal for 2MB > >> hugepages: > >> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 > >> > >> This is because VFIO does not clear the DMA (refer API > >> vfio_dma_do_unmap() - > >> > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type > 1. > >> c#L1330), > >> where it checks the dma mapping where it checks for IOVA to free: > >> > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type > 1. > >> c#L1418. > >> > >> Thus this change updates the mapping to be created individually > >> instead of colaecing them. > >> > >> lib/eal/linux/eal_vfio.c | 29 ----------------------------- > >> 1 file changed, 29 deletions(-) > >> > >> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c > >> index 549b86ae1d..56edccb0db 100644 > >> --- a/lib/eal/linux/eal_vfio.c > >> +++ b/lib/eal/linux/eal_vfio.c > >> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char > *sysfs_base, > >> return 1; > >> } > >> > >> -static int > >> -type1_map_contig(const struct rte_memseg_list *msl, const struct > >> rte_memseg *ms, > >> - size_t len, void *arg) > >> -{ > >> - int *vfio_container_fd = arg; > >> - > >> - if (msl->external) > >> - return 0; > >> - > >> - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > >> ms->iova, > >> - len, 1); > >> -} > >> - > >> static int > >> type1_map(const struct rte_memseg_list *msl, const struct rte_memseg > *ms, > >> void *arg) > >> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, > >> const struct rte_memseg *ms, > >> if (ms->iova == RTE_BAD_IOVA) > >> return 0; > >> > >> - /* if IOVA mode is VA, we've already mapped the internal segments > */ > >> - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) > >> - return 0; > >> - > >> return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > >> ms->iova, > >> ms->len, 1); > >> } > >> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int > vfio_container_fd, > >> uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int > >> vfio_container_fd) { > >> - if (rte_eal_iova_mode() == RTE_IOVA_VA) { > >> - /* with IOVA as VA mode, we can get away with mapping > >> contiguous > >> - * chunks rather than going page-by-page. > >> - */ > >> - int ret = rte_memseg_contig_walk(type1_map_contig, > >> - &vfio_container_fd); > >> - if (ret) > >> - return ret; > >> - /* we have to continue the walk because we've skipped the > >> - * external segments during the config walk. > >> - */ > >> - } > >> return rte_memseg_walk(type1_map, &vfio_container_fd); } > >> > >> -- > >> 2.25.1 > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2023-06-30 5:58 ` Ding, Xuan @ 2023-07-04 5:13 ` Ding, Xuan 2023-07-04 6:53 ` Gupta, Nipun 0 siblings, 1 reply; 30+ messages in thread From: Ding, Xuan @ 2023-07-04 5:13 UTC (permalink / raw) To: Nipun Gupta, dev, thomas, Burakov, Anatoly, ferruh.yigit Cc: nikhil.agarwal, He, Xingguang, Ling, WeiX Hi Nipun, > -----Original Message----- > From: Ding, Xuan > Sent: Friday, June 30, 2023 1:58 PM > To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > ferruh.yigit@amd.com > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; Ling, > WeiX <weix.ling@intel.com> > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > Hi Nipun, > > Replies are inline. > > > -----Original Message----- > > From: Nipun Gupta <nipun.gupta@amd.com> > > Sent: Friday, June 30, 2023 9:46 AM > > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > ferruh.yigit@amd.com > > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; > > Ling, WeiX <weix.ling@intel.com> > > Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > > > > Hi Xuan, > > > > Thanks for pointing out the issue and figuring out the patch which > > introduced this. If you have answers to below queries, please let me know: > > > > Is there any other test cases which tests "--no-huge" which pass? > > Yes, there are test cases adding "--no-huge" option to validate 4k page size in > async vhost. > Actually, the page size is decided by front-end, so I think this case can be > removed. > > Previously, testpmd can start with "--no-huge" options (not sure if there are > test cases). > Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > > > > > Also, if we change the "-m" option to provide lower memory, does the > > test pass? > > "-m" option is also added and does not work. > > > > > When you mention too many pages exceed the capability of IOMMU, you > > are referring to HW capability to create multiple pages? Here it seems > > in case of 4K page size we need 256K pages which is limiting the capacity? > > Yes, this is the result of my initial debugging. > The direct impact is that this kind of testpmd cases cannot start now. > If this is expected, I think we can close this defect and ignore the "--no-huge" > option when start. Any insights? Should we just ignore the "--no-huge" option and close this defect? Now we did this as a workaround. Seems no one uses the "--no-huge" option in testpmd now. Thanks, Xuan > > Regards, > Xuan > > > > > Regards, > > Nipun > > > > On 6/29/2023 1:51 PM, Ding, Xuan wrote: > > > Hi Nipun, > > > > > > I'd like to appreciate your time reading this email. > > > > > > Our QA team found that since this commit "a399d7b5a994: do not > > > coalesce DMA mappings" is introduced, the dpdk testpmd start with > > > "--no- > > huge" parameters will failed, and shows "EAL: Cannot set up DMA > > remapping, error 28 (No space left on device)". > > > So they reported it on dpdk Bugzilla: > > https://bugs.dpdk.org/show_bug.cgi?id=1235. > > > > > > I understand this feature is to keep consistent with the kernel and > > > not allow > > memory segments be merged. > > > The side effect is the testpmd with "--no-huge" parameters will not > > > be able > > to start because the too many pages will exceed the capability of IOMMU. > > > Is it expected? Should we remove the --no-huge" in our testcase? > > > > > > Regards, > > > Xuan > > > > > >> -----Original Message----- > > >> From: Nipun Gupta <nipun.gupta@amd.com> > > >> Sent: Friday, December 30, 2022 5:59 PM > > >> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly > > >> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com > > >> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> > > >> Subject: [PATCH] vfio: do not coalesce DMA mappings > > >> > > >> At the cleanup time when dma unmap is done, linux kernel does not > > >> allow unmap of individual segments which were coalesced together > > >> while creating the DMA map for type1 IOMMU mappings. So, this > > >> change updates the mapping of the memory > > >> segments(hugepages) on a per-page basis. > > >> > > >> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > > >> --- > > >> > > >> When hotplug of devices is used, multiple pages gets colaeced and a > > >> single mapping gets created for these pages (using APIs > > >> rte_memseg_contig_walk() and type1_map_contig(). On the cleanup > > >> time when the memory is released, the VFIO does not cleans up that > > >> memory and following error is observed in the eal for 2MB > > >> hugepages: > > >> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 > > >> > > >> This is because VFIO does not clear the DMA (refer API > > >> vfio_dma_do_unmap() - > > >> > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu > > _type > > 1. > > >> c#L1330), > > >> where it checks the dma mapping where it checks for IOVA to free: > > >> > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu > > _type > > 1. > > >> c#L1418. > > >> > > >> Thus this change updates the mapping to be created individually > > >> instead of colaecing them. > > >> > > >> lib/eal/linux/eal_vfio.c | 29 ----------------------------- > > >> 1 file changed, 29 deletions(-) > > >> > > >> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c > > >> index 549b86ae1d..56edccb0db 100644 > > >> --- a/lib/eal/linux/eal_vfio.c > > >> +++ b/lib/eal/linux/eal_vfio.c > > >> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char > > *sysfs_base, > > >> return 1; > > >> } > > >> > > >> -static int > > >> -type1_map_contig(const struct rte_memseg_list *msl, const struct > > >> rte_memseg *ms, > > >> - size_t len, void *arg) > > >> -{ > > >> - int *vfio_container_fd = arg; > > >> - > > >> - if (msl->external) > > >> - return 0; > > >> - > > >> - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > > >> ms->iova, > > >> - len, 1); > > >> -} > > >> - > > >> static int > > >> type1_map(const struct rte_memseg_list *msl, const struct > > >> rte_memseg > > *ms, > > >> void *arg) > > >> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, > > >> const struct rte_memseg *ms, > > >> if (ms->iova == RTE_BAD_IOVA) > > >> return 0; > > >> > > >> - /* if IOVA mode is VA, we've already mapped the internal segments > > */ > > >> - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) > > >> - return 0; > > >> - > > >> return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > > >> ms->iova, > > >> ms->len, 1); > > >> } > > >> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int > > vfio_container_fd, > > >> uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int > > >> vfio_container_fd) { > > >> - if (rte_eal_iova_mode() == RTE_IOVA_VA) { > > >> - /* with IOVA as VA mode, we can get away with mapping > > >> contiguous > > >> - * chunks rather than going page-by-page. > > >> - */ > > >> - int ret = rte_memseg_contig_walk(type1_map_contig, > > >> - &vfio_container_fd); > > >> - if (ret) > > >> - return ret; > > >> - /* we have to continue the walk because we've skipped the > > >> - * external segments during the config walk. > > >> - */ > > >> - } > > >> return rte_memseg_walk(type1_map, &vfio_container_fd); } > > >> > > >> -- > > >> 2.25.1 > > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 5:13 ` Ding, Xuan @ 2023-07-04 6:53 ` Gupta, Nipun 2023-07-04 8:06 ` Ding, Xuan 0 siblings, 1 reply; 30+ messages in thread From: Gupta, Nipun @ 2023-07-04 6:53 UTC (permalink / raw) To: Ding, Xuan, dev, thomas, Burakov, Anatoly, Yigit, Ferruh, David Marchand Cc: Agarwal, Nikhil, He, Xingguang, Ling, WeiX Hi Xuan, Please see inline. > -----Original Message----- > From: Ding, Xuan <xuan.ding@intel.com> > Sent: Tuesday, July 4, 2023 10:43 AM > To: Gupta, Nipun <Nipun.Gupta@amd.com>; dev@dpdk.org; > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, > Ferruh <Ferruh.Yigit@amd.com> > Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > Hi Nipun, > > > -----Original Message----- > > From: Ding, Xuan > > Sent: Friday, June 30, 2023 1:58 PM > > To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > ferruh.yigit@amd.com > > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; Ling, > > WeiX <weix.ling@intel.com> > > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > > > Hi Nipun, > > > > Replies are inline. > > > > > -----Original Message----- > > > From: Nipun Gupta <nipun.gupta@amd.com> > > > Sent: Friday, June 30, 2023 9:46 AM > > > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > > ferruh.yigit@amd.com > > > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; > > > Ling, WeiX <weix.ling@intel.com> > > > Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > > > > > > Hi Xuan, > > > > > > Thanks for pointing out the issue and figuring out the patch which > > > introduced this. If you have answers to below queries, please let me know: > > > > > > Is there any other test cases which tests "--no-huge" which pass? > > > > Yes, there are test cases adding "--no-huge" option to validate 4k page size in > > async vhost. > > Actually, the page size is decided by front-end, so I think this case can be > > removed. > > > > Previously, testpmd can start with "--no-huge" options (not sure if there are > > test cases). > > Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > > > > > > > > Also, if we change the "-m" option to provide lower memory, does the > > > test pass? > > > > "-m" option is also added and does not work. > > > > > > > > When you mention too many pages exceed the capability of IOMMU, you > > > are referring to HW capability to create multiple pages? Here it seems > > > in case of 4K page size we need 256K pages which is limiting the capacity? > > > > Yes, this is the result of my initial debugging. > > The direct impact is that this kind of testpmd cases cannot start now. > > If this is expected, I think we can close this defect and ignore the "--no-huge" > > option when start. > > Any insights? Should we just ignore the "--no-huge" option and close this defect? > Now we did this as a workaround. Seems no one uses the "--no-huge" option in > testpmd now. VFIO supports dma_entry_limit as a module parameter, which has a default value of U16_MAX i.e. 64K, most likely which is limiting creation of 256K entries for 4K pages here. This can be modified while inserting vfio module: modprobe vfio_iommu_type1 dma_entry_limit=1000000 You can update the test case with updating this limit, and test case shall pass. Thanks, Nipun > > Thanks, > Xuan > > > > > Regards, > > Xuan > > > > > > > > Regards, > > > Nipun > > > > > > On 6/29/2023 1:51 PM, Ding, Xuan wrote: > > > > Hi Nipun, > > > > > > > > I'd like to appreciate your time reading this email. > > > > > > > > Our QA team found that since this commit "a399d7b5a994: do not > > > > coalesce DMA mappings" is introduced, the dpdk testpmd start with > > > > "--no- > > > huge" parameters will failed, and shows "EAL: Cannot set up DMA > > > remapping, error 28 (No space left on device)". > > > > So they reported it on dpdk Bugzilla: > > > https://bugs.dpdk.org/show_bug.cgi?id=1235. > > > > > > > > I understand this feature is to keep consistent with the kernel and > > > > not allow > > > memory segments be merged. > > > > The side effect is the testpmd with "--no-huge" parameters will not > > > > be able > > > to start because the too many pages will exceed the capability of IOMMU. > > > > Is it expected? Should we remove the --no-huge" in our testcase? > > > > > > > > Regards, > > > > Xuan > > > > > > > >> -----Original Message----- > > > >> From: Nipun Gupta <nipun.gupta@amd.com> > > > >> Sent: Friday, December 30, 2022 5:59 PM > > > >> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly > > > >> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com > > > >> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> > > > >> Subject: [PATCH] vfio: do not coalesce DMA mappings > > > >> > > > >> At the cleanup time when dma unmap is done, linux kernel does not > > > >> allow unmap of individual segments which were coalesced together > > > >> while creating the DMA map for type1 IOMMU mappings. So, this > > > >> change updates the mapping of the memory > > > >> segments(hugepages) on a per-page basis. > > > >> > > > >> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > > > >> --- > > > >> > > > >> When hotplug of devices is used, multiple pages gets colaeced and a > > > >> single mapping gets created for these pages (using APIs > > > >> rte_memseg_contig_walk() and type1_map_contig(). On the cleanup > > > >> time when the memory is released, the VFIO does not cleans up that > > > >> memory and following error is observed in the eal for 2MB > > > >> hugepages: > > > >> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152 > > > >> > > > >> This is because VFIO does not clear the DMA (refer API > > > >> vfio_dma_do_unmap() - > > > >> > > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu > > > _type > > > 1. > > > >> c#L1330), > > > >> where it checks the dma mapping where it checks for IOVA to free: > > > >> > > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu > > > _type > > > 1. > > > >> c#L1418. > > > >> > > > >> Thus this change updates the mapping to be created individually > > > >> instead of colaecing them. > > > >> > > > >> lib/eal/linux/eal_vfio.c | 29 ----------------------------- > > > >> 1 file changed, 29 deletions(-) > > > >> > > > >> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c > > > >> index 549b86ae1d..56edccb0db 100644 > > > >> --- a/lib/eal/linux/eal_vfio.c > > > >> +++ b/lib/eal/linux/eal_vfio.c > > > >> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char > > > *sysfs_base, > > > >> return 1; > > > >> } > > > >> > > > >> -static int > > > >> -type1_map_contig(const struct rte_memseg_list *msl, const struct > > > >> rte_memseg *ms, > > > >> - size_t len, void *arg) > > > >> -{ > > > >> - int *vfio_container_fd = arg; > > > >> - > > > >> - if (msl->external) > > > >> - return 0; > > > >> - > > > >> - return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > > > >> ms->iova, > > > >> - len, 1); > > > >> -} > > > >> - > > > >> static int > > > >> type1_map(const struct rte_memseg_list *msl, const struct > > > >> rte_memseg > > > *ms, > > > >> void *arg) > > > >> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, > > > >> const struct rte_memseg *ms, > > > >> if (ms->iova == RTE_BAD_IOVA) > > > >> return 0; > > > >> > > > >> - /* if IOVA mode is VA, we've already mapped the internal segments > > > */ > > > >> - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) > > > >> - return 0; > > > >> - > > > >> return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, > > > >> ms->iova, > > > >> ms->len, 1); > > > >> } > > > >> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int > > > vfio_container_fd, > > > >> uint64_t vaddr, uint64_t iova, static int vfio_type1_dma_map(int > > > >> vfio_container_fd) { > > > >> - if (rte_eal_iova_mode() == RTE_IOVA_VA) { > > > >> - /* with IOVA as VA mode, we can get away with mapping > > > >> contiguous > > > >> - * chunks rather than going page-by-page. > > > >> - */ > > > >> - int ret = rte_memseg_contig_walk(type1_map_contig, > > > >> - &vfio_container_fd); > > > >> - if (ret) > > > >> - return ret; > > > >> - /* we have to continue the walk because we've skipped the > > > >> - * external segments during the config walk. > > > >> - */ > > > >> - } > > > >> return rte_memseg_walk(type1_map, &vfio_container_fd); } > > > >> > > > >> -- > > > >> 2.25.1 > > > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 6:53 ` Gupta, Nipun @ 2023-07-04 8:06 ` Ding, Xuan 2023-07-04 9:23 ` Gupta, Nipun 0 siblings, 1 reply; 30+ messages in thread From: Ding, Xuan @ 2023-07-04 8:06 UTC (permalink / raw) To: Gupta, Nipun, dev, thomas, Burakov, Anatoly, Yigit, Ferruh, David Marchand Cc: Agarwal, Nikhil, He, Xingguang, Ling, WeiX Hi Nipun, > -----Original Message----- > From: Gupta, Nipun <Nipun.Gupta@amd.com> > Sent: Tuesday, July 4, 2023 2:54 PM > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; thomas@monjalon.net; > Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; David Marchand <david.marchand@redhat.com> > Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > Hi Xuan, > > Please see inline. > > > -----Original Message----- > > From: Ding, Xuan <xuan.ding@intel.com> > > Sent: Tuesday, July 4, 2023 10:43 AM > > To: Gupta, Nipun <Nipun.Gupta@amd.com>; dev@dpdk.org; > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > Yigit, Ferruh <Ferruh.Yigit@amd.com> > > Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > > <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > > > Hi Nipun, > > > > > -----Original Message----- > > > From: Ding, Xuan > > > Sent: Friday, June 30, 2023 1:58 PM > > > To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; > > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > > ferruh.yigit@amd.com > > > Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; > > > Ling, WeiX <weix.ling@intel.com> > > > Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > > > > > > Hi Nipun, > > > > > > Replies are inline. > > > > > > > -----Original Message----- > > > > From: Nipun Gupta <nipun.gupta@amd.com> > > > > Sent: Friday, June 30, 2023 9:46 AM > > > > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > > > > thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > > > > ferruh.yigit@amd.com > > > > Cc: nikhil.agarwal@amd.com; He, Xingguang > > > > <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > > > > Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > > > > > > > > Hi Xuan, > > > > > > > > Thanks for pointing out the issue and figuring out the patch which > > > > introduced this. If you have answers to below queries, please let me know: > > > > > > > > Is there any other test cases which tests "--no-huge" which pass? > > > > > > Yes, there are test cases adding "--no-huge" option to validate 4k > > > page size in async vhost. > > > Actually, the page size is decided by front-end, so I think this > > > case can be removed. > > > > > > Previously, testpmd can start with "--no-huge" options (not sure if > > > there are test cases). > > > Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > > > > > > > > > > > Also, if we change the "-m" option to provide lower memory, does > > > > the test pass? > > > > > > "-m" option is also added and does not work. > > > > > > > > > > > When you mention too many pages exceed the capability of IOMMU, > > > > you are referring to HW capability to create multiple pages? Here > > > > it seems in case of 4K page size we need 256K pages which is limiting the > capacity? > > > > > > Yes, this is the result of my initial debugging. > > > The direct impact is that this kind of testpmd cases cannot start now. > > > If this is expected, I think we can close this defect and ignore the "--no- > huge" > > > option when start. > > > > Any insights? Should we just ignore the "--no-huge" option and close this > defect? > > Now we did this as a workaround. Seems no one uses the "--no-huge" > > option in testpmd now. > > VFIO supports dma_entry_limit as a module parameter, which has a default > value of U16_MAX i.e. 64K, most likely which is limiting creation of 256K > entries for 4K pages here. This can be modified while inserting vfio module: > modprobe vfio_iommu_type1 dma_entry_limit=1000000 Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it does not support dynamic insmod/rmmod. Does this command need to rmmod vfio first and then modprobe again? Thanks, Xuan > > You can update the test case with updating this limit, and test case shall pass. > > Thanks, > Nipun > > > > > Thanks, > > Xuan > > > > > > > > Regards, > > > Xuan > > > > > > > > > > > Regards, > > > > Nipun > > > > > > > > On 6/29/2023 1:51 PM, Ding, Xuan wrote: > > > > > Hi Nipun, > > > > > > > > > > I'd like to appreciate your time reading this email. > > > > > > > > > > Our QA team found that since this commit "a399d7b5a994: do not > > > > > coalesce DMA mappings" is introduced, the dpdk testpmd start > > > > > with > > > > > "--no- > > > > huge" parameters will failed, and shows "EAL: Cannot set up DMA > > > > remapping, error 28 (No space left on device)". > > > > > So they reported it on dpdk Bugzilla: > > > > https://bugs.dpdk.org/show_bug.cgi?id=1235. > > > > > > > > > > I understand this feature is to keep consistent with the kernel > > > > > and not allow > > > > memory segments be merged. > > > > > The side effect is the testpmd with "--no-huge" parameters will > > > > > not be able > > > > to start because the too many pages will exceed the capability of IOMMU. > > > > > Is it expected? Should we remove the --no-huge" in our testcase? > > > > > > > > > > Regards, > > > > > Xuan > > > > > > > > > >> -----Original Message----- > > > > >> From: Nipun Gupta <nipun.gupta@amd.com> > > > > >> Sent: Friday, December 30, 2022 5:59 PM > > > > >> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly > > > > >> <anatoly.burakov@intel.com>; ferruh.yigit@amd.com > > > > >> Cc: nikhil.agarwal@amd.com; Nipun Gupta <nipun.gupta@amd.com> > > > > >> Subject: [PATCH] vfio: do not coalesce DMA mappings > > > > >> > > > > >> At the cleanup time when dma unmap is done, linux kernel does > > > > >> not allow unmap of individual segments which were coalesced > > > > >> together while creating the DMA map for type1 IOMMU mappings. > > > > >> So, this change updates the mapping of the memory > > > > >> segments(hugepages) on a per-page basis. > > > > >> > > > > >> Signed-off-by: Nipun Gupta <nipun.gupta@amd.com> > > > > >> --- > > > > >> > > > > >> When hotplug of devices is used, multiple pages gets colaeced > > > > >> and a single mapping gets created for these pages (using APIs > > > > >> rte_memseg_contig_walk() and type1_map_contig(). On the cleanup > > > > >> time when the memory is released, the VFIO does not cleans up > > > > >> that memory and following error is observed in the eal for 2MB > > > > >> hugepages: > > > > >> EAL: Unexpected size 0 of DMA remapping cleared instead of > > > > >> 2097152 > > > > >> > > > > >> This is because VFIO does not clear the DMA (refer API > > > > >> vfio_dma_do_unmap() - > > > > >> > > > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_i > > > > ommu > > > > _type > > > > 1. > > > > >> c#L1330), > > > > >> where it checks the dma mapping where it checks for IOVA to free: > > > > >> > > > > https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_i > > > > ommu > > > > _type > > > > 1. > > > > >> c#L1418. > > > > >> > > > > >> Thus this change updates the mapping to be created individually > > > > >> instead of colaecing them. > > > > >> > > > > >> lib/eal/linux/eal_vfio.c | 29 ----------------------------- > > > > >> 1 file changed, 29 deletions(-) > > > > >> > > > > >> diff --git a/lib/eal/linux/eal_vfio.c > > > > >> b/lib/eal/linux/eal_vfio.c index 549b86ae1d..56edccb0db 100644 > > > > >> --- a/lib/eal/linux/eal_vfio.c > > > > >> +++ b/lib/eal/linux/eal_vfio.c > > > > >> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char > > > > *sysfs_base, > > > > >> return 1; > > > > >> } > > > > >> > > > > >> -static int > > > > >> -type1_map_contig(const struct rte_memseg_list *msl, const > > > > >> struct rte_memseg *ms, > > > > >> - size_t len, void *arg) > > > > >> -{ > > > > >> - int *vfio_container_fd = arg; > > > > >> - > > > > >> - if (msl->external) > > > > >> - return 0; > > > > >> - > > > > >> - return vfio_type1_dma_mem_map(*vfio_container_fd, ms- > >addr_64, > > > > >> ms->iova, > > > > >> - len, 1); > > > > >> -} > > > > >> - > > > > >> static int > > > > >> type1_map(const struct rte_memseg_list *msl, const struct > > > > >> rte_memseg > > > > *ms, > > > > >> void *arg) > > > > >> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list > > > > >> *msl, const struct rte_memseg *ms, > > > > >> if (ms->iova == RTE_BAD_IOVA) > > > > >> return 0; > > > > >> > > > > >> - /* if IOVA mode is VA, we've already mapped the internal > segments > > > > */ > > > > >> - if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA) > > > > >> - return 0; > > > > >> - > > > > >> return vfio_type1_dma_mem_map(*vfio_container_fd, > > > > >> ms->addr_64, > > > > >> ms->iova, > > > > >> ms->len, 1); > > > > >> } > > > > >> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int > > > > vfio_container_fd, > > > > >> uint64_t vaddr, uint64_t iova, static int > > > > >> vfio_type1_dma_map(int > > > > >> vfio_container_fd) { > > > > >> - if (rte_eal_iova_mode() == RTE_IOVA_VA) { > > > > >> - /* with IOVA as VA mode, we can get away with > mapping > > > > >> contiguous > > > > >> - * chunks rather than going page-by-page. > > > > >> - */ > > > > >> - int ret = rte_memseg_contig_walk(type1_map_contig, > > > > >> - &vfio_container_fd); > > > > >> - if (ret) > > > > >> - return ret; > > > > >> - /* we have to continue the walk because we've > skipped the > > > > >> - * external segments during the config walk. > > > > >> - */ > > > > >> - } > > > > >> return rte_memseg_walk(type1_map, &vfio_container_fd); } > > > > >> > > > > >> -- > > > > >> 2.25.1 > > > > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 8:06 ` Ding, Xuan @ 2023-07-04 9:23 ` Gupta, Nipun 2023-07-04 14:09 ` Thomas Monjalon 2023-07-04 15:11 ` Ding, Xuan 0 siblings, 2 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-07-04 9:23 UTC (permalink / raw) To: Ding, Xuan, dev, thomas, Burakov, Anatoly, Yigit, Ferruh, David Marchand Cc: Agarwal, Nikhil, He, Xingguang, Ling, WeiX Hi Xuan, On 7/4/2023 1:36 PM, Ding, Xuan wrote: > Hi Nipun, > >> -----Original Message----- >> From: Gupta, Nipun <Nipun.Gupta@amd.com> >> Sent: Tuesday, July 4, 2023 2:54 PM >> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; thomas@monjalon.net; >> Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh >> <Ferruh.Yigit@amd.com>; David Marchand <david.marchand@redhat.com> >> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang >> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >> >> Hi Xuan, >> >> Please see inline. >> >>> -----Original Message----- >>> From: Ding, Xuan <xuan.ding@intel.com> >>> Sent: Tuesday, July 4, 2023 10:43 AM >>> To: Gupta, Nipun <Nipun.Gupta@amd.com>; dev@dpdk.org; >>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>> Yigit, Ferruh <Ferruh.Yigit@amd.com> >>> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang >>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >>> >>> Hi Nipun, >>> >>>> -----Original Message----- >>>> From: Ding, Xuan >>>> Sent: Friday, June 30, 2023 1:58 PM >>>> To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; >>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>> ferruh.yigit@amd.com >>>> Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; >>>> Ling, WeiX <weix.ling@intel.com> >>>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >>>> >>>> Hi Nipun, >>>> >>>> Replies are inline. >>>> >>>>> -----Original Message----- >>>>> From: Nipun Gupta <nipun.gupta@amd.com> >>>>> Sent: Friday, June 30, 2023 9:46 AM >>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; >>>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>>> ferruh.yigit@amd.com >>>>> Cc: nikhil.agarwal@amd.com; He, Xingguang >>>>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >>>>> Subject: Re: [PATCH] vfio: do not coalesce DMA mappings >>>>> >>>>> Hi Xuan, >>>>> >>>>> Thanks for pointing out the issue and figuring out the patch which >>>>> introduced this. If you have answers to below queries, please let me know: >>>>> >>>>> Is there any other test cases which tests "--no-huge" which pass? >>>> >>>> Yes, there are test cases adding "--no-huge" option to validate 4k >>>> page size in async vhost. >>>> Actually, the page size is decided by front-end, so I think this >>>> case can be removed. >>>> >>>> Previously, testpmd can start with "--no-huge" options (not sure if >>>> there are test cases). >>>> Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i >>>> >>>>> >>>>> Also, if we change the "-m" option to provide lower memory, does >>>>> the test pass? >>>> >>>> "-m" option is also added and does not work. >>>> >>>>> >>>>> When you mention too many pages exceed the capability of IOMMU, >>>>> you are referring to HW capability to create multiple pages? Here >>>>> it seems in case of 4K page size we need 256K pages which is limiting the >> capacity? >>>> >>>> Yes, this is the result of my initial debugging. >>>> The direct impact is that this kind of testpmd cases cannot start now. >>>> If this is expected, I think we can close this defect and ignore the "--no- >> huge" >>>> option when start. >>> >>> Any insights? Should we just ignore the "--no-huge" option and close this >> defect? >>> Now we did this as a workaround. Seems no one uses the "--no-huge" >>> option in testpmd now. >> >> VFIO supports dma_entry_limit as a module parameter, which has a default >> value of U16_MAX i.e. 64K, most likely which is limiting creation of 256K >> entries for 4K pages here. This can be modified while inserting vfio module: >> modprobe vfio_iommu_type1 dma_entry_limit=1000000 > > Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. > The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it does not support dynamic insmod/rmmod. > > Does this command need to rmmod vfio first and then modprobe again? > If it is inserted as a module then you can remove using rmmod and then modprobe again with the dma_entry_limit parameter. Also note, vfio_iommu_type1 is the module which is limiting the entries to 64K, so this module needs to be inserted again providing the dma_entry_limit module param. In case the module is built-in you can provide via kernel command line parameter (ref: https://www.kernel.org/doc/html/v4.12/admin-guide/kernel-parameters.html). As per this ref document, "vfio_iommu_type1.dma_entry_limit=1000000" should be used in the bootargs to set the module parameters. FYI.. DPDK documentation also mentions the limitation at: https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html Thanks, Nipun ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 9:23 ` Gupta, Nipun @ 2023-07-04 14:09 ` Thomas Monjalon 2023-07-04 16:39 ` Gupta, Nipun 2023-07-04 15:11 ` Ding, Xuan 1 sibling, 1 reply; 30+ messages in thread From: Thomas Monjalon @ 2023-07-04 14:09 UTC (permalink / raw) To: Ding, Xuan, Burakov, Anatoly, Gupta, Nipun Cc: dev, Yigit, Ferruh, David Marchand, Agarwal, Nikhil, He, Xingguang, Ling, WeiX 04/07/2023 11:23, Gupta, Nipun: > On 7/4/2023 1:36 PM, Ding, Xuan wrote: >> From: Gupta, Nipun <Nipun.Gupta@amd.com> >>> From: Ding, Xuan <xuan.ding@intel.com> >>>> From: Ding, Xuan >>>>> From: Nipun Gupta <nipun.gupta@amd.com> > >>>>> Hi Xuan, > >>>>> > >>>>> Thanks for pointing out the issue and figuring out the patch which > >>>>> introduced this. If you have answers to below queries, please let me know: > >>>>> > >>>>> Is there any other test cases which tests "--no-huge" which pass? > >>>> > >>>> Yes, there are test cases adding "--no-huge" option to validate 4k > >>>> page size in async vhost. > >>>> Actually, the page size is decided by front-end, so I think this > >>>> case can be removed. > >>>> > >>>> Previously, testpmd can start with "--no-huge" options (not sure if > >>>> there are test cases). > >>>> Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > >>>> > >>>>> > >>>>> Also, if we change the "-m" option to provide lower memory, does > >>>>> the test pass? > >>>> > >>>> "-m" option is also added and does not work. > >>>> > >>>>> > >>>>> When you mention too many pages exceed the capability of IOMMU, > >>>>> you are referring to HW capability to create multiple pages? Here > >>>>> it seems in case of 4K page size we need 256K pages which is limiting the > >> capacity? > >>>> > >>>> Yes, this is the result of my initial debugging. > >>>> The direct impact is that this kind of testpmd cases cannot start now. > >>>> If this is expected, I think we can close this defect and ignore the "--no- > >> huge" > >>>> option when start. > >>> > >>> Any insights? Should we just ignore the "--no-huge" option and close this > >> defect? > >>> Now we did this as a workaround. Seems no one uses the "--no-huge" > >>> option in testpmd now. > >> > >> VFIO supports dma_entry_limit as a module parameter, which has a default > >> value of U16_MAX i.e. 64K, most likely which is limiting creation of 256K > >> entries for 4K pages here. This can be modified while inserting vfio module: > >> modprobe vfio_iommu_type1 dma_entry_limit=1000000 > > > > Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. > > The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it does not support dynamic insmod/rmmod. > > > > Does this command need to rmmod vfio first and then modprobe again? > > > > If it is inserted as a module then you can remove using rmmod and then > modprobe again with the dma_entry_limit parameter. Also note, > vfio_iommu_type1 is the module which is limiting the entries to 64K, so > this module needs to be inserted again providing the dma_entry_limit > module param. > > In case the module is built-in you can provide via kernel command line > parameter (ref: > https://www.kernel.org/doc/html/v4.12/admin-guide/kernel-parameters.html). > As per this ref document, "vfio_iommu_type1.dma_entry_limit=1000000" > should be used in the bootargs to set the module parameters. > > FYI.. DPDK documentation also mentions the limitation at: > https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html Yes the parameter is discussed in https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#vfio-memory-mapping-limits but it does not mention we may need to decrease it with --no-huge. Please could you add this to the documentation? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 14:09 ` Thomas Monjalon @ 2023-07-04 16:39 ` Gupta, Nipun 0 siblings, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-07-04 16:39 UTC (permalink / raw) To: Thomas Monjalon, Ding, Xuan, Burakov, Anatoly Cc: dev, Yigit, Ferruh, David Marchand, Agarwal, Nikhil, He, Xingguang, Ling, WeiX On 7/4/2023 7:39 PM, Thomas Monjalon wrote: > 04/07/2023 11:23, Gupta, Nipun: >> On 7/4/2023 1:36 PM, Ding, Xuan wrote: >>> From: Gupta, Nipun <Nipun.Gupta@amd.com> >>>> From: Ding, Xuan <xuan.ding@intel.com> >>>>> From: Ding, Xuan >>>>>> From: Nipun Gupta <nipun.gupta@amd.com> >>>>>>> Hi Xuan, >>>>>>> >>>>>>> Thanks for pointing out the issue and figuring out the patch which >>>>>>> introduced this. If you have answers to below queries, please let me know: >>>>>>> >>>>>>> Is there any other test cases which tests "--no-huge" which pass? >>>>>> >>>>>> Yes, there are test cases adding "--no-huge" option to validate 4k >>>>>> page size in async vhost. >>>>>> Actually, the page size is decided by front-end, so I think this >>>>>> case can be removed. >>>>>> >>>>>> Previously, testpmd can start with "--no-huge" options (not sure if >>>>>> there are test cases). >>>>>> Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i >>>>>> >>>>>>> >>>>>>> Also, if we change the "-m" option to provide lower memory, does >>>>>>> the test pass? >>>>>> >>>>>> "-m" option is also added and does not work. >>>>>> >>>>>>> >>>>>>> When you mention too many pages exceed the capability of IOMMU, >>>>>>> you are referring to HW capability to create multiple pages? Here >>>>>>> it seems in case of 4K page size we need 256K pages which is limiting the >>>> capacity? >>>>>> >>>>>> Yes, this is the result of my initial debugging. >>>>>> The direct impact is that this kind of testpmd cases cannot start now. >>>>>> If this is expected, I think we can close this defect and ignore the "--no- >>>> huge" >>>>>> option when start. >>>>> >>>>> Any insights? Should we just ignore the "--no-huge" option and close this >>>> defect? >>>>> Now we did this as a workaround. Seems no one uses the "--no-huge" >>>>> option in testpmd now. >>>> >>>> VFIO supports dma_entry_limit as a module parameter, which has a default >>>> value of U16_MAX i.e. 64K, most likely which is limiting creation of 256K >>>> entries for 4K pages here. This can be modified while inserting vfio module: >>>> modprobe vfio_iommu_type1 dma_entry_limit=1000000 >>> >>> Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. >>> The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it does not support dynamic insmod/rmmod. >>> >>> Does this command need to rmmod vfio first and then modprobe again? >>> >> >> If it is inserted as a module then you can remove using rmmod and then >> modprobe again with the dma_entry_limit parameter. Also note, >> vfio_iommu_type1 is the module which is limiting the entries to 64K, so >> this module needs to be inserted again providing the dma_entry_limit >> module param. >> >> In case the module is built-in you can provide via kernel command line >> parameter (ref: >> https://www.kernel.org/doc/html/v4.12/admin-guide/kernel-parameters.html). >> As per this ref document, "vfio_iommu_type1.dma_entry_limit=1000000" >> should be used in the bootargs to set the module parameters. >> >> FYI.. DPDK documentation also mentions the limitation at: >> https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html > > Yes the parameter is discussed in > https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#vfio-memory-mapping-limits > but it does not mention we may need to decrease it with --no-huge. > Please could you add this to the documentation? sure! Ill send out a patch for this. > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 9:23 ` Gupta, Nipun 2023-07-04 14:09 ` Thomas Monjalon @ 2023-07-04 15:11 ` Ding, Xuan 2023-07-04 16:42 ` Gupta, Nipun 1 sibling, 1 reply; 30+ messages in thread From: Ding, Xuan @ 2023-07-04 15:11 UTC (permalink / raw) To: Gupta, Nipun, dev, thomas, Burakov, Anatoly, Yigit, Ferruh, David Marchand Cc: Agarwal, Nikhil, He, Xingguang, Ling, WeiX Hi Nipun, > -----Original Message----- > From: Gupta, Nipun <nipun.gupta@amd.com> > Sent: Tuesday, July 4, 2023 5:23 PM > To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; thomas@monjalon.net; > Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh > <Ferruh.Yigit@amd.com>; David Marchand <david.marchand@redhat.com> > Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > > Hi Xuan, > > On 7/4/2023 1:36 PM, Ding, Xuan wrote: > > Hi Nipun, > > > >> -----Original Message----- > >> From: Gupta, Nipun <Nipun.Gupta@amd.com> > >> Sent: Tuesday, July 4, 2023 2:54 PM > >> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > >> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > >> Yigit, Ferruh <Ferruh.Yigit@amd.com>; David Marchand > >> <david.marchand@redhat.com> > >> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > >> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > >> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > >> > >> Hi Xuan, > >> > >> Please see inline. > >> > >>> -----Original Message----- > >>> From: Ding, Xuan <xuan.ding@intel.com> > >>> Sent: Tuesday, July 4, 2023 10:43 AM > >>> To: Gupta, Nipun <Nipun.Gupta@amd.com>; dev@dpdk.org; > >>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > >>> Yigit, Ferruh <Ferruh.Yigit@amd.com> > >>> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang > >>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > >>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > >>> > >>> Hi Nipun, > >>> > >>>> -----Original Message----- > >>>> From: Ding, Xuan > >>>> Sent: Friday, June 30, 2023 1:58 PM > >>>> To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; > >>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > >>>> ferruh.yigit@amd.com > >>>> Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; > >>>> Ling, WeiX <weix.ling@intel.com> > >>>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings > >>>> > >>>> Hi Nipun, > >>>> > >>>> Replies are inline. > >>>> > >>>>> -----Original Message----- > >>>>> From: Nipun Gupta <nipun.gupta@amd.com> > >>>>> Sent: Friday, June 30, 2023 9:46 AM > >>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; > >>>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; > >>>>> ferruh.yigit@amd.com > >>>>> Cc: nikhil.agarwal@amd.com; He, Xingguang > >>>>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> > >>>>> Subject: Re: [PATCH] vfio: do not coalesce DMA mappings > >>>>> > >>>>> Hi Xuan, > >>>>> > >>>>> Thanks for pointing out the issue and figuring out the patch which > >>>>> introduced this. If you have answers to below queries, please let me > know: > >>>>> > >>>>> Is there any other test cases which tests "--no-huge" which pass? > >>>> > >>>> Yes, there are test cases adding "--no-huge" option to validate 4k > >>>> page size in async vhost. > >>>> Actually, the page size is decided by front-end, so I think this > >>>> case can be removed. > >>>> > >>>> Previously, testpmd can start with "--no-huge" options (not sure if > >>>> there are test cases). > >>>> Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i > >>>> > >>>>> > >>>>> Also, if we change the "-m" option to provide lower memory, does > >>>>> the test pass? > >>>> > >>>> "-m" option is also added and does not work. > >>>> > >>>>> > >>>>> When you mention too many pages exceed the capability of IOMMU, > >>>>> you are referring to HW capability to create multiple pages? Here > >>>>> it seems in case of 4K page size we need 256K pages which is > >>>>> limiting the > >> capacity? > >>>> > >>>> Yes, this is the result of my initial debugging. > >>>> The direct impact is that this kind of testpmd cases cannot start now. > >>>> If this is expected, I think we can close this defect and ignore > >>>> the "--no- > >> huge" > >>>> option when start. > >>> > >>> Any insights? Should we just ignore the "--no-huge" option and close > >>> this > >> defect? > >>> Now we did this as a workaround. Seems no one uses the "--no-huge" > >>> option in testpmd now. > >> > >> VFIO supports dma_entry_limit as a module parameter, which has a > >> default value of U16_MAX i.e. 64K, most likely which is limiting > >> creation of 256K entries for 4K pages here. This can be modified while > inserting vfio module: > >> modprobe vfio_iommu_type1 dma_entry_limit=1000000 > > > > Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. > > The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it > does not support dynamic insmod/rmmod. > > > > Does this command need to rmmod vfio first and then modprobe again? > > > > If it is inserted as a module then you can remove using rmmod and then > modprobe again with the dma_entry_limit parameter. Also note, > vfio_iommu_type1 is the module which is limiting the entries to 64K, so this > module needs to be inserted again providing the dma_entry_limit module > param. > > In case the module is built-in you can provide via kernel command line > parameter (ref: > https://www.kernel.org/doc/html/v4.12/admin-guide/kernel- > parameters.html). > As per this ref document, "vfio_iommu_type1.dma_entry_limit=1000000" > should be used in the bootargs to set the module parameters. > > FYI.. DPDK documentation also mentions the limitation at: > https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html Thanks for pointing out these references. Add supplement for configuring build-in vfio module: Except adding " vfio_iommu_type1.dma_entry_limit=1000000" in bootargs, we can use kernel command line: "echo 1000000 > /sys/module/vfio_iommu_type1/parameters/dma_entry_limit". Both methods works, thanks again. Regards, Xuan > > Thanks, > Nipun ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] vfio: do not coalesce DMA mappings 2023-07-04 15:11 ` Ding, Xuan @ 2023-07-04 16:42 ` Gupta, Nipun 0 siblings, 0 replies; 30+ messages in thread From: Gupta, Nipun @ 2023-07-04 16:42 UTC (permalink / raw) To: Ding, Xuan, dev, thomas, Burakov, Anatoly, Yigit, Ferruh, David Marchand Cc: Agarwal, Nikhil, He, Xingguang, Ling, WeiX On 7/4/2023 8:41 PM, Ding, Xuan wrote: > Hi Nipun, > >> -----Original Message----- >> From: Gupta, Nipun <nipun.gupta@amd.com> >> Sent: Tuesday, July 4, 2023 5:23 PM >> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; thomas@monjalon.net; >> Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh >> <Ferruh.Yigit@amd.com>; David Marchand <david.marchand@redhat.com> >> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang >> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >> Subject: Re: [PATCH] vfio: do not coalesce DMA mappings >> >> Hi Xuan, >> >> On 7/4/2023 1:36 PM, Ding, Xuan wrote: >>> Hi Nipun, >>> >>>> -----Original Message----- >>>> From: Gupta, Nipun <Nipun.Gupta@amd.com> >>>> Sent: Tuesday, July 4, 2023 2:54 PM >>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; >>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>> Yigit, Ferruh <Ferruh.Yigit@amd.com>; David Marchand >>>> <david.marchand@redhat.com> >>>> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang >>>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >>>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >>>> >>>> Hi Xuan, >>>> >>>> Please see inline. >>>> >>>>> -----Original Message----- >>>>> From: Ding, Xuan <xuan.ding@intel.com> >>>>> Sent: Tuesday, July 4, 2023 10:43 AM >>>>> To: Gupta, Nipun <Nipun.Gupta@amd.com>; dev@dpdk.org; >>>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>>> Yigit, Ferruh <Ferruh.Yigit@amd.com> >>>>> Cc: Agarwal, Nikhil <nikhil.agarwal@amd.com>; He, Xingguang >>>>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >>>>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >>>>> >>>>> Hi Nipun, >>>>> >>>>>> -----Original Message----- >>>>>> From: Ding, Xuan >>>>>> Sent: Friday, June 30, 2023 1:58 PM >>>>>> To: Nipun Gupta <nipun.gupta@amd.com>; dev@dpdk.org; >>>>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>>>> ferruh.yigit@amd.com >>>>>> Cc: nikhil.agarwal@amd.com; He, Xingguang <xingguang.he@intel.com>; >>>>>> Ling, WeiX <weix.ling@intel.com> >>>>>> Subject: RE: [PATCH] vfio: do not coalesce DMA mappings >>>>>> >>>>>> Hi Nipun, >>>>>> >>>>>> Replies are inline. >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Nipun Gupta <nipun.gupta@amd.com> >>>>>>> Sent: Friday, June 30, 2023 9:46 AM >>>>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org; >>>>>>> thomas@monjalon.net; Burakov, Anatoly <anatoly.burakov@intel.com>; >>>>>>> ferruh.yigit@amd.com >>>>>>> Cc: nikhil.agarwal@amd.com; He, Xingguang >>>>>>> <xingguang.he@intel.com>; Ling, WeiX <weix.ling@intel.com> >>>>>>> Subject: Re: [PATCH] vfio: do not coalesce DMA mappings >>>>>>> >>>>>>> Hi Xuan, >>>>>>> >>>>>>> Thanks for pointing out the issue and figuring out the patch which >>>>>>> introduced this. If you have answers to below queries, please let me >> know: >>>>>>> >>>>>>> Is there any other test cases which tests "--no-huge" which pass? >>>>>> >>>>>> Yes, there are test cases adding "--no-huge" option to validate 4k >>>>>> page size in async vhost. >>>>>> Actually, the page size is decided by front-end, so I think this >>>>>> case can be removed. >>>>>> >>>>>> Previously, testpmd can start with "--no-huge" options (not sure if >>>>>> there are test cases). >>>>>> Cmd: ./build/app/dpdk-testpmd -l 5-6 -n 4 --no-huge -m 1024 -- -i >>>>>> >>>>>>> >>>>>>> Also, if we change the "-m" option to provide lower memory, does >>>>>>> the test pass? >>>>>> >>>>>> "-m" option is also added and does not work. >>>>>> >>>>>>> >>>>>>> When you mention too many pages exceed the capability of IOMMU, >>>>>>> you are referring to HW capability to create multiple pages? Here >>>>>>> it seems in case of 4K page size we need 256K pages which is >>>>>>> limiting the >>>> capacity? >>>>>> >>>>>> Yes, this is the result of my initial debugging. >>>>>> The direct impact is that this kind of testpmd cases cannot start now. >>>>>> If this is expected, I think we can close this defect and ignore >>>>>> the "--no- >>>> huge" >>>>>> option when start. >>>>> >>>>> Any insights? Should we just ignore the "--no-huge" option and close >>>>> this >>>> defect? >>>>> Now we did this as a workaround. Seems no one uses the "--no-huge" >>>>> option in testpmd now. >>>> >>>> VFIO supports dma_entry_limit as a module parameter, which has a >>>> default value of U16_MAX i.e. 64K, most likely which is limiting >>>> creation of 256K entries for 4K pages here. This can be modified while >> inserting vfio module: >>>> modprobe vfio_iommu_type1 dma_entry_limit=1000000 >>> >>> Thanks for your suggestion. I tried it on ubuntu 22.04 but it does not work. >>> The reason I think is vfio-pci is build-in in kernel driver (since 20.04) and it >> does not support dynamic insmod/rmmod. >>> >>> Does this command need to rmmod vfio first and then modprobe again? >>> >> >> If it is inserted as a module then you can remove using rmmod and then >> modprobe again with the dma_entry_limit parameter. Also note, >> vfio_iommu_type1 is the module which is limiting the entries to 64K, so this >> module needs to be inserted again providing the dma_entry_limit module >> param. >> >> In case the module is built-in you can provide via kernel command line >> parameter (ref: >> https://www.kernel.org/doc/html/v4.12/admin-guide/kernel- >> parameters.html). >> As per this ref document, "vfio_iommu_type1.dma_entry_limit=1000000" >> should be used in the bootargs to set the module parameters. >> >> FYI.. DPDK documentation also mentions the limitation at: >> https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html > > Thanks for pointing out these references. > > Add supplement for configuring build-in vfio module: > Except adding " vfio_iommu_type1.dma_entry_limit=1000000" in bootargs, > we can use kernel command line: "echo 1000000 > /sys/module/vfio_iommu_type1/parameters/dma_entry_limit". > > Both methods works, thanks again. Good to know this works! > > Regards, > Xuan > >> >> Thanks, >> Nipun ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2023-07-04 16:42 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-12-30 9:58 [PATCH] vfio: do not coalesce DMA mappings Nipun Gupta 2023-01-04 5:19 ` [PATCH v2] " Nipun Gupta 2023-02-02 10:48 ` David Marchand 2023-02-07 8:56 ` Gupta, Nipun 2023-04-04 14:57 ` Gupta, Nipun 2023-04-04 15:13 ` Burakov, Anatoly 2023-04-04 16:32 ` Nipun Gupta 2023-04-05 14:16 ` Burakov, Anatoly 2023-04-05 15:06 ` Gupta, Nipun 2023-04-24 15:22 ` David Marchand 2023-04-24 16:10 ` Stephen Hemminger 2023-04-24 16:16 ` Gupta, Nipun 2023-05-10 12:58 ` Nipun Gupta 2023-05-11 14:08 ` Burakov, Anatoly 2023-04-11 15:13 ` Thomas Monjalon 2023-04-11 16:51 ` Gupta, Nipun 2023-04-07 6:13 ` Nipun Gupta 2023-04-07 7:26 ` David Marchand 2023-05-15 11:16 ` David Marchand 2023-06-29 8:21 ` [PATCH] " Ding, Xuan 2023-06-30 1:45 ` Nipun Gupta 2023-06-30 5:58 ` Ding, Xuan 2023-07-04 5:13 ` Ding, Xuan 2023-07-04 6:53 ` Gupta, Nipun 2023-07-04 8:06 ` Ding, Xuan 2023-07-04 9:23 ` Gupta, Nipun 2023-07-04 14:09 ` Thomas Monjalon 2023-07-04 16:39 ` Gupta, Nipun 2023-07-04 15:11 ` Ding, Xuan 2023-07-04 16:42 ` Gupta, Nipun
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).