From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Nithin Dabilpuram <nithind1988@gmail.com>
Cc: Jerin Jacob <jerinj@marvell.com>, dev@dpdk.org, stable@dpdk.org
Subject: Re: [dpdk-stable] [dpdk-dev] [EXT] Re: [PATCH 2/2] vfio: fix partial DMA unmapping for VFIO type1
Date: Wed, 28 Oct 2020 13:04:26 +0000 [thread overview]
Message-ID: <8cd0fd55-1af6-543d-50c1-80c5f4883550@intel.com> (raw)
In-Reply-To: <X5F3a0zdfASX2feK@gmail.com>
On 22-Oct-20 1:13 PM, Nithin Dabilpuram wrote:
> Ping.
>
> On Mon, Oct 19, 2020 at 03:13:15PM +0530, Nithin Dabilpuram wrote:
>> On Sat, Oct 17, 2020 at 05:14:55PM +0100, Burakov, Anatoly wrote:
>>> On 16-Oct-20 8:10 AM, Nithin Dabilpuram wrote:
>>>> On Thu, Oct 15, 2020 at 04:10:31PM +0100, Burakov, Anatoly wrote:
>>>>> On 15-Oct-20 12:57 PM, Nithin Dabilpuram wrote:
>>>>>> On Thu, Oct 15, 2020 at 3:31 PM Burakov, Anatoly
>>>>>> <anatoly.burakov@intel.com> wrote:
>>>>>>>
>>>>>>> On 15-Oct-20 7:09 AM, Nithin Dabilpuram wrote:
>>>>>>>> On Wed, Oct 14, 2020 at 04:07:10PM +0100, Burakov, Anatoly wrote:
>>>>>>>>> External Email
>>>>>>>>>
>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>> On 12-Oct-20 9:11 AM, Nithin Dabilpuram wrote:
>>>>>>>>>> Partial unmapping is not supported for VFIO IOMMU type1
>>>>>>>>>> by kernel. Though kernel gives return as zero, the unmapped size
>>>>>>>>>> returned will not be same as expected. So check for
>>>>>>>>>> returned unmap size and return error.
>>>>>>>>>>
>>>>>>>>>> For case of DMA map/unmap triggered by heap allocations,
>>>>>>>>>> maintain granularity of memseg page size so that heap
>>>>>>>>>> expansion and contraction does not have this issue.
>>>>>>>>>
>>>>>>>>> This is quite unfortunate, because there was a different bug that had to do
>>>>>>>>> with kernel having a very limited number of mappings available [1], as a
>>>>>>>>> result of which the page concatenation code was added.
>>>>>>>>>
>>>>>>>>> It should therefore be documented that the dma_entry_limit parameter should
>>>>>>>>> be adjusted should the user run out of the DMA entries.
>>>>>>>>>
>>>>>>>>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_lkml_155414977872.12780.13728555131525362206.stgit-40gimli.home_T_&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=FZ_tPCbgFOh18zwRPO9H0yDx8VW38vuapifdDfc8SFQ&m=3GMg-634_cdUCY4WpQPwjzZ_S4ckuMHOnt2FxyyjXMk&s=TJLzppkaDS95VGyRHX2hzflQfb9XLK0OiOszSXoeXKk&e=
>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>>>>> RTE_LOG(ERR, EAL, " cannot clear DMA remapping, error %i (%s)\n",
>>>>>>>>>> errno, strerror(errno));
>>>>>>>>>> return -1;
>>>>>>>>>> + } else if (dma_unmap.size != len) {
>>>>>>>>>> + RTE_LOG(ERR, EAL, " unexpected size %"PRIu64" of DMA "
>>>>>>>>>> + "remapping cleared instead of %"PRIu64"\n",
>>>>>>>>>> + (uint64_t)dma_unmap.size, len);
>>>>>>>>>> + rte_errno = EIO;
>>>>>>>>>> + return -1;
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> @@ -1853,6 +1869,12 @@ container_dma_unmap(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
>>>>>>>>>> /* we're partially unmapping a previously mapped region, so we
>>>>>>>>>> * need to split entry into two.
>>>>>>>>>> */
>>>>>>>>>> + if (!vfio_cfg->vfio_iommu_type->partial_unmap) {
>>>>>>>>>> + RTE_LOG(DEBUG, EAL, "DMA partial unmap unsupported\n");
>>>>>>>>>> + rte_errno = ENOTSUP;
>>>>>>>>>> + ret = -1;
>>>>>>>>>> + goto out;
>>>>>>>>>> + }
>>>>>>>>>
>>>>>>>>> How would we ever arrive here if we never do more than 1 page worth of
>>>>>>>>> memory anyway? I don't think this is needed.
>>>>>>>>
>>>>>>>> container_dma_unmap() is called by user via rte_vfio_container_dma_unmap()
>>>>>>>> and when he maps we don't split it as we don't about his memory.
>>>>>>>> So if he maps multiple pages and tries to unmap partially, then we should fail.
>>>>>>>
>>>>>>> Should we map it in page granularity then, instead of adding this
>>>>>>> discrepancy between EAL and user mapping? I.e. instead of adding a
>>>>>>> workaround, how about we just do the same thing for user mem mappings?
>>>>>>>
>>>>>> In heap mapping's we map and unmap it at huge page granularity as we will always
>>>>>> maintain that.
>>>>>>
>>>>>> But here I think we don't know if user's allocation is huge page or
>>>>>> collection of system
>>>>>> pages. Only thing we can do here is map it at system page granularity which
>>>>>> could waste entries if he say really is working with hugepages. Isn't ?
>>>>>>
>>>>>
>>>>> Yeah we do. The API mandates the pages granularity, and it will check
>>>>> against page size and number of IOVA entries, so yes, we do enforce the fact
>>>>> that the IOVA addresses supplied by the user have to be page addresses.
>>>>
>>>> If I see rte_vfio_container_dma_map(), there is no mention of Huge page size
>>>> user is providing or we computing. He can call rte_vfio_container_dma_map()
>>>> with 1GB huge page or 4K system page.
>>>>
>>>> Am I missing something ?
>>>
>>> Are you suggesting that a DMA mapping for hugepage-backed memory will be
>>> made at system page size granularity? E.g. will a 1GB page-backed segment be
>>> mapped for DMA as a contiguous 4K-based block?
>>
>> I'm not suggesting anything. My only thought is how to solve below problem.
>> Say application does the following.
>>
>> #1 Allocate 1GB memory from huge page or some external mem.
>> #2 Do rte_vfio_container_dma_map(RTE_VFIO_DEFAULT_CONTAINER_FD, mem, mem, 1GB)
>> In linux/eal_vfio.c, we map it is as single VFIO DMA entry of 1 GB as we
>> don't know where this memory is coming from or backed by what.
>> #3 After a while call rte_vfio_container_dma_unmap(RTE_VFIO_DEFAULT_CONTAINER_FD, mem+4KB, mem+4KB, 4KB)
>>
>> Though rte_vfio_container_dma_unmap() supports #3 by splitting entry as shown below,
>> In VFIO type1 iommu, #3 cannot be supported by current kernel interface. So how
>> can we allow #3 ?
>>
>>
>> static int
>> container_dma_unmap(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
>> uint64_t len)
>> {
>> struct user_mem_map *map, *new_map = NULL;
>> struct user_mem_maps *user_mem_maps;
>> int ret = 0;
>>
>> user_mem_maps = &vfio_cfg->mem_maps;
>> rte_spinlock_recursive_lock(&user_mem_maps->lock);
>>
>> /* find our mapping */
>> map = find_user_mem_map(user_mem_maps, vaddr, iova, len);
>> if (!map) {
>> RTE_LOG(ERR, EAL, "Couldn't find previously mapped region\n");
>> rte_errno = EINVAL;
>> ret = -1;
>> goto out;
>> }
>> if (map->addr != vaddr || map->iova != iova || map->len != len) {
>> /* we're partially unmapping a previously mapped region, so we
>> * need to split entry into two.
>> */
Hi,
Apologies, i was on vacation.
Yes, I can see the problem now. Does VFIO even support non-system page
sizes? Like, if i allocated a 1GB page, would i be able to map *this
page* for DMA, as opposed to first 4K of this page? I suspect that the
mapping doesn't support page sizes other than the system page size.
--
Thanks,
Anatoly
next prev parent reply other threads:[~2020-10-28 13:04 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20201012081106.10610-1-ndabilpuram@marvell.com>
2020-10-12 8:11 ` [dpdk-stable] " Nithin Dabilpuram
2020-10-14 15:07 ` Burakov, Anatoly
2020-10-15 6:09 ` [dpdk-stable] [EXT] " Nithin Dabilpuram
2020-10-15 10:00 ` Burakov, Anatoly
2020-10-15 11:38 ` Nithin Dabilpuram
2020-10-15 11:50 ` Nithin Dabilpuram
2020-10-15 11:57 ` [dpdk-stable] [dpdk-dev] " Nithin Dabilpuram
2020-10-15 15:10 ` Burakov, Anatoly
2020-10-16 7:10 ` Nithin Dabilpuram
2020-10-17 16:14 ` Burakov, Anatoly
2020-10-19 9:43 ` Nithin Dabilpuram
2020-10-22 12:13 ` Nithin Dabilpuram
2020-10-28 13:04 ` Burakov, Anatoly [this message]
2020-10-28 14:17 ` Nithin Dabilpuram
2020-10-28 16:07 ` Burakov, Anatoly
2020-10-28 16:31 ` Nithin Dabilpuram
[not found] ` <20201105090423.11954-1-ndabilpuram@marvell.com>
2020-11-05 9:04 ` [dpdk-stable] [PATCH v2 1/3] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2020-11-05 9:04 ` [dpdk-stable] [PATCH v2 2/3] vfio: fix DMA mapping granularity for type1 iova as va Nithin Dabilpuram
2020-11-10 14:04 ` Burakov, Anatoly
2020-11-10 14:22 ` [dpdk-stable] [dpdk-dev] " Burakov, Anatoly
2020-11-10 14:17 ` [dpdk-stable] " Burakov, Anatoly
2020-11-11 5:08 ` [dpdk-stable] [dpdk-dev] " Nithin Dabilpuram
2020-11-11 10:00 ` Burakov, Anatoly
[not found] ` <20201201193302.28131-1-ndabilpuram@marvell.com>
2020-12-01 19:32 ` [dpdk-stable] [PATCH v3 1/4] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2020-12-01 19:33 ` [dpdk-stable] [PATCH v3 2/4] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
[not found] ` <20201202054647.3449-1-ndabilpuram@marvell.com>
2020-12-02 5:46 ` [dpdk-stable] [PATCH v4 1/4] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2020-12-02 18:36 ` David Christensen
2020-12-02 5:46 ` [dpdk-stable] [PATCH v4 2/4] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
2020-12-02 18:38 ` David Christensen
[not found] ` <20201214081935.23577-1-ndabilpuram@marvell.com>
2020-12-14 8:19 ` [dpdk-stable] [PATCH v5 1/4] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2020-12-14 8:19 ` [dpdk-stable] [PATCH v5 2/4] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
[not found] ` <20201217190604.29803-1-ndabilpuram@marvell.com>
2020-12-17 19:06 ` [dpdk-stable] [PATCH v6 1/4] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2020-12-17 19:06 ` [dpdk-stable] [PATCH v6 2/4] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
[not found] ` <20210112173923.30320-1-ndabilpuram@marvell.com>
2021-01-12 17:39 ` [dpdk-stable] [PATCH v7 1/3] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2021-01-12 17:39 ` [dpdk-stable] [PATCH v7 2/3] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
[not found] ` <20210115073243.7025-1-ndabilpuram@marvell.com>
2021-01-15 7:32 ` [dpdk-stable] [PATCH v8 1/3] vfio: revert changes for map contiguous areas in one go Nithin Dabilpuram
2021-03-05 7:50 ` David Marchand
2021-03-05 13:54 ` Burakov, Anatoly
2021-03-05 15:50 ` [dpdk-stable] [dpdk-dev] " Nithin Dabilpuram
2021-04-01 11:27 ` Burakov, Anatoly
2021-01-15 7:32 ` [dpdk-stable] [PATCH v8 2/3] vfio: fix DMA mapping granularity for type1 IOVA as VA Nithin Dabilpuram
2021-01-15 7:32 ` [dpdk-stable] [PATCH v8 3/3] test: change external memory test to use system page sz Nithin Dabilpuram
2021-02-11 11:21 ` Burakov, Anatoly
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8cd0fd55-1af6-543d-50c1-80c5f4883550@intel.com \
--to=anatoly.burakov@intel.com \
--cc=dev@dpdk.org \
--cc=jerinj@marvell.com \
--cc=nithind1988@gmail.com \
--cc=stable@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).