patches for DPDK stable branches
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Nithin Dabilpuram <nithind1988@gmail.com>
Cc: Jerin Jacob <jerinj@marvell.com>, dev@dpdk.org, stable@dpdk.org
Subject: Re: [dpdk-stable] [dpdk-dev] [EXT] Re: [PATCH 2/2] vfio: fix partial DMA unmapping for VFIO type1
Date: Sat, 17 Oct 2020 17:14:55 +0100
Message-ID: <4deaf00f-02d3-15b3-2ebe-4a2becc89251@intel.com> (raw)
In-Reply-To: <20201016071015.GA22749@gmail.com>

On 16-Oct-20 8:10 AM, Nithin Dabilpuram wrote:
> On Thu, Oct 15, 2020 at 04:10:31PM +0100, Burakov, Anatoly wrote:
>> On 15-Oct-20 12:57 PM, Nithin Dabilpuram wrote:
>>> On Thu, Oct 15, 2020 at 3:31 PM Burakov, Anatoly
>>> <anatoly.burakov@intel.com> wrote:
>>>>
>>>> On 15-Oct-20 7:09 AM, Nithin Dabilpuram wrote:
>>>>> On Wed, Oct 14, 2020 at 04:07:10PM +0100, Burakov, Anatoly wrote:
>>>>>> On 12-Oct-20 9:11 AM, Nithin Dabilpuram wrote:
>>>>>>> Partial unmapping is not supported for VFIO IOMMU type1
>>>>>>> by the kernel. Though the kernel returns zero, the unmapped
>>>>>>> size reported will not be the same as requested, so check the
>>>>>>> returned unmap size and return an error.
>>>>>>>
>>>>>>> For the case of DMA map/unmap triggered by heap allocations,
>>>>>>> maintain the granularity of the memseg page size so that heap
>>>>>>> expansion and contraction do not hit this issue.
>>>>>>
>>>>>> This is quite unfortunate, because there was a different bug that had to
>>>>>> do with the kernel having a very limited number of mappings available [1],
>>>>>> as a result of which the page concatenation code was added.
>>>>>>
>>>>>> It should therefore be documented that the dma_entry_limit kernel module
>>>>>> parameter should be adjusted if the user runs out of DMA entries.
>>>>>>
>>>>>> [1] https://lore.kernel.org/lkml/155414977872.12780.13728555131525362206.stgit@gimli.home/T/
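
As a side note, dma_entry_limit can also be inspected at runtime. Below is a
minimal sketch (a hypothetical helper, not part of DPDK or of this patch) that
reads the sysfs parameter, assuming a kernel that includes the change
referenced in [1] and therefore exposes it:

	#include <stdio.h>
	#include <inttypes.h>

	/* Hypothetical helper: read the current vfio_iommu_type1
	 * dma_entry_limit so an application can warn the user before
	 * exhausting its DMA entries. Assumes the module is loaded and
	 * exposes the parameter via sysfs.
	 */
	static int
	read_dma_entry_limit(uint32_t *limit)
	{
		FILE *f = fopen(
			"/sys/module/vfio_iommu_type1/parameters/dma_entry_limit",
			"r");

		if (f == NULL)
			return -1; /* module not loaded or parameter absent */
		if (fscanf(f, "%" SCNu32, limit) != 1) {
			fclose(f);
			return -1;
		}
		fclose(f);
		return 0;
	}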
>>>>
>>>> <snip>
>>>>
>>>>>>>                       RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
>>>>>>>                                       errno, strerror(errno));
>>>>>>>                       return -1;
>>>>>>> +           } else if (dma_unmap.size != len) {
>>>>>>> +                   RTE_LOG(ERR, EAL, "  unexpected size %"PRIu64" of DMA "
>>>>>>> +                           "remapping cleared instead of %"PRIu64"\n",
>>>>>>> +                           (uint64_t)dma_unmap.size, len);
>>>>>>> +                   rte_errno = EIO;
>>>>>>> +                   return -1;
>>>>>>>               }
>>>>>>>       }
>>>>>>> @@ -1853,6 +1869,12 @@ container_dma_unmap(struct vfio_config *vfio_cfg, uint64_t vaddr, uint64_t iova,
>>>>>>>               /* we're partially unmapping a previously mapped region, so we
>>>>>>>                * need to split entry into two.
>>>>>>>                */
>>>>>>> +           if (!vfio_cfg->vfio_iommu_type->partial_unmap) {
>>>>>>> +                   RTE_LOG(DEBUG, EAL, "DMA partial unmap unsupported\n");
>>>>>>> +                   rte_errno = ENOTSUP;
>>>>>>> +                   ret = -1;
>>>>>>> +                   goto out;
>>>>>>> +           }
>>>>>>
>>>>>> How would we ever arrive here if we never map more than one page's worth
>>>>>> of memory at a time anyway? I don't think this is needed.
>>>>>
>>>>> container_dma_unmap() is called by the user via rte_vfio_container_dma_unmap(),
>>>>> and when the user maps memory we don't split it, as we don't know anything
>>>>> about that memory. So if the user maps multiple pages and tries to unmap
>>>>> them partially, we should fail.
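
To illustrate the failure mode, here is a minimal sketch (hypothetical code,
not from the patch; 'va' and 'iova' are assumed to describe valid,
already-registered memory) of a user mapping two pages and then attempting a
partial unmap:

	#include <rte_vfio.h>
	#include <rte_errno.h>

	/* Map two contiguous 4K pages in a single user mapping, then try
	 * to unmap only the first one. VFIO type1 cannot split an existing
	 * mapping, so with this patch the partial unmap fails and rte_errno
	 * is set to ENOTSUP.
	 */
	static int
	partial_unmap_demo(int container_fd, uint64_t va, uint64_t iova)
	{
		const uint64_t pg_sz = 4096;

		if (rte_vfio_container_dma_map(container_fd, va, iova,
				2 * pg_sz) < 0)
			return -1;

		if (rte_vfio_container_dma_unmap(container_fd, va, iova,
				pg_sz) < 0)
			return -rte_errno; /* expected: -ENOTSUP */

		return 0; /* would mean the partial unmap was allowed */
	}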
>>>>
>>>> Should we map it at page granularity then, instead of adding this
>>>> discrepancy between EAL and user mappings? I.e. instead of adding a
>>>> workaround, how about we just do the same thing for user memory mappings?
>>>>
>>> For heap mappings, we map and unmap at hugepage granularity, as we will
>>> always maintain that.
>>>
>>> But here I think we don't know whether the user's allocation is a hugepage
>>> or a collection of system pages. The only thing we can do here is map it at
>>> system page granularity, which could waste entries if the user really is
>>> working with hugepages. Isn't it?
>>>
>>
>> Yeah, we do. The API mandates page granularity and checks the supplied
>> addresses against the page size and the number of IOVA entries, so yes, we
>> do enforce that the IOVA addresses supplied by the user are page addresses.
> 
> Looking at rte_vfio_container_dma_map(), there is no mention of the hugepage
> size the user is providing, nor do we compute it. The user can call
> rte_vfio_container_dma_map() with a 1GB hugepage or a 4K system page.
> 
> Am I missing something?

Are you suggesting that a DMA mapping for hugepage-backed memory will be 
made at system page size granularity? E.g. would a 1GB page-backed segment 
be mapped for DMA as a series of contiguous 4K mappings?
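
(For scale: mapping a single 1GB hugepage at 4K granularity would consume
1 GiB / 4 KiB = 262,144 DMA entries, roughly four times the default
dma_entry_limit of 65,535, so the concern about wasting entries is a real one.)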

>>
>> -- 
>> Thanks,
>> Anatoly


-- 
Thanks,
Anatoly

