DPDK patches and discussions
From: "Takeshi T Yoshimura" <TYOS@jp.ibm.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: "Mo, YufengX" <yufengx.mo@intel.com>,
	dev@dpdk.org, "David Christensen" <drc@ibm.com>,
	"Pradeep Satyanarayana" <pradeep@us.ibm.com>
Subject: Re: [dpdk-dev] [PATCH] vfio: fix expanding DMA area in ppc64le
Date: Fri, 28 Jun 2019 11:38:56 +0000
Message-ID: <OFD546F28C.B5422A21-ON00258427.003A2A38-00258427.003FFD5F@notes.na.collabserv.com>
In-Reply-To: <4b10a275-10b4-9806-997f-7241a9e5cfed@intel.com>

>To: "Mo, YufengX" <yufengx.mo@intel.com>, dev@dpdk.org
>From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
>Date: 06/26/2019 06:43PM
>Cc: drc@ibm.com, pradeep@us.ibm.com, Takeshi Yoshimura
><tyos@jp.ibm.com>
>Subject: [EXTERNAL] Re: [dpdk-dev] [PATCH] vfio: fix expanding DMA
>area in ppc64le
>
>On 18-Jun-19 3:37 AM, Mo, YufengX wrote:
>> From: Takeshi Yoshimura <tyos@jp.ibm.com>
>> 
>> In ppc64le, expanding DMA areas always fail because we cannot remove
>> a DMA window. As a result, we cannot allocate more than one memseg
>> in ppc64le. This is because vfio_spapr_dma_mem_map() doesn't unmap
>> all the mapped DMA before removing the window. This patch fixes this
>> incorrect behavior.
>>
>> I added a global variable to track current window size since we do
>> not have better ways to get exact size of it than doing so. sPAPR
>> IOMMU seems not to provide any ways to get window size with ioctl
>> interfaces. rte_memseg_walk*() is currently used to calculate window
>> size, but it walks memsegs that are marked as used, not mapped. So,
>> we need to determine if a given memseg is mapped or not, otherwise
>> the ioctl reports errors due to attempting to unregister memory
>> addresses that are not registered. The global variable is excluded
>> in non-ppc64le binaries.
>>
>> Similar problems happen in user maps. We need to avoid attempting to
>> unmap the address that is given as the function's parameter. The
>> compaction of user maps prevents us from passing correct length for
>> unmapping DMA at the window recreation. So, I removed it in ppc64le.
>>
>> I also fixed the order of ioctl for unregister and unmap. The ioctl
>> for unregister sometimes report device busy errors due to the
>> existence of mapped area.
>> 
>> Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
>> ---
>
>OK there are three patches, and two v1's with two different authors in
>reply to the same original patch. There's too much going on here, i
>can't review this. Needs splitting.
>
>Also, #ifdef-ing out the map merging seems highly suspect.
>
>With regards to "walking used memsegs, not mapped", unless i'm
>misunderstanding something, these are the same - whenever a segment is
>mapped, it is marked as used, and whenever it is unmapped, it is marked
>as free. Could you please explain what is the difference and why is
>this needed?
>
>Is the point of contention here being the fact that whenever the unmap
>callback arrives, the segments still appear used when iterating over
>the map? If that's the case, then i think it would be OK to mark them
>as unused *before* triggering callbacks, and chances are some of this
>code wouldn't be needed. That would require a deprecation notice
>though, because the API behavior will change (even if this fact is not
>documented properly).
>
>-- 
>Thanks,
>Anatoly
>
>

I am the author of this patch. Please ignore the patch from YufengX Mo.

From my reading of the code, a memseg is first marked as used when it is allocated. The memseg is then passed to vfio_spapr_dma_mem_map(). That callback iterates over all used (i.e., allocated) memsegs and calls ioctl to map VA to IOVA. So when vfio_spapr_dma_mem_map() is called, the memsegs passed to it can be marked as used but not yet mapped. As a result, DMA window expansion attempts to unmap an area that was never mapped. That is the difference, and it is why this fix was needed.
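
To illustrate the used-versus-mapped distinction, here is a minimal toy model in C. It is only a sketch, not DPDK code: struct toy_memseg, toy_dma_map(), toy_dma_unmap(), unmap_all() and expand_window_failures() are hypothetical names made up for this example, and the unmap function merely stands in for the real ioctl by failing when the segment was never mapped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memseg; NOT the real struct rte_memseg. */
struct toy_memseg {
	bool used;   /* set when the segment is allocated */
	bool mapped; /* set only after the DMA map call has succeeded */
};

/* Stand-in for the unmap ioctl: fails if the segment was never mapped. */
static int toy_dma_unmap(struct toy_memseg *ms)
{
	if (!ms->mapped)
		return -1;
	ms->mapped = false;
	return 0;
}

/* Stand-in for the map ioctl: always succeeds in this toy model. */
static int toy_dma_map(struct toy_memseg *ms)
{
	ms->mapped = true;
	return 0;
}

/* Unmap every segment selected by the chosen flag; return failure count. */
static int unmap_all(struct toy_memseg *segs, size_t n, bool by_mapped_flag)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++) {
		bool selected = by_mapped_flag ? segs[i].mapped : segs[i].used;

		if (selected && toy_dma_unmap(&segs[i]) != 0)
			failures++;
	}
	return failures;
}

/*
 * Simulate a window expansion at the moment a second segment has been
 * allocated (used) but has not been mapped yet.
 */
static int expand_window_failures(bool by_mapped_flag)
{
	struct toy_memseg segs[2] = {
		{ .used = true, .mapped = true  }, /* mapped earlier */
		{ .used = true, .mapped = false }, /* allocated, not mapped */
	};
	int failures = unmap_all(segs, 2, by_mapped_flag);

	/* The window would be recreated here; then remap what is in use. */
	for (size_t i = 0; i < 2; i++)
		if (segs[i].used)
			toy_dma_map(&segs[i]);
	return failures;
}
```

In this model, expand_window_failures(false) selects segments by the used flag and hits one failed unmap, while expand_window_failures(true) selects by the mapped flag and hits none. That is the situation I tried to describe above.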

> i think it would be OK to mark them as unused *before* triggering callbacks

Yes, my first idea was the same as yours, but I was also worried that it might cause inconsistent API behavior, as you pointed out. If you think that change is acceptable, I can rewrite the patch without the ugly #ifdef.

Unfortunately, I will not have time to write code for the next two weeks, so I will resubmit the revised patch a few weeks from now.

Regards,
Takeshi


Thread overview: 16+ messages
2019-06-12  6:33 Takeshi Yoshimura
2019-06-12 14:06 ` Aaron Conole
2019-06-13  2:22 ` Takeshi Yoshimura
2019-06-13 17:37   ` David Christensen
2019-06-14  7:34   ` David Marchand
2019-06-14  7:49   ` [dpdk-dev] [PATCH v2] " Takeshi Yoshimura
2019-07-13  1:15     ` [dpdk-dev] [PATCH v3] " Takeshi Yoshimura
2019-07-16  0:20       ` David Christensen
2019-07-16 10:56         ` Thomas Monjalon
2019-06-18  2:37   ` [dpdk-dev] [PATCH] " Mo, YufengX
2019-06-18  2:39   ` Mo, YufengX
2019-06-26  9:43   ` Burakov, Anatoly
2019-06-28 11:38   ` Takeshi T Yoshimura [this message]
2019-06-28 13:47     ` Burakov, Anatoly
2019-06-28 14:04       ` Burakov, Anatoly
2019-06-13  2:30 ` Takeshi T Yoshimura
