From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 14A31A046B for ; Fri, 28 Jun 2019 13:39:15 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AAAC71E25; Fri, 28 Jun 2019 13:39:13 +0200 (CEST) Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by dpdk.org (Postfix) with ESMTP id D0C83F04 for ; Fri, 28 Jun 2019 13:39:11 +0200 (CEST) Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5SBcFRg049095 for ; Fri, 28 Jun 2019 07:39:11 -0400 Received: from smtp.notes.na.collabserv.com (smtp.notes.na.collabserv.com [192.155.248.81]) by mx0a-001b2d01.pphosted.com with ESMTP id 2tdgc5cf0m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 28 Jun 2019 07:39:11 -0400 Received: from localhost by smtp.notes.na.collabserv.com with smtp.notes.na.collabserv.com ESMTP for from ; Fri, 28 Jun 2019 11:39:10 -0000 Received: from us1a3-smtp03.a3.dal06.isc4sb.com (10.106.154.98) by smtp.notes.na.collabserv.com (10.106.227.88) with smtp.notes.na.collabserv.com ESMTP; Fri, 28 Jun 2019 11:38:56 -0000 Received: from us1a3-mail157.a3.dal06.isc4sb.com ([10.146.71.179]) by us1a3-smtp03.a3.dal06.isc4sb.com with ESMTP id 2019062811385637-374728 ; Fri, 28 Jun 2019 11:38:56 +0000 In-Reply-To: <4b10a275-10b4-9806-997f-7241a9e5cfed@intel.com> From: "Takeshi T Yoshimura" To: "Burakov, Anatoly" Cc: "Mo, YufengX" , dev@dpdk.org, "David Christensen" , "Pradeep Satyanarayana" Date: Fri, 28 Jun 2019 11:38:56 +0000 MIME-Version: 1.0 Sensitivity: Importance: Normal X-Priority: 3 (Normal) References: <4b10a275-10b4-9806-997f-7241a9e5cfed@intel.com>, <20190613022239.6946-1-tyos@jp.ibm.com> X-Mailer: IBM iNotes ($HaikuForm 1054) | IBM Domino Build SCN1812108_20180501T0841_FP55 May 22, 2019 at 11:09 X-LLNOutbound: False X-Disclaimed: 1019 X-TNEFEvaluated: 1 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 x-cbid: 19062811-7093-0000-0000-00000BE01FBA X-IBM-SpamModules-Scores: BY=0; FL=0; FP=0; FZ=0; HX=0; KW=0; PH=0; SC=0.40962; ST=0; TS=0; UL=0; ISC=; MB=0.194151 X-IBM-SpamModules-Versions: BY=3.00011346; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000286; SDB=6.01224453; UDB=6.00644459; IPR=6.01005646; BA=6.00006344; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00027506; XFM=3.00000015; UTC=2019-06-28 11:39:08 X-IBM-AV-DETECTION: SAVI=unsuspicious REMOTE=unsuspicious XFE=unused X-IBM-AV-VERSION: SAVI=2019-06-28 08:30:22 - 6.00010101 x-cbparentid: 19062811-7094-0000-0000-00008DD322D2 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2019-06-28_04:, , signatures=0 X-Proofpoint-Spam-Reason: safe Subject: Re: [dpdk-dev] [PATCH] vfio: fix expanding DMA area in ppc64le X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" >To: "Mo, YufengX" , dev@dpdk.org >From: "Burakov, Anatoly" >Date: 06/26/2019 06:43PM >Cc: drc@ibm.com, pradeep@us.ibm.com, Takeshi Yoshimura > >Subject: [EXTERNAL] Re: [dpdk-dev] [PATCH] vfio: fix expanding DMA >area in ppc64le > >On 18-Jun-19 3:37 AM, Mo, YufengX wrote: >> From: Takeshi Yoshimura >>=20 >> In ppc64le, expanding DMA areas always fail because we cannot >remove >> a DMA window. As a result, we cannot allocate more than one memseg >in >> ppc64le. This is because vfio=5Fspapr=5Fdma=5Fmem=5Fmap() doesn't unmap = all >> the mapped DMA before removing the window. This patch fixes this >> incorrect behavior. >>=20 >> I added a global variable to track current window size since we do >> not have better ways to get exact size of it than doing so. sPAPR >> IOMMU seems not to provide any ways to get window size with ioctl >> interfaces. rte=5Fmemseg=5Fwalk*() is currently used to calculate >window >> size, but it walks memsegs that are marked as used, not mapped. So, >> we need to determine if a given memseg is mapped or not, otherwise >> the ioctl reports errors due to attempting to unregister memory >> addresses that are not registered. The global variable is excluded >> in non-ppc64le binaries. >>=20 >> Similar problems happen in user maps. We need to avoid attempting >to >> unmap the address that is given as the function's parameter. The >> compaction of user maps prevents us from passing correct length for >> unmapping DMA at the window recreation. So, I removed it in >ppc64le. >>=20 >> I also fixed the order of ioctl for unregister and unmap. The ioctl >> for unregister sometimes report device busy errors due to the >> existence of mapped area. >>=20 >> Signed-off-by: Takeshi Yoshimura >> --- > >OK there are three patches, and two v1's with two different authors >in=20 >reply to the same original patch. There's too much going on here, i=20 >can't review this. Needs splitting. > >Also, #ifdef-ing out the map merging seems highly suspect. > >With regards to "walking used memsegs, not mapped", unless i'm=20 >misunderstanding something, these are the same - whenever a segment >is=20 >mapped, it is marked as used, and whenever it is unmapped, it is >marked=20 >as free. Could you please explain what is the difference and why is >this=20 >needed? > >Is the point of contention here being the fact that whenever the >unmap=20 >callback arrives, the segments still appear used when iterating over >the=20 >map? If that's the case, then i think it would be OK to mark them as=20 >unused *before* triggering callbacks, and chances are some of this >code=20 >wouldn't be needed. That would require a deprecation notice though,=20 >because the API behavior will change (even if this fact is not=20 >documented properly). > >--=20 >Thanks, >Anatoly > > I am the author of this patch. We should ignore a patch from YufengX Mo. >From my code reading, a memsg is at first marked as used when it is allocat= ed. Then, the memseg is passed to vfio=5Fspapr=5Fdma=5Fmem=5Fmap(). The cal= lback iterates all the used (i.e., allocated) memsegs and call ioctl for ma= pping VA to IOVA. So, when vfio=5Fspapr=5Fdma=5Fmem=5Fmap() is called, pass= ed memsegs can be non-mapped but marked as used. As a result, an attempt to= unmap non-mapped area happens during DMA window expansion. This is the dif= ference and why this fix was needed. > i think it would be OK to mark them as unused *before* triggering callbac= ks Yes, my first idea was the same as yours, but I was also worried that it mi= ght cause inconsistent API behavior as you also pointed out. If you think s= o, I think I can rewrite the patch without ugly #ifdef.=20 Unfortunately, I don't have enough time for writing code next week and next= next week. So, I will resubmit the revised patch weeks later. Regards, Takeshi