From mboxrd@z Thu Jan 1 00:00:00 1970
To: "Burakov, Anatoly"
Cc: "aik@ozlabs.ru", "dev@dpdk.org"
From: "Jonas Pfefferle1"
Date: Tue, 8 Aug 2017 13:01:43 +0200
References: <1502181667-17949-1-git-send-email-jpf@zurich.ibm.com>
Subject: Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
List-Id: DPDK patches and discussions

"Burakov, Anatoly" wrote on 08/08/2017 11:43:43 AM:

> From: "Burakov, Anatoly"
> To: Jonas Pfefferle1
> Cc: "aik@ozlabs.ru", "dev@dpdk.org"
> Date: 08/08/2017 11:43 AM
> Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
>
> > From: Jonas Pfefferle1 [mailto:JPF@zurich.ibm.com]
> > Sent: Tuesday, August 8, 2017 10:30 AM
> > To: Burakov, Anatoly
> > Cc: aik@ozlabs.ru; dev@dpdk.org
> > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> >
> > "Burakov, Anatoly" wrote on 08/08/2017
> > 11:15:24 AM:
> >
> > > From: "Burakov, Anatoly"
> > > To: Jonas Pfefferle
> > > Cc: "dev@dpdk.org", "aik@ozlabs.ru"
> > > Date: 08/08/2017 11:18 AM
> > > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> > >
> > > From: Jonas Pfefferle [mailto:jpf@zurich.ibm.com]
> > > > Sent: Tuesday, August 8, 2017 9:41 AM
> > > > To: Burakov, Anatoly
> > > > Cc: dev@dpdk.org; aik@ozlabs.ru; Jonas Pfefferle
> > > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> > > >
> > > > DMA window size needs to be big enough to span all memory segment's
> > > > physical addresses.
> > > > We do not need multiple levels of IOMMU tables
> > > > as we already span ~70TB of physical memory with 16MB hugepages.
> > > >
> > > > Signed-off-by: Jonas Pfefferle
> > > > ---
> > > > v2:
> > > > * roundup to next power 2 function without loop.
> > > >
> > > > v3:
> > > > * Replace roundup_next_pow2 with rte_align64pow2
> > > >
> > > >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++++++++++---
> > > >  1 file changed, 10 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > index 946df7e..550c41c 100644
> > > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > > >        return -1;
> > > >     }
> > > >
> > > > -   /* calculate window size based on number of hugepages configured
> > > > */
> > > > -   create.window_size = rte_eal_get_physmem_size();
> > > > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max
> > > > */
> > >
> > > Do we always expect that to be the case in the future? Maybe it
> > > would be safer to walk the memsegs list.
> > >
> > > Thanks,
> > > Anatoly
> >
> > I had this loop in before but removed it in favor of simplicity.
> > If we believe that the ordering is going to change in the future
> > I'm happy to bring back the loop. Is there other code which is
> > relying on the fact that the memsegs are sorted by their physical
> > addresses?
>
> I don't think there is. In any case, I think making assumptions
> about particulars of memseg organization is not a very good practice.
>
> I seem to recall us doing similar things in other places, so maybe
> down the line we could introduce a new API (or internal-only)
> function to get a memseg with min/max address. For now I think a
> loop will do.

Ok. Makes sense to me. Let me resubmit a new version with the loop.

>
> >
> > >
> > > > +   /* create DMA window from 0 to max(phys_addr + len) */
> > > > +   /* sPAPR requires window size to be a power of 2 */
> > > > +   create.window_size = rte_align64pow2(ms[0].phys_addr +
> > > > ms[0].len);
> > > >     create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > > > -   create.levels = 2;
> > > > +   create.levels = 1;
> > > >
> > > >     ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> > > > &create);
> > > >     if (ret) {
> > > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > > >        return -1;
> > > >     }
> > > >
> > > > +   if (create.start_addr != 0) {
> > > > +      RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > > > +      return -1;
> > > > +   }
> > > > +
> > > >     /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > > >     for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> > > >        struct vfio_iommu_type1_dma_map dma_map;
> > > > --
> > > > 2.7.4
> > > >
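
For reference, the loop discussed above could look roughly like the sketch below. It is not the resubmitted patch, only an illustration under assumptions: `struct memseg`, `align64pow2` and `dma_window_size` are hypothetical stand-ins (for `struct rte_memseg`, `rte_align64pow2` and the window size computation in `vfio_spapr_dma_map`), and treating `len == 0` as an unused slot is an assumption made for this example. The point is that the window end is taken as max(phys_addr + len) over all segments, so no ordering of the memsegs is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_memseg; only the fields used here. */
struct memseg {
    uint64_t phys_addr; /* physical start address of the segment */
    uint64_t len;       /* length in bytes; 0 is assumed to mark an unused slot */
};

/* Round v up to the next power of two (v is returned unchanged if it
 * already is one). Stand-in for rte_align64pow2; returns 0 for v == 0. */
static uint64_t align64pow2(uint64_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v |= v >> 32;
    return v + 1;
}

/* Walk all segments and return a power-of-two window size that covers
 * the range from 0 to max(phys_addr + len), independent of segment order. */
static uint64_t dma_window_size(const struct memseg *ms, size_t n)
{
    uint64_t max_end = 0;

    assert(ms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ms[i].len == 0)
            continue; /* skip unused slots */
        uint64_t end = ms[i].phys_addr + ms[i].len;
        if (end > max_end)
            max_end = end;
    }
    return align64pow2(max_end);
}
```

With something like this, the patch line would become `create.window_size = dma_window_size(ms, RTE_MAX_MEMSEG);` instead of reading `ms[0]` directly.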