From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1])
  by dpdk.org (Postfix) with ESMTP id B98CA1BB44;
  Fri, 27 Oct 2017 17:16:20 +0200 (CEST)
Received: from pps.filterd (m0098404.ppops.net [127.0.0.1])
  by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v9RFG3B6122687;
  Fri, 27 Oct 2017 11:16:19 -0400
Received: from smtp.notes.na.collabserv.com (smtp.notes.na.collabserv.com [192.155.248.81])
  by mx0a-001b2d01.pphosted.com with ESMTP id 2dv4662wew-1
  (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
  Fri, 27 Oct 2017 11:16:19 -0400
Received: from localhost
  by smtp.notes.na.collabserv.com with smtp.notes.na.collabserv.com ESMTP;
  Fri, 27 Oct 2017 15:16:18 -0000
Received: from us1a3-smtp08.a3.dal06.isc4sb.com (10.146.103.57)
  by smtp.notes.na.collabserv.com (10.106.227.88) with smtp.notes.na.collabserv.com ESMTP;
  Fri, 27 Oct 2017 15:16:14 -0000
Received: from us1a3-mail173.a3.dal06.isc4sb.com ([10.146.71.126])
  by us1a3-smtp08.a3.dal06.isc4sb.com with ESMTP id 2017102715161322-731748;
  Fri, 27 Oct 2017 15:16:13 +0000
MIME-Version: 1.0
To: "Jonas Pfefferle1"
Cc: "Burakov, Anatoly", bruce.richardson@intel.com, chaozhu@linux.vnet.ibm.com, dev@dpdk.org
From: "Jonas Pfefferle1"
Date: Fri, 27 Oct 2017 17:16:12 +0200
References: <921d836f-87dc-b017-2186-e70905f61612@intel.com>
X-KeepSent: 34B92F11:92493F02-C12581C6:00539163; type=4; name=$KeepSent
X-Mailer: IBM Notes Release 9.0.1 October 14, 2013
X-LLNOutbound: False
X-Disclaimed: 6451
X-TNEFEvaluated: 1
x-cbid: 17102715-7093-0000-0000-000003B66C56
X-IBM-SpamModules-Scores: BY=0.292417; FL=0; FP=0; FZ=0; HX=0; KW=0; PH=0;
  SC=0.433748; ST=0; TS=0; UL=0; ISC=; MB=0.204469
X-IBM-SpamModules-Versions: BY=3.00007962; HX=3.00000241; KW=3.00000007;
  PH=3.00000004; SC=3.00000239; SDB=6.00937257; UDB=6.00472378; IPR=6.00717519;
  BA=6.00005660; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000;
  ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017744; XFM=3.00000015;
  UTC=2017-10-27 15:16:16
X-IBM-AV-DETECTION: SAVI=unsuspicious REMOTE=unsuspicious XFE=unused
X-IBM-AV-VERSION: SAVI=2017-10-27 07:36:42 - 6.00007521
x-cbparentid: 17102715-7094-0000-0000-00001EAA9F76
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, ,
  definitions=2017-10-27_08:, , signatures=0
X-Proofpoint-Spam-Reason: safe
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: Re: [dpdk-dev] Huge mapping secondary process linux
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
X-List-Received-Date: Fri, 27 Oct 2017 15:16:21 -0000

"dev" wrote on 10/27/2017 04:58:01 PM:

> From: "Jonas Pfefferle1"
> To: "Burakov, Anatoly"
> Cc: bruce.richardson@intel.com, chaozhu@linux.vnet.ibm.com, dev@dpdk.org
> Date: 10/27/2017 04:58 PM
> Subject: Re: [dpdk-dev] Huge mapping secondary process linux
> Sent by: "dev"
>
> "Burakov, Anatoly" wrote on 10/27/2017 04:44:52 PM:
>
> > From: "Burakov, Anatoly"
> > To: Jonas Pfefferle1
> > Cc: bruce.richardson@intel.com, chaozhu@linux.vnet.ibm.com, dev@dpdk.org
> > Date: 10/27/2017 04:45 PM
> > Subject: Re: [dpdk-dev] Huge mapping secondary process linux
> >
> > On 27-Oct-17 3:28 PM, Jonas Pfefferle1 wrote:
> > > "Burakov, Anatoly" wrote on 10/27/2017 04:06:44 PM:
> > >
> > > > From: "Burakov, Anatoly"
> > > > To: Jonas Pfefferle1, dev@dpdk.org
> > > > Cc: chaozhu@linux.vnet.ibm.com, bruce.richardson@intel.com
> > > > Date: 10/27/2017 04:06 PM
> > > > Subject: Re: [dpdk-dev] Huge mapping secondary process linux
> > > >
> > > > On 27-Oct-17 1:43 PM, Jonas Pfefferle1 wrote:
> > > > >
> > > > > Hi @all,
> > > > >
> > > > > I'm trying to make sense of the hugepage memory mappings
> > > > > in librte_eal/linuxapp/eal/eal_memory.c:
> > > > > * In rte_eal_hugepage_attach (line 1347) when we try to do a private
> > > > > mapping on /dev/zero (line 1393) why do we not use MAP_FIXED if we need the
> > > > > addresses to be identical with the primary process?
> > > > > * On POWER we have this weird business going on where we use MAP_HUGETLB
> > > > > because according to this commit:
> > > > >
> > > > > commit 284ae3e9ff9a92575c28c858efd2c85c8de6d440
> > > > > Author: Chao Zhu
> > > > > Date:   Thu Apr 6 15:36:09 2017 +0530
> > > > >
> > > > >     eal/ppc: fix mmap for memory initialization
> > > > >
> > > > >     On IBM POWER platform, when mapping /dev/zero file to hugepage memory
> > > > >     space, mmap will not respect the requested address hint.This will cause
> > > > >     the memory initialization for the second process fails. This patch adds
> > > > >     the required mmap flags to make it work. Beside this, users need to set
> > > > >     the nr_overcommit_hugepages to expand the VA range. When
> > > > >     doing the initialization, users need to set both nr_hugepages and
> > > > >     nr_overcommit_hugepages to the same value, like 64, 128, etc.
> > > > >
> > > > > mmap address hints are not respected. Looking at the mmap code in the
> > > > > kernel this is not true entirely however under some circumstances the hint
> > > > > can be ignored (
> > > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__elixir.free-2Delectrons.com_linux_latest_source_arch_powerpc_mm_mmap.c-23L103&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=rOdXhRsgn8Iur7bDE0vgwvo6TC8OpoDN-pXjigIjRW0&m=cttQcHlAYixhsYS3lz-BAdEeg4dpbwGdPnj2R3I8Do0&s=Gp0TIjUtIed05Jgb7XnlocpCYZdFXZXiH0LqIWiNMhA&e=
> > > > > ).
> > > > > However I believe we can remove the extra case for PPC if we use
> > > > > MAP_FIXED when doing the secondary process mappings because we need them to
> > > > > be identical anyway. We could also use MAP_FIXED when doing the primary
> > > > > process mappings resp. get_virtual_area if we want to have any guarantees
> > > > > when specifying a base address. Any thoughts?
> > > > >
> > > > > Thanks,
> > > > > Jonas
> > > > >
> > > > hi Jonas,
> > > >
> > > > MAP_FIXED is not used because it's dangerous, it unmaps anything that is
> > > > already mapped into that space. We would rather know that we can't map
> > > > something than unwittingly unmap something that was mapped before.
> > >
> > > Ok, I see. Maybe we can add a check to the primary process's memory
> > > mappings whether the hint has been respected or not? At least warn if it
> > > hasn't.
> >
> > Hi Jonas,
> >
> > I'm unfamiliar with POWER platform, so i'm afraid you'd have to explain
> > a bit more what you mean by "hint has been respected" :)
>
> Hi Anatoly,
>
> What I meant was the mmap address hint:
>
> "If addr is not NULL, then the kernel takes it as a hint
> about where to place the mapping; on Linux, the mapping will be
> created at a nearby page boundary."
>
> This is actually not true on POWER. It can happen that the address hint is
> ignored and you get any address back that fits your mapping.
>
> Thanks,
> Jonas

Actually, looking through the kernel code, this is also not guaranteed on
x86 (
http://elixir.free-electrons.com/linux/latest/source/arch/x86/kernel/sys_x86_64.c#L165
). So in any case the kernel can ignore the address hint and return any
address that fits the mapping. My suggestion is to check, when we do the
initial mapping in get_virtual_area, whether the hint was respected, i.e.
whether the returned address == PAGE_ALIGN(address_hint).

Thanks,
Jonas

> >
> > --
> > Thanks,
> > Anatoly