From: "Gooch, Stephen"
To: "BURAKOV, ANATOLY", "RICHARDSON, BRUCE", "dev@dpdk.org"
Date: Fri, 20 Jun 2014 14:36:30 +0000
Message-ID: <9205DC19ECCD944CA2FAC59508A772BABCF0F702@ALA-MBA.corp.ad.wrs.com>
Subject: Re: [dpdk-dev] mmap() hint address

Hello,

One item I should have included is that this device runs 32-bit Linux 2.6.27, quite old, and shares 4GB of RAM with a number of applications. We were able to find the issue: in the failure case, vDSO is mapped lower (toward [heap]) than normal. As a result, .rte_config was mapped into the pre-mapped pci uio resource virtual address range.
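For anyone trying to reproduce this, the underlying behaviour -- mmap() treating the requested address only as a hint that the kernel is free to ignore -- can be observed with a small probe. This is a hypothetical illustration, not DPDK code; probe_hint is an invented name:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical probe: try to map 'len' bytes at 'hint'.  Returns 1 if
 * the kernel honoured the hint, 0 if it placed the mapping elsewhere
 * (the condition that makes a secondary DPDK process panic), and -1 if
 * mmap() failed outright.  The mapping is released again before
 * returning, so this only observes behaviour. */
static int probe_hint(void *hint, size_t len)
{
    void *addr = mmap(hint, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (addr == MAP_FAILED)
        return -1;

    int honoured = (addr == hint);

    munmap(addr, len);
    return honoured;
}
```

Running such a probe on the hint addresses the secondary process uses would show directly whether the failing range is already occupied (here, by the relocated vDSO).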
The fix: (1) move the uio mmap() out of the narrow range at the bottom of the memory maps, and (2) create spacing between the uio maps and the rte_config mmap(). It works with all huge page settings tested.

- Stephen

-----Original Message-----
From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
Sent: Monday, June 16, 2014 1:00 AM
To: RICHARDSON, BRUCE; Gooch, Stephen; dev@dpdk.org
Subject: RE: mmap() hint address

Hi Bruce, Stephen,

> > Hello,
> >
> > I have seen a case where a secondary DPDK process tries to map a uio
> > resource, where mmap() is normally passed the corresponding virtual
> > address as a hint address. However, in some instances mmap() returns
> > a virtual address that is not the hint address, which results in
> > rte_panic() and the secondary process going defunct.
> >
> > This happens from time to time on an embedded device when nr_hugepages
> > is set to 128, but never when nr_hugepages is set to 256 on the same
> > device. My question is: if mmap() can find the correct memory regions
> > when nr_hugepages is set to 256, would it not require fewer resources
> > (and therefore be more likely to pass) at a lower value such as 128?
> >
> > Any ideas what would cause this mmap() behavior at a lower
> > nr_hugepages value?
> >
> > - Stephen
>
> Hi Stephen,
>
> That's a strange one!
> I don't know for definite why this is happening, but here is one
> possible theory. :-)
>
> It could be due to the size of the memory blocks that are getting
> mmapped. When you use 256 pages, the blocks of memory getting mapped
> may well be larger (depending on how fragmented in memory the 2MB
> pages are), and so may be getting mapped at a higher set of address
> ranges where there is more free memory. This set of address ranges is
> then free in the secondary process and it is similarly able to map
> the memory.
> With the 128 hugepages, you may be looking for smaller amounts of
> memory, and so the addresses get mapped in at a different spot in the
> virtual address space, one that may be more heavily used. Then when
> the secondary process tries to duplicate the mappings, it already has
> memory in that region in use and the mapping fails.
> In short - one theory is that having bigger blocks to map causes the
> memory to be mapped to a different location in memory, one which is
> free from conflicts in the secondary process.
>
> So, how to confirm or refute this, and generally debug this issue?
> Well, in general we would need to look at the messages printed out at
> startup in the primary process to see how big the blocks it tries to
> map are in each case, and where they end up in the virtual
> address space.

As I remember, the OVDK project has had vaguely similar issues (only they were trying to map hugepages into the space that QEMU had already occupied). This resulted in us adding a --base-virtaddr EAL command-line flag that specifies the start virtual address at which the primary process will begin mapping pages. I guess you can try that as well (just remember that it needs to be done in the primary process, because the secondary one just copies the mappings and succeeds or fails to do so).

Best regards,
Anatoly Burakov
DPDK SW Engineer
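The "copies the mappings and succeeds or fails" step Anatoly describes can be sketched roughly as follows. This is an illustrative simplification, not the actual EAL code, and attach_at is an invented name (the sketch uses an anonymous mapping where the real code maps a shared file or device resource):

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical sketch of what a secondary process must do: recreate a
 * mapping at exactly the virtual address the primary process used.
 * MAP_FIXED would silently clobber whatever is already there, so the
 * safer pattern is to pass the primary's address as a hint and verify
 * the kernel handed back the same address; on a mismatch we give up
 * (where the EAL calls rte_panic()). */
static void *attach_at(void *primary_va, size_t len)
{
    void *addr = mmap(primary_va, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (addr == MAP_FAILED)
        return NULL;

    if (addr != primary_va) {   /* something else occupies that range */
        munmap(addr, len);
        return NULL;
    }
    return addr;
}
```

This is also why --base-virtaddr only makes sense in the primary process: it moves the whole set of mappings to a range chosen to be free, and the secondary then either reproduces those exact addresses or fails.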