From: "Burakov, Anatoly"
To: Alejandro Lucero
Cc: dev, stable@dpdk.org
Subject: Re: [dpdk-dev] [RFC] Add support for device dma mask
Date: Thu, 28 Jun 2018 11:03:55 +0100
Message-ID: <04086169-4102-2b23-c9c5-4be1784ef7c3@intel.com>
References: <1530034653-28299-1-git-send-email-alejandro.lucero@netronome.com>
 <552f939e-f28e-0648-1796-8f42269887a2@intel.com>
 <03046f23-2466-cbb7-ae2b-f2770d5c6b0f@intel.com>
 <35c86511-7bf7-4840-d7ba-8362ddefc8ec@intel.com>
List-Id: DPDK patches and discussions

On 28-Jun-18 10:56 AM, Alejandro Lucero wrote:
>
> On Thu, Jun 28, 2018 at 9:54 AM, Burakov, Anatoly wrote:
>
>     On 27-Jun-18 5:52 PM, Alejandro Lucero wrote:
>
>         On Wed, Jun 27, 2018 at 2:24 PM, Burakov, Anatoly wrote:
>
>             On 27-Jun-18 11:13 AM, Alejandro Lucero wrote:
>
>                 On Wed, Jun 27, 2018 at 9:17 AM, Burakov, Anatoly
>                 wrote:
>
>                     On 26-Jun-18 6:37 PM, Alejandro Lucero wrote:
>
>                         This RFC tries to handle devices with addressing
>                         limitations. NFP devices 4000/6000 can just handle
>                         addresses with 40 bits implying problems for handling
>                         physical address when machines have more than 1TB of
>                         memory. But because how iovas are configured, which
>                         can be equivalent to physical addresses or based on
>                         virtual addresses, this can be a more likely problem.
>
>                         I tried to solve this some time ago:
>
>                         https://www.mail-archive.com/dev@dpdk.org/msg45214.html
>
>                         It was delayed because there was some changes in
>                         progress with EAL device handling, and, being honest,
>                         I completely forgot about this until now, when I have
>                         had to work on supporting NFP devices with DPDK and
>                         non-root users.
>
>                         I was working on a patch for being applied on main
>                         DPDK branch upstream, but because changes to memory
>                         initialization during the last months, this can not
>                         be backported to stable versions, at least the part
>                         where the hugepages iovas are checked.
>
>                         I realize stable versions only allow bug fixing, and
>                         this patchset could arguably not be considered as so.
>                         But without this, it could be, although unlikely, a
>                         DPDK used in a machine with more than 1TB, and then
>                         NFP using the wrong DMA host addresses.
>
>                         Although virtual addresses used as iovas are more
>                         dangerous, for DPDK versions before 18.05 this is not
>                         worse than with physical addresses, because iovas,
>                         when physical addresses are not available, are based
>                         on a starting address set to 0x0.
>
>                     You might want to look at the following patch:
>
>                     http://patches.dpdk.org/patch/37149/
>
>                     Since this patch, IOVA as VA mode uses VA addresses,
>                     and that has been backported to earlier releases. I
>                     don't think there's any case where we used zero-based
>                     addresses any more.
>
>                 But memsegs get the iova based on hugepages physaddr, and
>                 for VA mode that is based on 0x0 as starting point.
>
>                 And as far as I know, memsegs iovas are what end up being
>                 used for IOMMU mappings and what devices will use.
>
>             For when physaddrs are available, IOVA as PA mode assigns IOVA
>             addresses to PA, while IOVA as VA mode assigns IOVA addresses
>             to VA (both 18.05+ and pre-18.05 as per above patch, which was
>             applied to pre-18.05 stable releases).
>
>             When physaddrs aren't available, IOVA as VA mode assigns IOVA
>             addresses to VA, both 18.05+ and pre-18.05, as per above patch.
>
>         This is right.
>
>             If physaddrs aren't available and IOVA as PA mode is used, then
>             i as far as i can remember, even though technically memsegs get
>             their addresses set to 0x0 onwards, the actual addresses we get
>             in memzones etc. are RTE_BAD_IOVA.
>
>         This is not right.
>         Not sure if this was the intention, but if PA mode and physaddrs
>         not available, this code inside vfio_type1_dma_map:
>
>             if(rte_eal_iova_mode() == RTE_IOVA_VA)
>                 dma_map.iova = dma_map.vaddr;
>             else
>                 dma_map.iova = ms[i].iova;
>
>         does the IOMMU mapping using the iovas and not the vaddr, with the
>         iovas starting at 0x0.
>
>     Yep, you're right, apologies. I confused this with no-huge option.
>
> So, what do you think about the patchset? Could it be this applied to
> stable versions?
>
> I'll send a patch for current 18.05 code which will have the dma mask
> and the hugepage check, along with changes for doing the mmaps below the
> dma mask limit.

I've looked through the code, it looks OK to me (bar some things like
missing .map file additions and a gratuitous rte_panic :) ).

There was a patch/discussion not too long ago about DMA masks for some
IOMMU's - perhaps we can also extend this approach to that?

https://patches.dpdk.org/patch/33192/

>     --
>     Thanks,
>     Anatoly

--
Thanks,
Anatoly
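
For reference, a minimal sketch of the kind of check discussed in this
thread, assuming a 40-bit device DMA mask as described for NFP 4000/6000.
Both helper names (iova_fits_dma_mask, pick_iova) are hypothetical and are
not DPDK APIs; pick_iova is only a simplified stand-in for the branch quoted
from vfio_type1_dma_map, and the actual patchset may do this differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a DPDK API): returns 1 when every byte of
 * [iova, iova + len) is addressable with `maskbits` address bits. */
static int
iova_fits_dma_mask(uint64_t iova, uint64_t len, unsigned int maskbits)
{
	uint64_t limit;

	assert(maskbits > 0);
	if (maskbits >= 64)
		return 1; /* any 64-bit address is reachable */

	limit = UINT64_C(1) << maskbits; /* first address out of reach */
	/* Compare against limit - iova so that iova + len cannot overflow. */
	return iova < limit && len <= limit - iova;
}

/* Hypothetical, simplified stand-in for the quoted branch: in VA mode the
 * IOVA is the virtual address, otherwise it is the memseg IOVA. */
static uint64_t
pick_iova(int iova_is_va, uint64_t vaddr, uint64_t memseg_iova)
{
	return iova_is_va ? vaddr : memseg_iova;
}
```

With a 40-bit mask the limit is 2^40 bytes, i.e. the 1TB mentioned in the
RFC. An illustrative user-space virtual address such as 0x7f0000000000
needs 47 bits, so it fails this check, which is why VA-based iovas are the
more likely problem, while iovas starting at 0x0 pass it.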