From: Bruce Richardson
To: santosh
Cc: thomas@monjalon.net, dev@dpdk.org, jerin.jacob@caviumnetworks.com, hemant.agrawal@nxp.com
Date: Tue, 6 Jun 2017 10:57:20 +0100
Subject: Re: [dpdk-dev] [RFC] eal/memory: introducing an option to set iova as va
Message-ID: <20170606095719.GA50888@bricha3-MOBL3.ger.corp.intel.com>
References: <20170524161101.22863-1-santosh.shukla@caviumnetworks.com> <79f1e8ae-d2cc-e49e-a17d-73c7185b26f8@caviumnetworks.com> <20170602092735.GA51388@bricha3-MOBL3.ger.corp.intel.com>
Organization: Intel Research and Development Ireland Ltd.

On Mon, Jun 05, 2017 at 10:24:11AM +0530, santosh wrote:
> Hi Bruce,
>
>
> On Friday 02 June 2017 02:57 PM, Bruce Richardson wrote:
> > On Fri, Jun 02, 2017 at 09:54:46AM +0530, santosh wrote:
> >> Ping?
> >>
> >> On Wednesday 24 May 2017 09:41 PM, Santosh Shukla wrote:
> >>
> >>> Some NPU hardware like OCTEONTX follows push model to get
> >>> the packet from the pktio device.
> >>> Packet allocation
> >>> and freeing are done by the HW. Since the HW can operate only on
> >>> IOVA with the help of the SMMU/IOMMU, when a packet is received from the
> >>> Ethernet device, its address is the IOVA (which is the PA in the existing scheme).
> >>>
> >>> Mapping IOVA as PA is expensive on such HW, because every
> >>> packet address needs to be converted from PA/IOVA to VA.
> >>>
> >>> This patch proposes a scheme where the user can set IOVA
> >>> as VA by using an EAL command line argument. That helps to
> >>> avoid a costly lookup for the VA in SW by leveraging the SMMU
> >>> translation feature.
> >>>
> >>> Signed-off-by: Santosh Shukla
> >>> ---
> > Hi,
> >
> > I agree this is a problem that needs to be solved, but this doesn't look
> > like a particularly future-proofed solution. Given that we should
> > use the IOMMU on as many platforms as possible for protection, we
> > probably need to find an automatic way for DPDK to use IO addresses
> > correctly. Is this therefore better done as part of the VFIO and
> > UIO-specific code in EAL - as that is the part that knows how the memory
> > mapping is done, and in the VFIO case, what address ranges were
> > programmed in? The mempool driver was something else I considered but it
> > is probably too high a level to implement this.
>
> The other approach which we evaluated, in detail:
> 0) Introduce a new bus API whose job is to detect IOMMU-capable devices on that
> bus (i.e. are those devices bound to an IOMMU-capable driver or not?). Let's call that
> API rte_bus_chk_iommu_dev();
>
> 1) The scheme is: if _all_ the devices are bound to an IOMMU kdrv, then return iova=va.
> 2) Otherwise switch to the default mode, i.e. iova=pa.
> 3) Based on the rte_bus_chk_iommu_dev() return value,
> program iova=va or iova=pa accordingly in vfio_type1/spapr_map().
>
> 4) The user can always override iova=va from the command line,
> in case they want the default scheme (iova=pa mode).
> For that purpose, introduce an EAL
> option, something like --iova-pa or --override-iova or --iova-default,
> or some better name.
>
> Proposed API snap:
>
> enum iova_mode {
>  iova_va,
>  iova_pa,
>  iova_unknown,
> };
>
> /**
>  * Look for IOMMU devices on that bus,
>  * and find out whether those devices are bound to an IOMMU-capable
>  * driver, for example vfio.
>  *
>  *
>  * @return
>  *  On success return a valid iova mode (iova_va or iova_pa).
>  *  On failure return iova_unknown.
>  */
> typedef int (*rte_bus_chk_iommu_dev_t)(void);
>
>
> By this approach,
> - We can automatically detect whether iova is va or pa
> and then program accordingly.
> - Also, the user can always switch to the default iova mode.
> - Drivers like dpaa2 can use this API to detect the iova mode and then
> program dma_map accordingly. Currently they do this with ifdefs.
>
> Comments? Thoughts? Or if anyone has a better proposal, please
> suggest it.
>

That sounds like a more complete solution. However, it's probably a lot of work to
implement. :-)
I also wonder if we want to simplify things a little and disallow
mixed-mode operation, i.e. all devices have to use UIO or all use VFIO?
Would that help to allow simplification or other options?

Having a whole new bus type seems strange for this. Can each bus just
report whether its members require physical addresses? Then the EAL can
manage a single flag to report whether we are using VA or PA.

/Bruce
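
To illustrate the last suggestion (each bus reports whether its members
require physical addresses, and the EAL keeps a single flag), here is a
minimal C sketch. Every name in it (struct bus, bus_requires_pa_t,
eal_select_iova_mode, the two sample callbacks, and the force_pa override
standing in for a command line option such as --iova-pa) is hypothetical
and is not existing DPDK API; only the enum values mirror the ones proposed
in the quoted mail. Treating a bus without a callback as needing PA is an
assumption made here to stay conservative, not something decided in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum proposed in the quoted mail. */
enum iova_mode {
	iova_va,
	iova_pa,
	iova_unknown,
};

/* Hypothetical per-bus callback: true if any device on the bus needs PA. */
typedef bool (*bus_requires_pa_t)(void);

/* Hypothetical, heavily reduced stand-in for a bus descriptor. */
struct bus {
	const char *name;
	bus_requires_pa_t requires_pa;
};

/* Sample callbacks, for illustration only. */
static bool sample_vfio_bus_requires_pa(void) { return false; }
static bool sample_uio_bus_requires_pa(void) { return true; }

/*
 * Reduce the per-bus answers to one mode for the whole process.
 * force_pa models the user override from the command line.
 */
static enum iova_mode
eal_select_iova_mode(const struct bus *buses, size_t n, bool force_pa)
{
	size_t i;

	assert(buses != NULL || n == 0);

	if (force_pa)
		return iova_pa;

	for (i = 0; i < n; i++) {
		/* Assumption: a bus that cannot answer is treated as needing PA. */
		if (buses[i].requires_pa == NULL || buses[i].requires_pa())
			return iova_pa;
	}
	return iova_va;
}
```

With a single bus whose callback returns false, eal_select_iova_mode()
returns iova_va. Adding a bus that needs PA, or setting force_pa, makes it
return iova_pa, which matches the "all devices or fall back to the default"
rule in the quoted scheme.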