Date: Tue, 6 Jun 2017 12:13:08 +0200
From: Gaëtan Rivet
To: Bruce Richardson
Cc: santosh, thomas@monjalon.net, dev@dpdk.org,
 jerin.jacob@caviumnetworks.com, hemant.agrawal@nxp.com
Subject: Re: [dpdk-dev] [RFC] eal/memory: introducing an option to set iova as va
Message-ID: <20170606101308.GL18840@bidouze.vm.6wind.com>
In-Reply-To: <20170606095719.GA50888@bricha3-MOBL3.ger.corp.intel.com>
References: <20170524161101.22863-1-santosh.shukla@caviumnetworks.com>
 <79f1e8ae-d2cc-e49e-a17d-73c7185b26f8@caviumnetworks.com>
 <20170602092735.GA51388@bricha3-MOBL3.ger.corp.intel.com>
 <20170606095719.GA50888@bricha3-MOBL3.ger.corp.intel.com>
List-Id: DPDK patches and discussions

On Tue, Jun 06, 2017 at 10:57:20AM +0100, Bruce Richardson wrote:
> On Mon, Jun 05, 2017 at 10:24:11AM +0530, santosh wrote:
> > Hi Bruce,
> >
> >
> > On Friday 02 June 2017 02:57 PM, Bruce Richardson wrote:
> > > On Fri, Jun 02, 2017 at 09:54:46AM +0530, santosh wrote:
> > >> Ping?
> > >>
> > >> On Wednesday 24 May 2017 09:41 PM, Santosh Shukla wrote:
> > >>
> > >>> Some NPU hardware like OCTEONTX follows push model to get
> > >>> the packet from the pktio device. Where packet allocation
> > >>> and freeing done by the HW. Since HW can operate only on
> > >>> IOVA with help of SMMU/IOMMU, When packet receives from the
> > >>> Ethernet device, It is the IOVA address(which is PA in existing scheme).
> > >>>
> > >>> Mapping IOVA as PA is expensive on those HW, where every
> > >>> packet needs to be converted to VA from PA/IOVA.
> > >>>
> > >>> This patch proposes the scheme where the user can set IOVA
> > >>> as VA by using an eal command line argument. That helps to
> > >>> avoid costly lookup for VA in SW by leveraging the SMMU
> > >>> translation feature.
> > >>>
> > >>> Signed-off-by: Santosh Shukla
> > >>> ---
> > > Hi,
> > >
> > > I agree this is a problem that needs to be solved, but this doesn't look
> > > like a particularly future-proofed solution. Given that we should
> > > use the IOMMU on as many platforms as possible for protection, we
> > > probably need to find an automatic way for DPDK to use IO addresses
> > > correctly. Is this therefore better done as part of the VFIO and
> > > UIO-specific code in EAL - as that is the part that knows how the memory
> > > mapping is done, and in the VFIO case, what address ranges were
> > > programmed in. The mempool driver was something else I considered but it
> > > is probably too high a level to implement this.
> >
> > The other approach which we evaluated, Its detail:
> > 0) Introduce a new bus api whose job is to detect iommu capable devices on that
> > bus {/ are those devices bind to iommu capable driver or not?}. Let's call that
> > api rte_bus_chk_iommu_dev();
> >
> > 1) The scheme is like If _all_ the devices bind to iommu kdrv then return iova=va
> > 2) Otherwise switch to default mode i.e.. iova=pa.
> > 3) Based on rte_bus_chk_iommu_dev() return value,
> > accordingly program iova=va Or iova=pa in vfio_type1/spapr_map().
> >
> > 4) User from the command line can always override iova=va,
> > in case if he wants to default scheme( iova=pa mode). For that purpose - Introduce eal
> > option something like --iova-pa Or --override-iova Or --iova-default
> > or some better name.
> >
> > Proposed API snap:
> >
> > enum iova_mode {
> > 	iova_va,
> > 	iova_pa,
> > 	iova_unknown,
> > };
> >
> > /**
> >  * Look for iommu devices on that Bus.
> >  * And find out that those devices bind to iommu
> >  * capable driver example vfio.
> >  *
> >  *
> >  * @return
> >  *	On success return valid iova mode (iova_va or iova_pa)
> >  *	On failure return iova_unknown.
> >  */
> > typedef int (*rte_bus_chk_iommu_dev_t)(void);
> >
> >
> > By this approach,
> > - We can automatically detect iova is va or pa
> >   and then program accordingly.
> > - Also, the user can always switch to default iova mode.
> > - Drivers like dpaa2 can use this API to detect iova mode then
> >   program dma_map accordingly. Currently they are doing in ifdef-way.
> >
> > Comments? thoughts? Or if anyone has better proposal then, please
> > suggest.
> >
>
> That sounds a more complete solution. However, it's probably a lot of
> work to implement. :-)
>
> I also wonder if we want to simplify things a little and disallow
> mixed-mode operation i.e. all devices have to use UIO or all use VFIO?
> Would that help to allow simplification or other options. Having a whole
> new bus type seems strange for this. Can each bus just report whether
> its members require physical addresses. Then the EAL can manage a
> single flag to report whether we are using VA or PA?
>

Implementing this at the bus level requires either that all buses have
driver iterators, which are currently not exposed, or that all buses
actively report driver capabilities upon successful probing. The former
is a sizeable evolution, while the latter leads to duplicated code in
every bus->probe() implementation, which seems unsound.

I may be mistaken, but is this iova mode not currently limited to VFIO?
Should this API be made generic for all buses, or is it only relevant to
the PCI bus? If it can stay specific to the PCI bus, that should greatly
simplify the implementation.

-- 
Gaëtan Rivet
6WIND
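
For illustration only, here is a minimal sketch of the idea discussed in
the thread: each bus reports which address mode its devices need, and a
single process-wide flag is derived from those reports. Everything except
the iova_va/iova_pa/iova_unknown names quoted above is hypothetical
(struct bus, bus_get_iova_mode_t, select_iova_mode and the report_*
callbacks are not existing DPDK APIs). Treating iova_unknown as a
fallback to iova=pa is an assumption based on step 2 of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mode names taken from the proposal quoted in this thread. */
enum iova_mode {
	iova_va,
	iova_pa,
	iova_unknown,
};

/* Hypothetical per-bus callback: which mode do this bus's devices need? */
typedef enum iova_mode (*bus_get_iova_mode_t)(void);

/* Hypothetical, stripped-down bus descriptor (not struct rte_bus). */
struct bus {
	const char *name;
	bus_get_iova_mode_t get_iova_mode;
};

/*
 * Derive one process-wide mode from all bus reports.
 * iova_va is chosen only if every bus reports iova_va.
 * Any bus reporting iova_pa or iova_unknown, a bus without a callback,
 * an empty bus list, or a user override (force_pa) selects iova_pa.
 */
static enum iova_mode
select_iova_mode(const struct bus *buses, size_t nb_buses, bool force_pa)
{
	size_t i;

	if (force_pa || nb_buses == 0)
		return iova_pa;

	for (i = 0; i < nb_buses; i++) {
		if (buses[i].get_iova_mode == NULL ||
		    buses[i].get_iova_mode() != iova_va)
			return iova_pa;
	}
	return iova_va;
}

/* Example callbacks standing in for real bus-specific detection. */
static enum iova_mode report_va(void) { return iova_va; }
static enum iova_mode report_pa(void) { return iova_pa; }
static enum iova_mode report_unknown(void) { return iova_unknown; }
```

With this shape, the EAL would call select_iova_mode() once after the
buses have been scanned, and the force_pa argument would carry the
command-line override mentioned in step 4. If the mode turns out to be
relevant to the PCI bus only, the array collapses to a single entry and
no generic bus callback is needed.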