From: Alejandro Lucero
Date: Tue, 28 Nov 2017 14:16:14 +0000
To: "Walker, Benjamin"
Cc: "thomas@monjalon.net", "Gonzalez Monroy, Sergio", "Burakov, Anatoly", "Tan, Jianfeng", "dev@dpdk.org"
Subject: Re: [dpdk-dev] Running DPDK as an unprivileged user

On Mon, Nov 27, 2017 at 5:58 PM, Walker, Benjamin wrote:
> On Sun, 2017-11-05 at 01:17 +0100, Thomas Monjalon wrote:
> > Hi, restarting an old topic,
> >
> > 05/01/2017 16:52, Tan, Jianfeng:
> > > On 1/5/2017 5:34 AM, Walker, Benjamin wrote:
> > > > > > Note that this probably means that using uio on recent kernels
> > > > > > is subtly broken and cannot be supported going forward, because
> > > > > > there is no uio mechanism to pin the memory.
> > > > > >
> > > > > > The first open question I have is whether DPDK should allow uio
> > > > > > at all on recent (4.x) kernels. My current understanding is
> > > > > > that there is no way to pin memory and hugepages can now be
> > > > > > moved around, so uio would be unsafe. What does the community
> > > > > > think here?
> > >
> > > Back to this question, removing uio support in DPDK seems a little
> > > overkill to me. Can we just document it down? Like, firstly, warn
> > > users not to invoke migrate_pages() or move_pages() on a DPDK
> > > process; as for the kcompactd daemon and some more cases (like
> > > compaction being triggered by alloc_pages()), could we just
> > > recommend disabling CONFIG_COMPACTION?
> >
> > We really need to better document the limitations of UIO.
> > May we have some suggestions here?
> >
> > > On another side, how does vfio pin that memory? Through memlock
> > > (from the code in vfio_pin_pages())? So why not just mlock those
> > > hugepages?
> >
> > Good question. Why not mlock the hugepages?
>
> mlock just guarantees that a virtual page is always backed by *some*
> physical page of memory. It does not guarantee that, over the lifetime
> of the process, a virtual page is mapped to the *same* physical page.
> The kernel is free to transparently move memory around, compress it,
> dedupe it, etc.
>
> vfio is not pinning the memory, but instead is using the IOMMU (a piece
> of hardware) to participate in the memory management on the platform.
> If a device begins a DMA transfer to an I/O virtual address, the IOMMU
> will coordinate with the main MMU to make sure that the data ends up in
> the correct location, even as the virtual-to-physical mappings are
> being modified.

This last comment confused me, because you said VFIO did the page
pinning in your first email. I have been looking at the kernel code, and
the VFIO driver does pin the pages, at least for IOMMU type 1.

I can see a problem with adding the same support to UIO, because that
implies a device doing DMA programmed from user space, which is
something the UIO maintainer is against. But since vfio-noiommu mode was
implemented precisely for this case, I guess the pinning could be added
to the VFIO driver. That still does not solve the problem of software
not using vfio, though.

Apart from improving the UIO documentation for use with DPDK, maybe some
sort of check could be done, with DPDK requiring an explicit parameter
so the user is made aware of the potential risk when UIO is used and
kernel page migration is enabled. I am not sure whether that last
condition can easily be detected from user space.

On another side, we suffered a similar problem when VMs were using
SR-IOV and memory ballooning.
The IOMMU mapping for the ballooned-out memory was removed, but the
kernel inside the VM never received any event, and the device ended up
performing wrong DMA operations.