To: Takeshi Yoshimura
Cc: dev@dpdk.org
References: <20180712030833.4887-1-t.yoshimura8869@gmail.com> <20180713101145.4795-1-t.yoshimura8869@gmail.com>
From: "Burakov, Anatoly"
Message-ID: <4c46da4e-ab67-bb76-b42a-25646c79cd99@intel.com>
Date: Fri, 13 Jul 2018 12:00:48 +0100
In-Reply-To: <20180713101145.4795-1-t.yoshimura8869@gmail.com>
Subject: Re: [dpdk-dev] [PATCH v3] vfio: fix workaround of BAR0 mapping
List-Id: DPDK patches and discussions

On 13-Jul-18 11:11 AM, Takeshi Yoshimura wrote:
> The workaround of BAR0 mapping gives up and immediately returns an
> error if it cannot map around the MSI-X. However, recent version
> of VFIO allows MSIX mapping (*).
>
> I fixed not to return immediately but try mapping. In old Linux, mmap
> just fails and returns the same error as the code before my fix.
> In recent Linux, mmap succeeds and this patch enables running DPDK in
> specific environments (e.g., ppc64le with HGST NVMe)
>
> (*): "vfio-pci: Allow mapping MSIX BAR",
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
> commit/id=a32295c612c57990d17fb0f41e7134394b2f35f6
>
> Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
>
> Signed-off-by: Takeshi Yoshimura
> ---
>
> Thanks, Anatoly.
>
> I updated the patch not to affect behaviors of older Linux and
> other environments as well as possible. This patch adds another
> chance to mmap BAR0.
>
> I noticed that the check at line 350 already includes the check
> of page size, so this patch does not fix the check.
>
> Regards,
> Takeshi

Hi Takeshi,

Please correct me if I'm wrong, but I'm not sure the old behavior is kept.

Let's say we're running an old kernel, which doesn't allow mapping MSI-X
BARs, and the MSI-X table starts at the beginning of the BAR (floor-aligned
to page size) and ends at or beyond the end of the BAR (ceiling-aligned to
page size). In that situation, the old code just skipped the BAR and
returned 0. We then exited the function, and there's a check of the return
value right after pci_vfio_mmap_bar() that stops us from continuing if we
fail to map something. So in the old code, we would carry on and finish the
rest of our mappings. With your new code, you're attempting to map the BAR,
it fails, and you will return -1 on older kernels.

I believe what we really need here is the following:

1) If this is a BAR containing the MSI-X table, first try mapping the
   entire BAR. If it succeeds, great - that would be your new kernel
   behavior.
2) If we failed on step 1), check to see if we can map around the MSI-X
   table. If we can, try to map around it like the current code does. If we
   cannot map around it (i.e. if the MSI-X table, page aligned, occupies
   the entire BAR), then we simply return 0 and skip the BAR.

That, I would think, would keep the old behavior and enable the new one.

Does that make sense?
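
To make the two steps above concrete, here is a rough, untested sketch of the decision logic as a small standalone C model. It is not the real pci_vfio_mmap_bar(): struct bar_model, enum map_result, decide_bar_mapping() and the whole_bar_mmap_ok flag (which stands in for "mmap() of the entire BAR succeeded on this kernel") are all hypothetical names made up for illustration, and real mmap() error handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a BAR and its MSI-X table; not DPDK's real structs. */
struct bar_model {
	uint64_t bar_size;    /* size of the BAR in bytes */
	bool has_msix;        /* does this BAR contain the MSI-X table? */
	uint64_t msix_offset; /* table offset within the BAR */
	uint64_t msix_size;   /* table size in bytes */
};

/* Hypothetical outcomes of the mapping decision. */
enum map_result {
	MAP_WHOLE,   /* the whole BAR gets mapped */
	MAP_AROUND,  /* only the pages outside the MSI-X table get mapped */
	MAP_SKIPPED, /* nothing mappable: skip the BAR, caller still sees 0 */
};

static uint64_t align_floor(uint64_t v, uint64_t page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return v & ~(page_size - 1);
}

static uint64_t align_ceil(uint64_t v, uint64_t page_size)
{
	return align_floor(v + page_size - 1, page_size);
}

/*
 * whole_bar_mmap_ok is an assumption of this model: it only matters for
 * BARs that contain the MSI-X table. Other BARs are assumed to map fine.
 */
enum map_result decide_bar_mapping(const struct bar_model *bar,
		uint64_t page_size, bool whole_bar_mmap_ok)
{
	/* Step 1: try the entire BAR first (new kernel behavior). */
	if (!bar->has_msix || whole_bar_mmap_ok)
		return MAP_WHOLE;

	/* Step 2: fall back to mapping around the page-aligned table. */
	uint64_t table_start = align_floor(bar->msix_offset, page_size);
	uint64_t table_end = align_ceil(bar->msix_offset + bar->msix_size,
			page_size);

	/* Table covers the whole BAR: skip it, as the old code did. */
	if (table_start == 0 && table_end >= bar->bar_size)
		return MAP_SKIPPED;

	return MAP_AROUND;
}
```

In this model, with a 4 KiB page size, a 4 KiB BAR whose first 2 KiB hold the table comes out as MAP_SKIPPED when the whole-BAR mmap fails and as MAP_WHOLE when it succeeds, which is the old-kernel versus new-kernel case described above.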

--
Thanks,
Anatoly