From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Burakov, Anatoly"
Date: Mon, 7 Sep 2020 16:43:20 +0100
To: Vikas Gupta , dev@dpdk.org
Cc: Ajit Kumar Khaparde , Vikram Prakash
Subject: Re: [dpdk-dev] Issue with VFIO/IOMMU
On 07-Sep-20 2:31 PM, Vikas Gupta wrote:
> Hi Burakov,
>
> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Friday, September 04, 2020 7:20 PM
> To: Vikas Gupta ; dev@dpdk.org
> Cc: Ajit Kumar Khaparde ; Vikram Prakash
> Subject: Re: [dpdk-dev] Issue with VFIO/IOMMU
>
> On 03-Sep-20 12:09 PM, Vikas Gupta wrote:
> >> Hi,
> >>
> >> I observe an issue with the IOVA address returned by the API
> >> rte_memzone_reserve_aligned (flags = RTE_MEMZONE_IOVA_CONTIG), used for
> >> queue memory allocation. With high-level debugging, I notice that the
> >> IOVA address returned in mz->iova is not mapped by VFIO_IOMMU_MAP_DMA,
> >> so in turn an SMMU exception is seen.
>
> I'm not sure I follow.
>
> How did you determine that to be the case, given that, by your own
> admission below, the `vfio_type1_dma_mem_map` function is executed
> several times?
>
> [Vikas]:
> I'll refer to "map" and "unmap" as below when explaining through an
> example:
>
> map   = function vfio_type1_dma_mem_map called with argument do_map = 1
> unmap = function vfio_type1_dma_mem_map called with argument do_map = 0
>
> What I notice is that for some particular address received in mz->iova,
> the map function (vfio_type1_dma_mem_map, do_map = 1) was not called
> before rte_memzone_reserve_aligned successfully returned.

If the function wasn't called, that most likely means that the memory
region in question is still in use. This happens when, for example, your
memzone is shorter than one page and something else is already allocated
on that page (such as a subsequent/preceding call to rte_malloc).
Calling memzone reserve doesn't *necessarily* result in a call to IOVA
map - that only happens when the memory allocator determines that it
needs more pages to fulfill the request, and it's those pages that are
mapped for IOVA, not the memzone. Similarly, freeing a memzone doesn't
*necessarily* result in a call to VFIO unmap - the unmap will only
happen if the allocator determines that those pages can be freed. So,
not seeing a VFIO (un)map after a memzone reserve/free is not, in and of
itself, out of the ordinary, and is in fact expected in certain cases.

The mapping granularity is page-based, not memzone-based: the map/unmap
only happens when new *pages* are reserved or freed, and not every
memory (de)allocation triggers (de)allocation of new pages.

> Below is one of the sequences, to aid understanding.
>
> Let's say there is an address 'iova_fail' for which an exception is
> raised by the SMMU while dpdk-test runs with the Crypto PMD.
>
> When dpdk-test is run with the Crypto test suite, I see that for the
> address iova_fail, vfio_type1_dma_mem_map is called several times
> (do_map = 0/1, length = 2MB). I believe this happens due to calls to
> memory allocation/free for buffers/queues. The test runs fine as long
> as map is called before rte_memzone_reserve_aligned returns, and
> similarly unmap when the same memory is freed. But after several
> map/unmap cycles for iova_fail, map is NOT called before
> rte_memzone_reserve_aligned returns, even though iova_fail was
> previously unmapped. Since it's not mapped, the SMMU raises an
> exception.

If there is a case where a VFIO unmap erroneously happens (or doesn't
happen when it should), I would very much like to know, but given that
the length of the allocation/mapping is 2MB, this sounds exactly like
the use case I have described above - something else is holding onto
that memory, and repeated memzone reserve/free no longer causes a
map/unmap.
I would advise adding a custom mem event callback that simply prints out
any memory being added/removed, and seeing whether the pages are indeed
being allocated but not mapped. I would also advise checking the IOVA
address with which you get the exception, and whether it really is a
valid IOVA address *at the time of the exception* - by checking whether
the address belongs to one of the allocated memory segments (see the
memseg_walk or dump_physmem_layout functions).

Since you are running the test multiple times, a plausible alternative
explanation could be stale data from a previous run causing a DMA into
an address that was, at one point, valid, but no longer is.

> Please note the issue is not frequently visible and might only
> reproduce after pmd_crypto_autotest is run multiple times over
> dpdk-test.
>
> If you are not able to follow, I'll try to send the debug printfs for
> the test.
>
> Thanks,
> Vikas
>
> >> *Details for the setup*
> >>
> >> Platform: Armv8 (Broadcom Stingray)
> >> DPDK release: DPDK 20.08
> >> PMD patch:
> >> https://patches.dpdk.org/project/dpdk/list/?series=&submitter=1907&state=&q=&archive=&delegate=
> >>
> >> dpdk-test is launched using the below command:
> >>
> >> dpdk-test --vdev -w 0000:00:00.0 --iova-mode pa
> >>
> >> The test suite is launched from the dpdk-test application command
> >> prompt using the command 'cryptodev__autotest'.
> >>
> >> The issue is seen when several iterations of the above test suite
> >> are executed, which in turn makes multiple calls to
> >> rte_memzone_reserve_aligned, rte_mempool_create, rte_memzone_free
> >> and rte_mempool_free.
> >>
> >> The function *vfio_type1_dma_mem_map* with map/unmap events is
> >> executed several times during the test suite run.
> >>
> >> Any inputs would be helpful.
> >>
> >> Thanks,
> >> Vikas
>
> --
> Thanks,
> Anatoly

-- 
Thanks,
Anatoly