From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Burakov, Anatoly"
To: Rajesh Ravi
Cc: Ajit Khaparde, dev@dpdk.org, Jonathan Richardson, Scott Branden,
 Vikram Mysore Prakash, Srinath Mannam
Subject: Re: [dpdk-dev] [PATCH] eal: add option --iso-cmem for external custom memory
Date: Tue, 5 Nov 2019 11:41:10 +0000
Message-ID: <3a954a3f-1fc5-47f0-64f2-a7c8e3fbf964@intel.com>
In-Reply-To: <32f3cd1a-08bb-e4bc-c22c-53453b936dd3@intel.com>
References: <20191015053047.52260-1-ajit.khaparde@broadcom.com>
 <83009bb3-1e0c-a22e-eff8-41a437817cb7@intel.com>
 <64edebee-3686-beca-2b30-c6ec1f26c162@intel.com>
 <31bff5b7-169c-e158-3a87-6448272c571c@intel.com>
 <32f3cd1a-08bb-e4bc-c22c-53453b936dd3@intel.com>
List-Id: DPDK patches and discussions

On 04-Nov-19 10:25 AM, Burakov, Anatoly wrote:
> On 30-Oct-19 7:50 PM, Rajesh Ravi wrote:
>> Thanks Anatoly.
>> Please find inline below:
>>
>> [Anatoly] vfio_mem_event_callback() is called every time memory is
>> added to a heap. That includes internal and external memory
>>
>> [Rajesh] malloc_heap_add_external_memory() does call
>> eal_memalloc_mem_event_notify [ instead of vfio_mem_event_callback() ]
>> But, no callback function is getting called from inside
>> eal_memalloc_mem_event_notify()
>> execution flow is not entering inside following loop:
>>
>>     TAILQ_FOREACH(entry, &mem_event_callback_list, next) {
>>         RTE_LOG(DEBUG, EAL, "Calling mem event callback '%s:%p'\n",
>>             entry->name, entry->arg);
>>         entry->clb(event, start, len, entry->arg);
>>     }
>>
>> Do you mean to say, we are supposed to explicitly register a callback
>> which separately builds iommu tables in addition to calling
>> rte_malloc_heap_memory_add() API?
>
> Hi,
>
> No, the callback in VFIO should be registered automatically [1] at EAL
> initialization (or, more precisely, when default container is
> initialized). Does that not happen in your case?
>
> [1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/linux/eal/eal_vfio.c#n791
>

Hi Rajesh,

I think I figured it out. It is a defect in the design of how external
memory heaps are handled.

When VFIO initializes, it finds the first VFIO-bound device, initializes
the container, and sets up DMA mappings. After that, you can add more
memory by creating custom memory regions, either without adding them to
a heap (mmap() + rte_extmem_register() + rte_dev_dma_map()) or by adding
them to a heap (mmap() + rte_malloc_heap_memory_add()).

The problem is that memory registered through rte_dev_dma_map() gets
added to a list of user maps, while heap memory does not. The assumption
is that the DMA mapping for heap memory will happen through the
callback, but no record is left anywhere that this memory is supposed
to be mapped.

As a result, if there are no VFIO-bound devices at startup, you then
create a heap, and only *then* hotplug a device, the heap will not be
mapped. As you have correctly pointed out, type1_map() skips it, and it
is not present in the list of user mem maps either, because it is heap
memory, which EAL is supposed to handle by itself rather than through
the user map list.

There could be two fixes here. The easiest one is to just add another
flag to the memseg list. That will work for 19.11, and that's what I
intend to do, since we're breaking ABI anyway. For older releases, a
different approach would be required (I think scanning heaps is the
best we can do here) in order to keep the ABI and not introduce new
stuff into rte_memseg_list.

I'll submit a patch shortly; it would be great if you could test it.

-- 
Thanks,
Anatoly