From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Date: Thu, 22 Mar 2018 09:24:34 +0000
To: Shreyansh Jain <shreyansh.jain@nxp.com>
Cc: dev@dpdk.org, Hemant Agrawal
Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK

On 22-Mar-18 5:09 AM, Shreyansh Jain wrote:
> Hello Anatoly,
>
>> -----Original Message-----
>> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
>> Sent: Wednesday, March 21, 2018 8:18 PM
>> To: Shreyansh Jain
>> Cc: dev@dpdk.org; Hemant Agrawal
>> Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
>>
>
> [...]
>
>>>
>>> While working on the issue reported in [1], I have found another
>>> issue for which I might need your help.
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2018-March/093202.html
>>>
>>> For [1], I worked around it for the time being by changing the
>>> mempool_add_elem code - it now allows non-contiguous allocations
>>> (those that did not explicitly demand contiguous memory) to go
>>> through rte_mempool_populate_iova. With that, I was able to get
>>> DPAA2 working.
>>>
>>> The problem is:
>>> 1. When I am working with 1GB pages, I/O works fine.
>>> 2. When using 2MB pages (1024 of them), initialization fails
>>>    somewhere after the VFIO layer.
>>>
>>> All of this is in IOVA=VA mode.
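>>>
>>> The local change is roughly along these lines (a hand-written
>>> sketch of the idea, not the actual diff - chunk_is_iova_contig and
>>> contig_requested stand in for whatever the real checks are called):
>>>
>>>     /* in the mempool populate path: reject a chunk whose IOVA
>>>      * range is non-contiguous only if the caller explicitly
>>>      * asked for contiguous memory; otherwise let it go through
>>>      * rte_mempool_populate_iova() as before */
>>>     if (!chunk_is_iova_contig && contig_requested)
>>>             return -ENOTSUP; /* previously rejected in both cases */
>>>     ret = rte_mempool_populate_iova(mp, vaddr, iova, len,
>>>                                     free_cb, opaque);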
>>>
>>> Some logs. This is the virtual memory layout requested by DPDK:
>>>
>>> --->8---
>>> EAL: Ask a virtual area of 0x2e000 bytes
>>> EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
>>> EAL: Setting up physically contiguous memory...
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
>>> --->8---
>>>
>>> Then, somehow, the VFIO mapping is able to find only a single page
>>> to map:
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
>>>
>>> Then, these logs appear, probably when the DPAA2 code requests
>>> memory. I am not sure why it repeats the same "...expanded by 10MB":
>>>
>>> --->8---
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 2MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> LPM or EM none selected, default LPM on
>>> Initializing port 0 ...
>>> --->8---
>>>
>>> l3fwd is stuck at this point. What I observe is that the DPAA2
>>> driver has gone ahead and registered the queues (queue_setup) with
>>> the hardware, and the memory has either overrun (a smaller than
>>> requested size was mapped) or the addresses are corrupt (that is,
>>> not DMA-able). (I get SMMU faults, indicating one of these cases.)
>>>
>>> There is a change from you in the fslmc/fslmc_vfio.c file
>>> (rte_fslmc_vfio_dmamap()). Ideally, that code should have walked
>>> over all the available pages for mapping, but that didn't happen
>>> and only a single virtual area got DMA-mapped:
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
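>>>
>>> For reference, what I expect from the dmamap code is one
>>> VFIO_IOMMU_MAP_DMA ioctl per hugepage segment, something like the
>>> sketch below (container_fd and the seg fields follow the usual
>>> EAL/VFIO naming; the fslmc container keeps its own fd):
>>>
>>>     #include <string.h>
>>>     #include <sys/ioctl.h>
>>>     #include <linux/vfio.h>
>>>
>>>     struct vfio_iommu_type1_dma_map dma_map;
>>>
>>>     /* this should run once per memseg, not just for the first */
>>>     memset(&dma_map, 0, sizeof(dma_map));
>>>     dma_map.argsz = sizeof(dma_map);
>>>     dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
>>>     dma_map.vaddr = (uint64_t)(uintptr_t)seg->addr;
>>>     dma_map.iova  = seg->iova; /* equal to vaddr in IOVA=VA mode */
>>>     dma_map.size  = seg->len;
>>>     if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &dma_map) < 0)
>>>             return -1; /* unmapped segments show up as SMMU faults */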
>>> >>> --->8--- >>> EAL: Device (dpci.1) abstracted from VFIO >>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000 >>> EAL: -----> DMA size 0x200000 >>> EAL: Total 1 segments found. >>> --->8--- >>> >>> I am looking into this but if there is some hint which come to your >>> mind, it might help. >>> >>> Regards, >>> Shreyansh >>> >> >> Hi Shreyansh, >> >> Thanks for the feedback. >> >> The "heap on socket 0 was expanded by 10MB" has to do with >> synchronization requests in primary/secondary processes. I can see >> you're allocating LPM tables - that's most likely what these allocations >> are about (it's hotplugging memory). > > I get that but why same message multiple times without any change in the expansion. Further, I don't have multiple process - in fact, I'm working with a single datapath thread. > Anyways, I will look through the code for this. > Hi Shreyansh, I've misspoke - this has nothing to do with multiprocess. The "request: mp_malloc_sync" does, but it's an attempt to notify other processes of the allocation - if there are no processes, nothing happens. However, multiple heap expansions do correspond to multiple allocations. If you allocate an LPM table that takes up 10M of hugepage memory - you expand heap by 10M. If you do it multiple times (e.g. per-NIC?), you do multiple heap expansions. This message will be triggered on every heap expansion. >> >> I think i might have an idea what is going on. I am assuming that you >> are starting up your DPDK application without any -m or --socket-mem >> flags, which means you are starting with empty heap. > > Yes, no specific --socket-mem passed as argument. > >> >> During initialization, certain DPDK features (such as service cores, >> PMD's) allocate memory. Most likely you have essentially started up with >> 1 2M page, which is what you see in fslmc logs: this page gets mapped >> for VFIO. > > Agree. > >> >> Then, you allocate a bunch of LPM tables, which trigger more memory >> allocation, and trigger memory allocation callbacks registered through >> rte_mem_event_register_callback(). One of these callbacks is a VFIO >> callback, which is registered in eal_vfio.c:rte_vfio_enable(). However, >> since fslmc bus has its own VFIO implementation that is independent of >> what happens in EAL VFIO code, what probably happens is that the fslmc >> bus misses the necessary messages from the memory hotplug to map >> additional resources for DMA. > > Makes sense > >> >> Try adding a rte_mem_event_register_callback() somewhere in fslmc init >> so that it calls necessary map function. >> eal_vfio.c:vfio_mem_event_callback() should provide a good template on >> how to approach creating such a callback. Let me know if this works! > > OK. I will give this a try and update you. > >> >> (as a side note, how can we extend VFIO to move this stuff back into EAL >> and expose it as an API?) > > The problem is that FSLMC VFIO driver is slightly different from generic VFIO layer in the sense that device in a VFIO container is actually another level of container. Anyways, I will have a look how much generalization is possible. Or else, I will work with the vfio_mem_event_callback() as suggested above. This can wait :) The callback is probably the proper way to do it right now. > > Thanks for suggestions. > >> >> -- >> Thanks, >> Anatoly -- Thanks, Anatoly