From: "Burakov, Anatoly"
To: Pavan Nikhilesh, keith.wiles@intel.com, jianfeng.tan@intel.com,
 andras.kovacs@ericsson.com, laszlo.vadkeri@ericsson.com,
 benjamin.walker@intel.com, bruce.richardson@intel.com, thomas@monjalon.net,
 konstantin.ananyev@intel.com, kuralamudhan.ramakrishnan@intel.com,
 louise.m.daly@intel.com, nelio.laranjeiro@6wind.com, yskoh@mellanox.com,
 pepperjo@japf.ch, jerin.jacob@caviumnetworks.com, hemant.agrawal@nxp.com,
 olivier.matz@6wind.com
Cc: dev@dpdk.org
Date: Fri, 9 Mar 2018 10:42:03 +0000
Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
In-Reply-To: <20180309091513.GA5781@ltp-pvn>
References: <20180308101805.GA9526@ltp-pvn> <20180308111337.GA11638@ltp-pvn>
 <20180308133612.GA16647@ltp-pvn>
 <57c18da9-7377-3c0b-4aa2-9b97ef206f4f@intel.com>
 <55a2a182-27d5-b59a-0993-5b988f041e98@intel.com>
 <20180309091513.GA5781@ltp-pvn>

On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
> On Thu, Mar 08, 2018 at 08:33:21PM +0000, Burakov, Anatoly wrote:
>> On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
>>> On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
>>>> On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
>>>>> Hi Anatoly,
>>>>>
>>>>> We are currently facing issues with running testpmd on thunderx
>>>>> platform.
>>>>> The issue seems to be with vfio
>>>>>
>>>>> EAL: Detected 24 lcore(s)
>>>>> EAL: Detected 1 NUMA nodes
>>>>> EAL: No free hugepages reported in hugepages-2048kB
>>>>> EAL: Multi-process socket /var/run/.rte_unix
>>>>> EAL: Probing VFIO support...
>>>>> EAL: VFIO support initialized
>>>>> EAL:   VFIO support not initialized
>>>>>
>>>>>
>>>>>
>>>>> EAL:   probe driver: 177d:a053 octeontx_fpavf
>>>>> EAL: PCI device 0001:01:00.1 on NUMA socket 0
>>>>> EAL:   probe driver: 177d:a034 net_thunderx
>>>>> EAL:   using IOMMU type 1 (Type 1)
>>>>> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
>>>>> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
>>>>> EAL: Requested device 0001:01:00.1 cannot be used
>>>>> EAL: PCI device 0001:01:00.2 on NUMA socket 0
>>>>>
>>>>> testpmd: No probed ethernet devices
>>>>> testpmd: create a new mbuf pool : n=251456,
>>>>> size=2176, socket=0
>>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> Done
>>>>>
>>>>>
>>>>> This is because rte_service_init() calls rte_calloc() before
>>>>> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is
>>>>> not set.
>>>>>
>>>>> Call stack:
>>>>> gdb) bt
>>>>> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152,
>>>>> len=536870912, do_map=1) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
>>>>> #1  0x00000000004fd974 in rte_vfio_dma_map
>>>>> (vaddr=281439006359552, iova=11274289152, len=536870912) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
>>>>> #2  0x00000000004fbe78 in vfio_mem_event_callback
>>>>> (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912)
>>>>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
>>>>> #3  0x00000000005070ac in eal_memalloc_notify
>>>>> (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912)
>>>>> at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
>>>>> #4  0x0000000000515c98 in try_expand_heap_primary
>>>>> (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0,
>>>>> flags=0, align=128, bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
>>>>> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c,
>>>>> pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
>>>>> #6  0x00000000005163a0 in alloc_more_mem_on_socket
>>>>> (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
>>>>> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90
>>>>> "rte_services", size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
>>>>> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90
>>>>> "rte_services", size=8192, socket_arg=-1, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
>>>>> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket_arg=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
>>>>> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
>>>>> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90
>>>>> "rte_services", size=8192, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
>>>>> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90
>>>>> "rte_services", num=64, size=128, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
>>>>> #13 0x0000000000518cec in rte_service_init () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
>>>>> #14 0x00000000004f55f4 in rte_eal_init (argc=3,
>>>>> argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
>>>>> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>>
>>>>> Also, I have tried running with --legacy-mem but I'm stuck in
>>>>> `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used`
>>>>> always return
>>>>> 0.
>>>>>
>>>>> HugePages_Total:      15
>>>>> HugePages_Free:       11
>>>>> HugePages_Rsvd:        0
>>>>> HugePages_Surp:        0
>>>>> Hugepagesize:     524288 kB
>>>>>
>>>>> Call Stack:
>>>>> (gdb) bt
>>>>> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
>>>>> #1  0x00000000005132a8 in rte_fbarray_find_next_used
>>>>> (arr=0xffffb7fb009c, start=0) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
>>>>> #2  0x000000000052d030 in pci_find_max_end_va () at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
>>>>> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary
>>>>> (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
>>>>> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
>>>>> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
>>>>> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20
>>>>> , dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
>>>>> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
>>>>> #8  0x0000000000531f68 in rte_pci_probe () at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
>>>>> #9  0x000000000050a140 in rte_bus_probe () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
>>>>> #10 0x00000000004f55f4 in rte_eal_init (argc=1,
>>>>> argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
>>>>> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>> Am I missing something here?
>>>>
>>>> I'll look into those, thanks!
>>>>
>>>> Btw, i've now set up a github repo with the patchset applied:
>>>>
>>>> https://github.com/anatolyburakov/dpdk
>>>>
>>>> I will be pushing quick fixes there before spinning new revisions,
>>>> so we can discover and fix bugs more rapidly. I'll fix compile
>>>> issues reported earlier, then i'll take a look at your issues. The
>>>> latter one seems like a typo, the former is probably a matter of
>>>> moving things around a bit.
>>>>
>>>> (also, pull requests welcome if you find it easier to fix things
>>>> yourself and submit patches against my tree!)
>>>>
>>>> Thanks for testing.
>>>>
>>>
>>> I've looked into the failures.
>>>
>>> The VFIO one is not actually a failure. It only prints out errors
>>> because rte_malloc is called before VFIO is initialized. However, once
>>> VFIO *is* initialized, all of that memory would be added to VFIO, so
>>> these error messages are harmless. Regardless, i've added a check to see
>>> if init is finished before printing out those errors, so they won't be
>>> printed out any more.
>>>
>>> Second one is a typo on my part that got lost in one of the rebases.
>>>
>>> I've pushed fixes for both into the github repo.
>>>
>>
>> Although i do wonder where do the DMA remapping errors come from. The error
>> message says "invalid argument", so that doesn't come from rte_service or
>> anything to do with rte_malloc - this is us not providing valid arguments to
>> VFIO. I'm not seeing these errors on my system. I'll check on others to be
>> sure.
>
> I have taken a look at the github tree the issues with VFIO are gone, Although
> compilation issues with dpaa/dpaa2 are still present due to their dependency on
> `rte_eal_get_physmem_layout`.

I've fixed the dpaa compile issue and pushed it to github. I've tried to
keep the semantics the same as before, but I can't compile-test (let
alone test-test) the changes, as I don't have access to a system with a
dpaa bus.
Also, you might want to know that the dpaa bus driver references
RTE_LIBRTE_DPAA_MAX_CRYPTODEV, which is only defined in
config/common_armv8a_linuxapp and is not present in the base config. Not
sure if that's an issue.

>
>>
>> --
>> Thanks,
>> Anatoly
>
> Thanks,
> Pavan
>

--
Thanks,
Anatoly
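
P.S. For anyone following the VFIO discussion quoted above, here is a rough
sketch of the ordering problem and of the "stay quiet until init is finished"
idea. It is not the code from the github tree and it does not use the real EAL
or VFIO API: every name in it (struct seg, dma_map, on_mem_alloc,
on_iommu_ready, MAX_SEGS) is made up for illustration, and the "mapping" is
just a flag. Memory allocated before the IOMMU type is known is only recorded;
once the IOMMU is ready, everything recorded so far gets mapped, and only
failures after that point are reported.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_SEGS 16 /* hypothetical limit, enough for the sketch */

/* One allocated memory segment and whether it has been DMA-mapped. */
struct seg {
    uint64_t vaddr;
    uint64_t len;
    bool mapped;
};

static struct seg segs[MAX_SEGS];
static size_t n_segs;
static bool iommu_ready;             /* stands in for "IOMMU type is set" */
static unsigned int n_errors_logged; /* counts reported mapping failures */

/* Stand-in for the real mapping call: rejects zero-length segments. */
static int dma_map(struct seg *s)
{
    if (s->len == 0)
        return -1;
    s->mapped = true;
    return 0;
}

/* Try to map a segment and report a failure if the mapping is refused. */
static void map_or_log(struct seg *s)
{
    if (dma_map(s) < 0) {
        fprintf(stderr, "DMA remapping failed for 0x%" PRIx64 "\n", s->vaddr);
        n_errors_logged++;
    }
}

/* Allocation event: always record the segment, map only if IOMMU is ready. */
static void on_mem_alloc(uint64_t vaddr, uint64_t len)
{
    assert(n_segs < MAX_SEGS);
    struct seg *s = &segs[n_segs++];
    *s = (struct seg){ .vaddr = vaddr, .len = len, .mapped = false };

    if (iommu_ready)
        map_or_log(s);
    /* else: too early to map, so stay quiet; the segment is mapped later */
}

/* Called once the IOMMU type is known: map everything recorded so far. */
static void on_iommu_ready(void)
{
    iommu_ready = true;
    for (size_t i = 0; i < n_segs; i++)
        if (!segs[i].mapped)
            map_or_log(&segs[i]);
}
```

With this shape, an early on_mem_alloc() (like the one triggered from
rte_service_init() in the first backtrace) produces no error output, and a
later on_iommu_ready() maps it. A mapping that is refused after that point is
still reported, which is the case the "Invalid argument" errors would fall
under.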