From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 7 Mar 2017 16:48:25 +0800
From: Yuanhan Liu
To: wang.yong19@zte.com.cn
Cc: dev@dpdk.org, "Tan, Jianfeng"
Subject: Re: [dpdk-dev] [vhost] segment fault when virtio_user port init
Message-ID: <20170307084825.GP18844@yliu-dev.sh.intel.com>
In-Reply-To: <201703061915159926040@zte.com.cn>
References: <201703061915159926040@zte.com.cn>
User-Agent: Mutt/1.5.23 (2014-03-12)

Cc Jianfeng, who added the doc and wrote the virtio-user code.

On Mon, Mar 06, 2017 at 07:15:15PM +0800, wang.yong19@zte.com.cn wrote:
> Following the description of "Virtio_user for Container Networking" in
> the "how-to guides", I ran this example in a VM created by qemu. I
> started a testpmd in the VM with a vhost-user port. The command is:
>
> $(testpmd) -c 0x3 -n 4 --socket-mem 1024,1024 \
>     --vdev 'eth_vhost0,iface=/tmp/sock0' --no-pci -- -i
>
> Then I started a container instance with a virtio-user port. The
> command is:
>
> docker run -i -t -v /tmp/sock0:/var/run/usvhost \
>     -v /dev/hugepages:/dev/hugepages \
>     dpdk-app-testpmd testpmd -c 0xc -n 4 -m 1024 --no-pci \
>     --vdev=virtio_user0,path=/var/run/usvhost \
>     -- -i --txqflags=0xf00 --disable-hw-vlan

Hmm, shouldn't you add the --file-prefix option to distinguish the
hugepage file names when starting two DPDK apps on the same host?

	--yliu

> Then a segmentation fault occurred in the VM's testpmd:
>
> testpmd> VHOST_CONFIG: new vhost user connection is 15
> Segmentation fault (core dumped)
>
> As a result, the container could not complete the initialization of the
> virtio_user port:
>
> EAL: failed to initialize virtio_user0 device
> PANIC in rte_eal_init():
> Cannot init pmd devices
> 6: [testpmd() [0x4497f9]]
> 5: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f41c9b19ec5]]
> 4: [testpmd(main+0x42) [0x448a32]]
> 3: [testpmd(rte_eal_init+0xde2) [0x49e322]]
> 2: [testpmd(__rte_panic+0xbe) [0x442a56]]
> 1: [testpmd(rte_dump_stack+0x1a) [0x4a57ba]]
>
> I enabled the CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT and
> CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER options and tried again. I got the
> following output in the container:
>
> PMD: vhost_user_sock(): VHOST_SET_OWNER
> PMD: vhost_user_sock(): VHOST_GET_FEATURES
> PMD: vhost_user_read(): Failed to recv msg hdr: -1 instead of 12.
> PMD: vhost_user_sock(): Received msg failed: Connection reset by peer
> PMD: virtio_user_dev_init(): get_features failed: Connection reset by peer
> PMD: virtio_user_pmd_probe(): virtio_user_dev_init fails
>
> According to the output above, I realized the problem was on the VM side.
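
To be concrete about the --file-prefix suggestion above: a minimal,
untested variant of your two commands would look like the following,
assuming the shared hugepage file names are indeed the problem (the
prefix values "vhost" and "virtio" are arbitrary, any two distinct
strings will do):

$(testpmd) -c 0x3 -n 4 --socket-mem 1024,1024 --file-prefix=vhost \
    --vdev 'eth_vhost0,iface=/tmp/sock0' --no-pci -- -i

docker run -i -t -v /tmp/sock0:/var/run/usvhost \
    -v /dev/hugepages:/dev/hugepages \
    dpdk-app-testpmd testpmd -c 0xc -n 4 -m 1024 --no-pci \
    --file-prefix=virtio \
    --vdev=virtio_user0,path=/var/run/usvhost \
    -- -i --txqflags=0xf00 --disable-hw-vlan

Note that --file-prefix is an EAL option, so it has to go before the
"--" separator.
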
> By adding some logs to the code, I found the segmentation fault
> occurred in the following call chain:
>
> vhost_user_server_new_connection() ->
> vhost_user_add_connection() -> vhost_new_device() -> rte_zmalloc() ->
> rte_zmalloc_socket() -> rte_malloc_socket() -> malloc_heap_alloc() ->
> malloc_elem_alloc() -> elem_free_list_remove() -> LIST_REMOVE()
>
> When a new vhost-user connection is established, it needs to malloc
> 528512 bytes (sizeof(struct virtio_net)) from the heap. In my
> environment, malloc_heap_alloc() found a suitable element in the heap,
> but when malloc_elem_alloc() called elem_free_list_remove(), a
> segmentation fault occurred in LIST_REMOVE().
>
> Would you please do me a favor and help resolve this problem?
>
> BTW, the VM used 1 GB hugepages and the hugepage number was 4.
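
If the two processes really are sharing hugepage files, that would also
explain where the crash shows up: without --file-prefix both of them back
their memory with files named rtemap_0, rtemap_1, ... under /dev/hugepages
(which your docker command bind-mounts into the container), so the second
process can map over the first one's heap. That corrupts the free list
which malloc_heap_alloc()/elem_free_list_remove() walk, and the next
allocation (here the 528512-byte struct virtio_net for the new vhost-user
connection) then faults inside LIST_REMOVE(). This is only a guess from
the symptoms, not something I have verified on your setup; one quick
check while both processes are running is:

ls /dev/hugepages
# rtemap_0  rtemap_1  ...  <- the same file names are reused by both
#                              processes when neither sets --file-prefix

With distinct prefixes the files become <prefix>map_0, <prefix>map_1, and
so on, so the two processes can no longer map over each other's memory.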