From: Jeff Guo
To: Stephen Hemminger
Cc: marko.kovacevic@intel.com, john.mcnamara@intel.com, qi.z.zhang@intel.com,
 ferruh.yigit@intel.com, thomas@monjalon.net, dev@dpdk.org, helin.zhang@intel.com,
 konstantin.ananyev@intel.com, shaopeng.he@intel.com, bruce.richardson@intel.com,
 gaetan.rivet@6wind.com
Date: Wed, 21 Nov 2018 15:42:26 +0800
Subject: Re: [dpdk-dev] [PATCH] doc: add known igb_uio device hot-unplug issue
Message-ID: <2464fbf5-20dd-519d-596b-5fa12e92dc2a@intel.com>
In-Reply-To: <20181120100212.150a6c73@xeon-e3>
References: <1542726571-121934-1-git-send-email-jia.guo@intel.com>
 <20181120100212.150a6c73@xeon-e3>
List-Id: DPDK patches and discussions

On 11/21/2018 2:02 AM, Stephen Hemminger wrote:
> On Tue, 20 Nov 2018 23:09:31 +0800
> Jeff Guo wrote:
>
>> When device has been bound to igb_uio driver and application is running,
>> hot-unplugging the device may cause kernel crash.
>>
>> Signed-off-by: Jeff Guo
>> ---
>>  doc/guides/rel_notes/known_issues.rst | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/known_issues.rst b/doc/guides/rel_notes/known_issues.rst
>> index 95e4ce6..dfe0565 100644
>> --- a/doc/guides/rel_notes/known_issues.rst
>> +++ b/doc/guides/rel_notes/known_issues.rst
>> @@ -759,3 +759,24 @@ Netvsc driver and application restart
>>
>>  **Driver/Module**:
>>     ``uio_hv_generic`` module.
>> +
>> +
>> +kernel crash when hot-unplug igb_uio device while DPDK application is running
>> +-----------------------------------------------------------------------------
>> +
>> +**Description**:
>> +   When device has been bound to igb_uio driver and application is running, hot-unplugging
>> +   the device may cause kernel crash.
>> +
>> +**Reason**:
>> +   When device is hot-unplugged, igb_uio driver will be removed which will destroy uio resources.
>> +   Later trying to access any uio resource will cause kernel crash.
>> +
>> +**Resolution/Workaround**:
>> +   If using DPDK for PCI HW hot-unplug, prefer to bind device with VFIO instead of IGB_UIO.
>> +
>> +**Affected Environment/Platform**:
>> +    ALL.
>> +
>> +**Driver/Module**:
>> +   ``igb_uio`` module.
> Surely this is fixable. What is the back trace in the kernel? How can it be reproduced with some
> common hardware (or hypervisor). Will it happen with KVM?

I think the final fix should be in the uio module in the Linux kernel, and a
workaround could go in user space and in the igb_uio kernel driver, if there is
a better one. That is why we need to document the issue here. You can refer to
the back trace below.

[ 1078.006709] RIP: 0010:uio_write+0x2e/0xc0 [uio]
[ 1078.006727] Call Trace:
[ 1078.006765]  __vfs_write+0x18/0x40
[ 1078.006768]  vfs_write+0xb8/0x1b0
[ 1078.006770]  SyS_write+0x55/0xc0
[ 1078.006791]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1078.006793] RIP: 0033:0x7f75a10224bd

You can find the full information at the link below, which I have attached.

http://patches.dpdk.org/patch/47923/

The system env:

Host kernel: 4.17.0-041700rc1-generic
VM kernel: Linux ubuntu 4.10.0-28-generic #32~16.04.2-Ubuntu.
QEMU emulator version: 2.5.0
DPDK version: 18.11-rc4
NIC: ixgbe or i40e NIC, or other (igb_uio PCI NIC)

Reproduce steps:

Host environment

1. Host: Bind port 0 to vfio-pci

   modprobe vfio_pci
   ./usertools/dpdk-devbind.py -b vfio-pci 81:10.0

2. Start the QEMU script

   taskset -c 12-21 qemu-system-x86_64 \
   -enable-kvm -m 8192 -smp cores=10,sockets=1 -cpu host -name dpdk1-vm1 \
   -monitor stdio \
   -drive file=/home/vm/ubuntu-14.04.img \
   -device vfio-pci,host=0000:81:10.0,id=dev1 \
   -netdev tap,id=ipvm1,ifname=tap5,script=/etc/qemu-ifup -device rtl8139,netdev=ipvm1,id=net0,mac=00:00:00:00:00:01 \
   -localtime -vnc :2

VM environment

1. Bind port 0 to igb_uio

   ./usertools/dpdk-devbind.py --st
   ./usertools/dpdk-devbind.py -b igb_uio 00:03.0

2. Start testpmd and enable the hotplug feature

   ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 4 -- -i --hot-plug

3. testpmd>set fwd txonly

4. testpmd>start

5. QEMU: remove the device (unplug):

   (qemu) device_del dev1

6. QEMU: add the device (plug):

   (qemu) device_add vfio-pci,host=0000:81:10.0,id=dev1

7. Bind port 0 to igb_uio:

   ./usertools/dpdk-devbind.py -b igb_uio 00:03.0

8. testpmd>stop

9. testpmd>port attach 0000:00:03.0

10. testpmd>port start all

11. testpmd>start

12. Repeat steps 5 -- 11 until the kernel crash occurs.
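
For anyone scripting around the workaround in the quoted patch (prefer VFIO over igb_uio when hot-unplug is expected), here is a minimal sketch of a pre-check that could run inside the VM before step 5. It is not part of DPDK: the function names `pci_driver` and `hot_unplug_safe` are hypothetical, and the optional second argument exists only so the sysfs root can be overridden in tests. It assumes the usual Linux sysfs layout, where `/sys/bus/pci/devices/<address>/driver` is a symlink to the driver the device is currently bound to.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not DPDK tooling.

# Print the kernel driver a PCI device is bound to, or "none".
#   $1: full PCI address, e.g. 0000:00:03.0
#   $2: optional sysfs device root (test hook; defaults to the real sysfs path)
pci_driver() {
    dev_dir="${2:-/sys/bus/pci/devices}/$1"
    if [ -L "$dev_dir/driver" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/igb_uio
        basename "$(readlink "$dev_dir/driver")"
    else
        echo none
    fi
}

# Return non-zero when the device is bound to igb_uio, the case the
# known-issue entry warns about; return zero for any other binding.
hot_unplug_safe() {
    case "$(pci_driver "$1" "$2")" in
        igb_uio) return 1 ;;
        *)       return 0 ;;
    esac
}
```

A caller could run `hot_unplug_safe 0000:00:03.0 || echo "bound to igb_uio, hot-unplug may crash the kernel"` before issuing `device_del` from the QEMU monitor. The check only reports the current binding; it does not prevent the crash described above.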