From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 01A0D49E0 for ; Fri, 9 Nov 2018 04:26:24 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 08 Nov 2018 19:26:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,481,1534834800"; d="scan'208";a="272601623" Received: from jguo15x-mobl.ccr.corp.intel.com (HELO [10.67.68.85]) ([10.67.68.85]) by orsmga005.jf.intel.com with ESMTP; 08 Nov 2018 19:26:21 -0800 To: Stephen Hemminger Cc: ferruh.yigit@intel.com, konstantin.ananyev@intel.com, anatoly.burakov@intel.com, thomas@monjalon.net, bernard.iremonger@intel.com, dev@dpdk.org, helin.zhang@intel.com, gaetan.rivet@6wind.com, qi.z.zhang@intel.com, arybchenko@solarflare.com, bruce.richardson@intel.com, matan@mellanox.com, shaopeng.he@intel.com References: <1541583691-145432-1-git-send-email-jia.guo@intel.com> <1541583691-145432-3-git-send-email-jia.guo@intel.com> <20181107113344.79229734@shemminger-XPS-13-9360> <4bcbd2be-240e-9f60-86f8-f0d9ca7e1293@intel.com> <20181108135520.2037ce6f@shemminger-XPS-13-9360> From: Jeff Guo Message-ID: <44bf2b19-1fdc-93ca-f367-b9476422d3fd@intel.com> Date: Fri, 9 Nov 2018 11:26:20 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0 MIME-Version: 1.0 In-Reply-To: <20181108135520.2037ce6f@shemminger-XPS-13-9360> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Subject: Re: [dpdk-dev] [PATCH 2/2] but/pci: fix fd close for hot-unplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Nov 2018 03:26:25 -0000 On 11/9/2018 5:55 AM, Stephen Hemminger wrote: > On Thu, 8 Nov 2018 11:10:29 +0800 > Jeff Guo wrote: > >> hi, stephen >> >> On 11/8/2018 3:33 AM, Stephen Hemminger wrote: >>> On Wed, 7 Nov 2018 17:41:31 +0800 >>> Jeff Guo wrote: >>> >>>> When device is hot-unplugged, the device fd will be deleted in kernel. >>>> Then in the progress of detaching device, if it try to close the fd, >>>> it will cause a kernel crash, which shown a kernel null pointer error. >>> If this happens, then it is a kernel bug and the kernel should be fixed. >>> Working around it in userspace is not a great long term solution. >> >> agree with you. The key is sometime hold by kernel. But i think it is at >> least reasonable for avoiding no-use process in user space, whatever >> kernel's behavior. I am not sure if there is any better idea we can >> find, but seems it is an option now. > Are you using igb_uio? If so the problem is an DPDK supplied driver. > Let's fix that. What is the backtrace on kernel crash. yes, i am using the igb_uio. The back trace shown as below. It is an kernel NULL pointer result of uio wirte error in the eal-intr-thread thread, ################ [ 1052.262696] igb_uio 0000:00:03.0: mapping 1K dma=0x61cea000 host=ffff8e94e1cea000 [ 1052.262699] igb_uio 0000:00:03.0: unmapping 1K dma=0x61cea000 host=ffff8e94e1cea000 [ 1065.140839] igb_uio 0000:00:03.0: uio device registered with irq 24 [ 1078.006619] BUG: unable to handle kernel NULL pointer dereference at 00000000000001d0 [ 1078.006657] IP: uio_write+0x2e/0xc0 [uio] [ 1078.006659] PGD 70837067 [ 1078.006659] PUD 3440c067 [ 1078.006660] PMD 0 [ 1078.006667] Oops: 0000 [#1] SMP [ 1078.006677] Modules linked in: igb_uio(OE) uio sb_edac edac_core kvm_intel kvm irqbypass crct10dif_pclmul joydev input_leds mac_hid i2c_piix4 crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd serio_raw parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp ixgbevf mii pata_acpi floppy [ 1078.006703] CPU: 3 PID: 14043 Comm: eal-intr-thread Tainted: G           OE   4.10.0-28-generic #32~16.04.2-Ubuntu [ 1078.006705] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 1078.006706] task: ffff8e94f3904380 task.stack: ffffa97b810dc000 [ 1078.006709] RIP: 0010:uio_write+0x2e/0xc0 [uio] [ 1078.006712] RSP: 0018:ffffa97b810dfe90 EFLAGS: 00010246 [ 1078.006714] RAX: 0000000000000000 RBX: 0000000000000004 RCX: ffffa97b810dff18 [ 1078.006715] RDX: 0000000000000004 RSI: 00007f75a0c45294 RDI: ffff8e94f3ff8d00 [ 1078.006716] RBP: ffffa97b810dfeb0 R08: ffffffffc01b6320 R09: ffff8e94e1de9370 [ 1078.006717] R10: ffff8e94f3ff8d38 R11: ffff8e94f382f9c0 R12: ffff8e94f2f94218 [ 1078.006718] R13: 00007f75a0c45294 R14: ffff8e94f3ff8d00 R15: 0000000000000000 [ 1078.006719] FS:  00007f75a0c47700(0000) GS:ffff8e95bfcc0000(0000) knlGS:0000000000000000 [ 1078.006720] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1078.006721] CR2: 00000000000001d0 CR3: 0000000073abc000 CR4: 00000000001406e0 [ 1078.006727] Call Trace: [ 1078.006765]  __vfs_write+0x18/0x40 [ 1078.006768]  vfs_write+0xb8/0x1b0 [ 1078.006770]  SyS_write+0x55/0xc0 [ 1078.006791]  entry_SYSCALL_64_fastpath+0x1e/0xad [ 1078.006793] RIP: 0033:0x7f75a10224bd [ 1078.006796] RSP: 002b:00007f75a0c45270 EFLAGS: 00003293 ORIG_RAX: 0000000000000001 [ 1078.006800] RAX: ffffffffffffffda RBX: 000000000000010c RCX: 00007f75a10224bd [ 1078.006801] RDX: 0000000000000004 RSI: 00007f75a0c45294 RDI: 0000000000000010 [ 1078.006802] RBP: 00007f75a0c45250 R08: 00000000000010ed R09: 00000000010922e7 [ 1078.006803] R10: 0000000001777781 R11: 0000000000003293 R12: 00007f75a0c45680 [ 1078.006804] R13: 00007fff9036bb4f R14: 00007f75a0c479c0 R15: 0000000000000000 [ 1078.006805] Code: 00 00 55 48 89 e5 41 54 53 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 48 8b 87 c8 00 00 00 4c 8b 20 49 8b 44 24 38 <48> 83 b8 d0 01 00 00 00 74 6a 48 89 d3 48 c7 c2 ea ff ff ff 48 [ 1078.006832] RIP: uio_write+0x2e/0xc0 [uio] RSP: ffffa97b810dfe90 [ 1078.006832] CR2: 00000000000001d0 [ 1078.006837] ---[ end trace 88c298c83fcf2c2d ]--- #######