* [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
From: wangyunjian @ 2019-02-13 2:48 UTC
  To: dev; +Cc: qi.z.zhang, xudingke, Yunjian Wang, stable

From: Yunjian Wang <wangyunjian@huawei.com>

The NIC's interrupt source still has an active handler when the port is
removed. We should cancel the delayed handler before removing the device,
so that the delayed handler cannot run afterwards.

Call Trace:
  #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
  #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
      <rte_eth_devices@@DPDK_2.2+33024>)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
  #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/eal_alarm.c:90
  #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
      events=0x7f056cbf3e88)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/eal_interrupts.c:838
  #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/eal_interrupts.c:885
  #5  eal_intr_thread_main (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/eal_interrupts.c:965
  #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
  #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6

Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7493110..e9533e5 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
 		rte_delay_ms(100);
 	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
 
+	/* cancel the delay handler before remove dev */
+	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev);
+
 	/* uninitialize PF if max_vfs not zero */
 	ixgbe_pf_host_uninit(eth_dev);
 
-- 
1.8.3.1
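To make the failure mode in the patch above concrete, here is a minimal, self-contained sketch of a one-shot alarm that outlives the device it points at. It is not DPDK code: `toy_alarm_set`, `toy_alarm_cancel`, `toy_alarm_fire_all`, `toy_dev` and `toy_run_scenario` are hypothetical names made up for illustration, and the only property borrowed from the patch is that cancellation is keyed on the (callback, argument) pair, as in the `rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev)` call. The toy handler counts stale invocations instead of dereferencing a NULL `hw` pointer, so it can be run safely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these names exist in DPDK. */
typedef void (*toy_alarm_cb)(void *arg);

struct toy_alarm {
	toy_alarm_cb cb;
	void *arg;
	int pending;
};

#define TOY_MAX_ALARMS 8
static struct toy_alarm toy_alarms[TOY_MAX_ALARMS];

/* Queue a one-shot callback; returns 0 on success, -1 if the table is full. */
static int toy_alarm_set(toy_alarm_cb cb, void *arg)
{
	for (size_t i = 0; i < TOY_MAX_ALARMS; i++) {
		if (!toy_alarms[i].pending) {
			toy_alarms[i].cb = cb;
			toy_alarms[i].arg = arg;
			toy_alarms[i].pending = 1;
			return 0;
		}
	}
	return -1;
}

/* Drop every pending alarm matching (cb, arg); returns how many were dropped. */
static int toy_alarm_cancel(toy_alarm_cb cb, void *arg)
{
	int removed = 0;

	for (size_t i = 0; i < TOY_MAX_ALARMS; i++) {
		if (toy_alarms[i].pending && toy_alarms[i].cb == cb &&
		    toy_alarms[i].arg == arg) {
			toy_alarms[i].pending = 0;
			removed++;
		}
	}
	return removed;
}

/* Run all pending alarms once, as an interrupt thread would on expiry. */
static void toy_alarm_fire_all(void)
{
	for (size_t i = 0; i < TOY_MAX_ALARMS; i++) {
		if (toy_alarms[i].pending) {
			toy_alarms[i].pending = 0;
			toy_alarms[i].cb(toy_alarms[i].arg);
		}
	}
}

struct toy_dev {
	void *hw;       /* cleared on removal, like hw=0x0 in the call trace */
	int stale_runs; /* times the handler ran after removal */
};

static void toy_delayed_handler(void *arg)
{
	struct toy_dev *dev = arg;

	/* A real handler would dereference hw here; the toy only counts. */
	if (dev->hw == NULL)
		dev->stale_runs++;
}

/* Returns how often the handler ran against an already removed device. */
static int toy_run_scenario(int cancel_before_remove)
{
	static int fake_hw;
	struct toy_dev dev = { &fake_hw, 0 };
	int rc = toy_alarm_set(toy_delayed_handler, &dev);

	assert(rc == 0);
	(void)rc;
	if (cancel_before_remove)
		toy_alarm_cancel(toy_delayed_handler, &dev);
	dev.hw = NULL;        /* device removed */
	toy_alarm_fire_all(); /* the alarm expires afterwards */
	return dev.stale_runs;
}
```

Calling `toy_run_scenario(0)` from a `main` models the behaviour before the patch (the handler runs once against a removed device), while `toy_run_scenario(1)` models the patched order, in which the pending alarm is dropped first and the handler never runs.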
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
From: Zhang, Qi Z @ 2019-02-26 5:34 UTC
  To: wangyunjian, dev; +Cc: xudingke, stable

Hi:

> -----Original Message-----
> From: wangyunjian [mailto:wangyunjian@huawei.com]
> Sent: Wednesday, February 13, 2019 10:49 AM
> To: dev@dpdk.org
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; xudingke@huawei.com; Yunjian Wang
> <wangyunjian@huawei.com>; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> The NIC's interrupt source still has an active handler when the port is
> removed. We should cancel the delayed handler before removing the device,
> so that the delayed handler cannot run afterwards.

Agreed, thanks for catching this.

> [...]
> @@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
>  		rte_delay_ms(100);
>  	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
>
> +	/* cancel the delay handler before remove dev */
> +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev);
> +

I think it would be safer to move this call earlier. The delayed handler
may invoke an application callback, which may in turn call the ethdev API,
but at this point we have already reset the ethdev, so we could still run
into problems, right?
Would it be better to add this call at the beginning of dev_close?

Regards
Qi

>  	/* uninitialize PF if max_vfs not zero */
>  	ixgbe_pf_host_uninit(eth_dev);
>
> --
> 1.8.3.1
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
From: wangyunjian @ 2019-02-26 7:21 UTC
  To: Zhang, Qi Z, dev; +Cc: xudingke, stable

> -----Original Message-----
> From: Zhang, Qi Z [mailto:qi.z.zhang@intel.com]
> Sent: Tuesday, February 26, 2019 1:35 PM
> To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org
> Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
>
> [...]
> > +	/* cancel the delay handler before remove dev */
> > +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev);
> > +
>
> I think it would be safer to move this call earlier. The delayed handler
> may invoke an application callback, which may in turn call the ethdev API,
> but at this point we have already reset the ethdev, so we could still run
> into problems, right?
> Would it be better to add this call at the beginning of dev_close?
>
> Regards
> Qi

The delayed handler is cancelled after the interrupt handler is
unregistered, and the interrupt handler is unregistered after uio is
disabled. That is why I added the call after rte_intr_callback_unregister.

I am not sure whether we can move the uio disable and the interrupt handler
unregistration to the beginning of dev_close.

Thanks
Yunjian
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
From: Zhang, Qi Z @ 2019-03-07 13:38 UTC
  To: wangyunjian, dev; +Cc: xudingke, stable

> -----Original Message-----
> From: wangyunjian [mailto:wangyunjian@huawei.com]
> Sent: Tuesday, February 26, 2019 3:22 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
>
> [...]
> > Would it be better to add this call at the beginning of dev_close?
> >
> > Regards
> > Qi
>
> The delayed handler is cancelled after the interrupt handler is
> unregistered, and the interrupt handler is unregistered after uio is
> disabled. That is why I added the call after rte_intr_callback_unregister.
>
> I am not sure whether we can move the uio disable and the interrupt handler
> unregistration to the beginning of dev_close.

OK, I think the ideal sequence is:

Disable interrupt -> unregister interrupt handler -> unregister delayed handler -> reset ethdev ...

Right now we reset the ethdev too early, which leaves a potential risk in
the interrupt handler, but that could be fixed by a separate patch. So

Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi

> Thanks
> Yunjian
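The teardown order from the last message (disable interrupt, unregister interrupt handler, unregister delayed handler, reset ethdev) can be sketched as a small state check. This is a hypothetical model, not driver code: `toy_port`, `toy_step`, `toy_apply` and `toy_teardown` are invented names, and the preconditions encode one reading of the discussion, namely that each step is only safe once everything that could still trigger the next stage has been shut off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical teardown steps, named after the sequence in the thread. */
enum toy_step {
	TOY_DISABLE_IRQ,
	TOY_UNREGISTER_HANDLER,
	TOY_CANCEL_DELAYED,
	TOY_RESET_DEV
};

struct toy_port {
	bool irq_enabled;
	bool handler_registered;
	bool delayed_pending;
	bool reset_done;
};

/* Apply one step; returns 0 if it is safe at this point, -1 otherwise. */
static int toy_apply(struct toy_port *p, enum toy_step step)
{
	switch (step) {
	case TOY_DISABLE_IRQ:
		p->irq_enabled = false;
		return 0;
	case TOY_UNREGISTER_HANDLER:
		if (p->irq_enabled)
			return -1; /* a new interrupt could still arrive */
		p->handler_registered = false;
		return 0;
	case TOY_CANCEL_DELAYED:
		if (p->handler_registered)
			return -1; /* assumption: the handler could re-arm it */
		p->delayed_pending = false;
		return 0;
	case TOY_RESET_DEV:
		if (p->irq_enabled || p->handler_registered ||
		    p->delayed_pending)
			return -1; /* something could still touch the device */
		p->reset_done = true;
		return 0;
	}
	return -1;
}

/* Run a sequence on a fresh port; 0 only if every step was safe and the
 * device ended up reset. */
static int toy_teardown(const enum toy_step *steps, size_t n)
{
	struct toy_port p = { true, true, true, false };

	assert(steps != NULL);
	for (size_t i = 0; i < n; i++) {
		if (toy_apply(&p, steps[i]) != 0)
			return -1;
	}
	return p.reset_done ? 0 : -1;
}
```

Passing the four steps in the order listed above makes `toy_teardown` return 0, whereas any sequence that places `TOY_RESET_DEV` before the other three returns -1, which is the "reset ethdev too early" case mentioned in the reply.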