DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev]  [PATCH] net/ixgbe: fix crash when on remove
@ 2019-02-13  2:48 wangyunjian
  2019-02-26  5:34 ` Zhang, Qi Z
  0 siblings, 1 reply; 4+ messages in thread
From: wangyunjian @ 2019-02-13  2:48 UTC (permalink / raw)
  To: dev; +Cc: qi.z.zhang, xudingke, Yunjian Wang, stable

From: Yunjian Wang <wangyunjian@huawei.com>

The nic's interrupt source has some active handler, when the
port remove. We should cancel the delay handler before remove
dev to prevent executing the delay handler.

Call Trace:
  #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
  #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
      <rte_eth_devices@@DPDK_2.2+33024>)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
  #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_alarm.c:90
  #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
      events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
      librte_eal/linuxapp/eal/eal_interrupts.c:838
  #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:885
  #5  eal_intr_thread_main (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:965
  #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
  #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6

Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7493110..e9533e5 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
 		rte_delay_ms(100);
 	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
 
+	/* cancel the delay handler before remove dev */
+	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev);
+
 	/* uninitialize PF if max_vfs not zero */
 	ixgbe_pf_host_uninit(eth_dev);
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
  2019-02-13  2:48 [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove wangyunjian
@ 2019-02-26  5:34 ` Zhang, Qi Z
  2019-02-26  7:21   ` wangyunjian
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Qi Z @ 2019-02-26  5:34 UTC (permalink / raw)
  To: wangyunjian, dev; +Cc: xudingke, stable

Hi:

> -----Original Message-----
> From: wangyunjian [mailto:wangyunjian@huawei.com]
> Sent: Wednesday, February 13, 2019 10:49 AM
> To: dev@dpdk.org
> Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; xudingke@huawei.com; Yunjian Wang
> <wangyunjian@huawei.com>; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> 
> From: Yunjian Wang <wangyunjian@huawei.com>
> 
> The nic's interrupt source has some active handler, when the port remove. We
> should cancel the delay handler before remove dev to prevent executing the
> delay handler.

Agree, thanks to capture this.

> 
> Call Trace:
>   #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
>       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
>   #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
>       <rte_eth_devices@@DPDK_2.2+33024>)
>       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
>   #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
>       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
>       eal_alarm.c:90
>   #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
>       events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
>       librte_eal/linuxapp/eal/eal_interrupts.c:838
>   #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
>       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
>       eal_interrupts.c:885
>   #5  eal_intr_thread_main (arg=<optimized out>)
>       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
>       eal_interrupts.c:965
>   #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
>   #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6
> 
> Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 7493110..e9533e5 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
>  		rte_delay_ms(100);
>  	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
> 
> +	/* cancel the delay handler before remove dev */
> +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler, eth_dev);
> +

I think it will be more safe to move this call ahead,
the delayed handler may invoked application callback which may also invoke the ethdev API, but at this moment, we already reset ethdev, we still have chance to get problem, right?
Is it better that we add this call at the beginning of dev_close?

Regards
Qi


>  	/* uninitialize PF if max_vfs not zero */
>  	ixgbe_pf_host_uninit(eth_dev);
> 
> --
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
  2019-02-26  5:34 ` Zhang, Qi Z
@ 2019-02-26  7:21   ` wangyunjian
  2019-03-07 13:38     ` Zhang, Qi Z
  0 siblings, 1 reply; 4+ messages in thread
From: wangyunjian @ 2019-02-26  7:21 UTC (permalink / raw)
  To: Zhang, Qi Z, dev; +Cc: xudingke, stable



> -----Original Message-----
> From: Zhang, Qi Z [mailto:qi.z.zhang@intel.com]
> Sent: Tuesday, February 26, 2019 1:35 PM
> To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org
> Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> 
> Hi:
> 
> > -----Original Message-----
> > From: wangyunjian [mailto:wangyunjian@huawei.com]
> > Sent: Wednesday, February 13, 2019 10:49 AM
> > To: dev@dpdk.org
> > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; xudingke@huawei.com; Yunjian
> > Wang <wangyunjian@huawei.com>; stable@dpdk.org
> > Subject: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> >
> > From: Yunjian Wang <wangyunjian@huawei.com>
> >
> > The nic's interrupt source has some active handler, when the port
> > remove. We should cancel the delay handler before remove dev to
> > prevent executing the delay handler.
> 
> Agree, thanks to capture this.
> 
> >
> > Call Trace:
> >   #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
> >       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
> >   #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
> >       <rte_eth_devices@@DPDK_2.2+33024>)
> >       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
> >   #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
> >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> >       eal_alarm.c:90
> >   #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
> >       events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
> >       librte_eal/linuxapp/eal/eal_interrupts.c:838
> >   #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
> >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> >       eal_interrupts.c:885
> >   #5  eal_intr_thread_main (arg=<optimized out>)
> >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> >       eal_interrupts.c:965
> >   #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
> >   #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6
> >
> > Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >  drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > index 7493110..e9533e5 100644
> > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > @@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
> >  		rte_delay_ms(100);
> >  	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
> >
> > +	/* cancel the delay handler before remove dev */
> > +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler,
> eth_dev);
> > +
> 
> I think it will be more safe to move this call ahead, the delayed handler may
> invoked application callback which may also invoke the ethdev API, but at
> this moment, we already reset ethdev, we still have chance to get problem,
> right?
> Is it better that we add this call at the beginning of dev_close?
> 
> Regards
> Qi

The delay handler callback was canceled after unregistered interrupt handler
and the interrupt handler callback was unregistered after disabled uio. So I
added the call after rte_intr_callback_unregister.

I am not sure if we can add the disable uio, unregister interrupt handler at
the beginning of dev_close.

Thanks
Yunjian
> 
> 
> >  	/* uninitialize PF if max_vfs not zero */
> >  	ixgbe_pf_host_uninit(eth_dev);
> >
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
  2019-02-26  7:21   ` wangyunjian
@ 2019-03-07 13:38     ` Zhang, Qi Z
  0 siblings, 0 replies; 4+ messages in thread
From: Zhang, Qi Z @ 2019-03-07 13:38 UTC (permalink / raw)
  To: wangyunjian, dev; +Cc: xudingke, stable



> -----Original Message-----
> From: wangyunjian [mailto:wangyunjian@huawei.com]
> Sent: Tuesday, February 26, 2019 3:22 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> 
> 
> 
> > -----Original Message-----
> > From: Zhang, Qi Z [mailto:qi.z.zhang@intel.com]
> > Sent: Tuesday, February 26, 2019 1:35 PM
> > To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org
> > Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> >
> > Hi:
> >
> > > -----Original Message-----
> > > From: wangyunjian [mailto:wangyunjian@huawei.com]
> > > Sent: Wednesday, February 13, 2019 10:49 AM
> > > To: dev@dpdk.org
> > > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; xudingke@huawei.com; Yunjian
> > > Wang <wangyunjian@huawei.com>; stable@dpdk.org
> > > Subject: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> > >
> > > From: Yunjian Wang <wangyunjian@huawei.com>
> > >
> > > The nic's interrupt source has some active handler, when the port
> > > remove. We should cancel the delay handler before remove dev to
> > > prevent executing the delay handler.
> >
> > Agree, thanks to capture this.
> >
> > >
> > > Call Trace:
> > >   #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
> > >       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
> > >   #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
> > >       <rte_eth_devices@@DPDK_2.2+33024>)
> > >       at
> /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
> > >   #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_alarm.c:90
> > >   #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
> > >       events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
> > >       librte_eal/linuxapp/eal/eal_interrupts.c:838
> > >   #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_interrupts.c:885
> > >   #5  eal_intr_thread_main (arg=<optimized out>)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_interrupts.c:965
> > >   #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
> > >   #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6
> > >
> > > Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > ---
> > >  drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > index 7493110..e9533e5 100644
> > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > @@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
> > >  		rte_delay_ms(100);
> > >  	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
> > >
> > > +	/* cancel the delay handler before remove dev */
> > > +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler,
> > eth_dev);
> > > +
> >
> > I think it will be more safe to move this call ahead, the delayed
> > handler may invoked application callback which may also invoke the
> > ethdev API, but at this moment, we already reset ethdev, we still have
> > chance to get problem, right?
> > Is it better that we add this call at the beginning of dev_close?
> >
> > Regards
> > Qi
> 
> The delay handler callback was canceled after unregistered interrupt handler and
> the interrupt handler callback was unregistered after disabled uio. So I added the
> call after rte_intr_callback_unregister.
> 
> I am not sure if we can add the disable uio, unregister interrupt handler at the
> beginning of dev_close.

OK, I think the idea scenario is

Disable interrupt -> unregister interrupt handler -> unregister delayed handler -> reset ethdev ...
Now we reset ethdev too early give potential risk in interrupt handler, but that could be fixed by a separate patch.

So 
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi



> 
> Thanks
> Yunjian
> >
> >
> > >  	/* uninitialize PF if max_vfs not zero */
> > >  	ixgbe_pf_host_uninit(eth_dev);
> > >
> > > --
> > > 1.8.3.1
> > >

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-03-07 13:39 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-13  2:48 [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove wangyunjian
2019-02-26  5:34 ` Zhang, Qi Z
2019-02-26  7:21   ` wangyunjian
2019-03-07 13:38     ` Zhang, Qi Z

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).