DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Zhang, Qi Z" <qi.z.zhang@intel.com>
To: wangyunjian <wangyunjian@huawei.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: xudingke <xudingke@huawei.com>, "stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
Date: Thu, 7 Mar 2019 13:38:04 +0000	[thread overview]
Message-ID: <039ED4275CED7440929022BC67E7061153349104@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <34EFBCA9F01B0748BEB6B629CE643AE60CA392C3@DGGEMM533-MBX.china.huawei.com>



> -----Original Message-----
> From: wangyunjian [mailto:wangyunjian@huawei.com]
> Sent: Tuesday, February 26, 2019 3:22 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> 
> 
> 
> > -----Original Message-----
> > From: Zhang, Qi Z [mailto:qi.z.zhang@intel.com]
> > Sent: Tuesday, February 26, 2019 1:35 PM
> > To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org
> > Cc: xudingke <xudingke@huawei.com>; stable@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> >
> > Hi:
> >
> > > -----Original Message-----
> > > From: wangyunjian [mailto:wangyunjian@huawei.com]
> > > Sent: Wednesday, February 13, 2019 10:49 AM
> > > To: dev@dpdk.org
> > > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; xudingke@huawei.com; Yunjian
> > > Wang <wangyunjian@huawei.com>; stable@dpdk.org
> > > Subject: [dpdk-dev] [PATCH] net/ixgbe: fix crash when on remove
> > >
> > > From: Yunjian Wang <wangyunjian@huawei.com>
> > >
> > > The nic's interrupt source has some active handler, when the port
> > > remove. We should cancel the delay handler before remove dev to
> > > prevent executing the delay handler.
> >
> > Agree, thanks to capture this.
> >
> > >
> > > Call Trace:
> > >   #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
> > >       at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
> > >   #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
> > >       <rte_eth_devices@@DPDK_2.2+33024>)
> > >       at
> /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
> > >   #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_alarm.c:90
> > >   #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
> > >       events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
> > >       librte_eal/linuxapp/eal/eal_interrupts.c:838
> > >   #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_interrupts.c:885
> > >   #5  eal_intr_thread_main (arg=<optimized out>)
> > >       at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
> > >       eal_interrupts.c:965
> > >   #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
> > >   #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6
> > >
> > > Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > ---
> > >  drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > index 7493110..e9533e5 100644
> > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > @@ -1336,6 +1336,9 @@ struct rte_ixgbe_xstats_name_off {
> > >  		rte_delay_ms(100);
> > >  	} while (retries++ < (10 + IXGBE_LINK_UP_TIME));
> > >
> > > +	/* cancel the delay handler before remove dev */
> > > +	rte_eal_alarm_cancel(ixgbe_dev_interrupt_delayed_handler,
> > eth_dev);
> > > +
> >
> > I think it will be more safe to move this call ahead, the delayed
> > handler may invoked application callback which may also invoke the
> > ethdev API, but at this moment, we already reset ethdev, we still have
> > chance to get problem, right?
> > Is it better that we add this call at the beginning of dev_close?
> >
> > Regards
> > Qi
> 
> The delay handler callback was canceled after unregistered interrupt handler and
> the interrupt handler callback was unregistered after disabled uio. So I added the
> call after rte_intr_callback_unregister.
> 
> I am not sure if we can add the disable uio, unregister interrupt handler at the
> beginning of dev_close.

OK, I think the idea scenario is

Disable interrupt -> unregister interrupt handler -> unregister delayed handler -> reset ethdev ...
Now we reset ethdev too early give potential risk in interrupt handler, but that could be fixed by a separate patch.

So 
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi



> 
> Thanks
> Yunjian
> >
> >
> > >  	/* uninitialize PF if max_vfs not zero */
> > >  	ixgbe_pf_host_uninit(eth_dev);
> > >
> > > --
> > > 1.8.3.1
> > >

      reply	other threads:[~2019-03-07 13:39 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-13  2:48 wangyunjian
2019-02-26  5:34 ` Zhang, Qi Z
2019-02-26  7:21   ` wangyunjian
2019-03-07 13:38     ` Zhang, Qi Z [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=039ED4275CED7440929022BC67E7061153349104@SHSMSX103.ccr.corp.intel.com \
    --to=qi.z.zhang@intel.com \
    --cc=dev@dpdk.org \
    --cc=stable@dpdk.org \
    --cc=wangyunjian@huawei.com \
    --cc=xudingke@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).