* [PATCH] net/iavf: unregister intr handler before FD close @ 2023-09-04 13:01 Saurabh Singhal 2023-09-06 5:24 ` Zhang, Qi Z 0 siblings, 1 reply; 7+ messages in thread From: Saurabh Singhal @ 2023-09-04 13:01 UTC (permalink / raw) To: Thomas Monjalon, Jingjing Wu, Beilei Xing; +Cc: dev, Saurabh Singhal Unregister VFIO interrupt handler before the interrupt fd gets closed in case iavf_dev_init() returns an error. dpdk creates a standalone thread named eal-intr-thread for processing interrupts for the PCI devices. The interrupt handler callbacks are registered by the VF driver(iavf, in this case). When we do a PCI probe of the network interfaces, we register an interrupt handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These interrupt sources are registered in a global linked list that the eal-intr-thread keeps iterating over for handling the interrupts. In our internal testing, we see eal-intr-thread crash in these two ways: Error adding fd 660 epoll_ctl, Operation not permitted or Error adding fd 660 epoll_ctl, Bad file descriptor epoll_ctl() returns EPERM if the target fd does not support poll. It returns EBADF when the epoll fd itself is closed or the target fd is closed. When the first type of crash happens, we see that the fd 660 is anon_inode:[vfio-device] which does not support poll. When the second type of crash happens, we could see from the fd map of the crashing process that the fd 660 was already closed. This means the said fd has been closed and in certain cases may have been reassigned to a different device by the operating system but the eal-intr-thread does not know about it. We observed that these crashes were always accompanied by an error in iavf_dev_init() after rte_intr_callback_register() and iavf_enable_irq0() have already happened. In the error path, the intr_handle_fd was being closed but the interrupt handler wasn't being unregistered. The fix is to unregister the interrupt handle in the iavf_dev_init() error path. Signed-off-by: Saurabh Singhal <saurabhs@arista.com> --- .mailmap | 1 + drivers/net/iavf/iavf_ethdev.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+) diff --git a/.mailmap b/.mailmap index 864d33ee46..4dac53011b 100644 --- a/.mailmap +++ b/.mailmap @@ -1227,6 +1227,7 @@ Satananda Burla <sburla@marvell.com> Satha Rao <skoteshwar@marvell.com> <skoteshwar@caviumnetworks.com> Satheesh Paul <psatheesh@marvell.com> Sathesh Edara <sedara@marvell.com> +Saurabh Singhal <saurabhs@arista.com> Savinay Dharmappa <savinay.dharmappa@intel.com> Scott Branden <scott.branden@broadcom.com> Scott Daniels <daniels@research.att.com> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index f2fc5a5621..df87a553db 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -133,6 +133,8 @@ static int iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id); static int iavf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id); +static void iavf_dev_interrupt_handler(void *param); +static inline void iavf_disable_irq0(struct iavf_hw *hw); static int iavf_dev_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops); static int iavf_set_mc_addr_list(struct rte_eth_dev *dev, @@ -2490,9 +2492,22 @@ iavf_uninit_vf(struct rte_eth_dev *dev) { struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data->dev_private); struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private); + struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev); + struct rte_intr_handle *intr_handle = pci_dev->intr_handle; iavf_shutdown_adminq(hw); + if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) { + /* disable uio intr before callback unregiser */ + rte_intr_disable(intr_handle); + + /* unregister callback func from eal lib */ + rte_intr_callback_unregister(intr_handle, + iavf_dev_interrupt_handler, dev); + } else { + rte_eal_alarm_cancel(iavf_dev_alarm_handler, dev); + } + rte_free(vf->vf_res); vf->vsi_res = NULL; vf->vf_res = NULL; -- 2.41.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH] net/iavf: unregister intr handler before FD close 2023-09-04 13:01 [PATCH] net/iavf: unregister intr handler before FD close Saurabh Singhal @ 2023-09-06 5:24 ` Zhang, Qi Z 2023-09-06 11:47 ` [PATCH v2] " Saurabh Singhal 2023-09-07 3:15 ` [PATCH v4] " Saurabh Singhal 0 siblings, 2 replies; 7+ messages in thread From: Zhang, Qi Z @ 2023-09-06 5:24 UTC (permalink / raw) To: Singhal, Saurabh, Thomas Monjalon, Wu, Jingjing, Xing, Beilei Cc: dev, Singhal, Saurabh > -----Original Message----- > From: Saurabh Singhal <saurabhs@arista.com> > Sent: Monday, September 4, 2023 9:01 PM > To: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing > <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com> > Cc: dev@dpdk.org; Singhal, Saurabh <saurabhs@arista.com> > Subject: [PATCH] net/iavf: unregister intr handler before FD close > > Unregister VFIO interrupt handler before the interrupt fd gets closed in case > iavf_dev_init() returns an error. > > dpdk creates a standalone thread named eal-intr-thread for processing > interrupts for the PCI devices. The interrupt handler callbacks are registered > by the VF driver(iavf, in this case). > > When we do a PCI probe of the network interfaces, we register an interrupt > handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These > interrupt sources are registered in a global linked list that the eal-intr-thread > keeps iterating over for handling the interrupts. In our internal testing, we see > eal-intr-thread crash in these two ways: > > Error adding fd 660 epoll_ctl, Operation not permitted > > or > > Error adding fd 660 epoll_ctl, Bad file descriptor > > epoll_ctl() returns EPERM if the target fd does not support poll. > It returns EBADF when the epoll fd itself is closed or the target fd is closed. > > When the first type of crash happens, we see that the fd 660 is > anon_inode:[vfio-device] which does not support poll. > > When the second type of crash happens, we could see from the fd map of > the crashing process that the fd 660 was already closed. > > This means the said fd has been closed and in certain cases may have been > reassigned to a different device by the operating system but the eal-intr- > thread does not know about it. > > We observed that these crashes were always accompanied by an error in > iavf_dev_init() after rte_intr_callback_register() and > iavf_enable_irq0() have already happened. In the error path, the > intr_handle_fd was being closed but the interrupt handler wasn't being > unregistered. > > The fix is to unregister the interrupt handle in the > iavf_dev_init() error path. Thanks for all these explanations! > > Signed-off-by: Saurabh Singhal <saurabhs@arista.com> > --- > .mailmap | 1 + > drivers/net/iavf/iavf_ethdev.c | 15 +++++++++++++++ > 2 files changed, 16 insertions(+) > > diff --git a/.mailmap b/.mailmap > index 864d33ee46..4dac53011b 100644 > --- a/.mailmap > +++ b/.mailmap > @@ -1227,6 +1227,7 @@ Satananda Burla <sburla@marvell.com> Satha Rao > <skoteshwar@marvell.com> <skoteshwar@caviumnetworks.com> Satheesh > Paul <psatheesh@marvell.com> Sathesh Edara <sedara@marvell.com> > +Saurabh Singhal <saurabhs@arista.com> > Savinay Dharmappa <savinay.dharmappa@intel.com> Scott Branden > <scott.branden@broadcom.com> Scott Daniels <daniels@research.att.com> > diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c > index f2fc5a5621..df87a553db 100644 > --- a/drivers/net/iavf/iavf_ethdev.c > +++ b/drivers/net/iavf/iavf_ethdev.c > @@ -133,6 +133,8 @@ static int iavf_dev_rx_queue_intr_enable(struct > rte_eth_dev *dev, > uint16_t queue_id); > static int iavf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, > uint16_t queue_id); > +static void iavf_dev_interrupt_handler(void *param); static inline void > +iavf_disable_irq0(struct iavf_hw *hw); > static int iavf_dev_flow_ops_get(struct rte_eth_dev *dev, > const struct rte_flow_ops **ops); > static int iavf_set_mc_addr_list(struct rte_eth_dev *dev, @@ -2490,9 > +2492,22 @@ iavf_uninit_vf(struct rte_eth_dev *dev) { > struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data- > >dev_private); > struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data- > >dev_private); > + struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev); > + struct rte_intr_handle *intr_handle = pci_dev->intr_handle; > > iavf_shutdown_adminq(hw); > > + if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) { > + /* disable uio intr before callback unregiser */ > + rte_intr_disable(intr_handle); > + > + /* unregister callback func from eal lib */ > + rte_intr_callback_unregister(intr_handle, > + iavf_dev_interrupt_handler, dev); > + } else { > + rte_eal_alarm_cancel(iavf_dev_alarm_handler, dev); > + } > + I assume the error handling should follow the reversed order. So, can we move above code right after the goto label "flow_init_err", like below? .... iavf_enable_irq0(hw); ret = iavf_flow_init(adapter); if (ret) { PMD_INIT_LOG(ERR, "Failed to initialize flow"); goto flow_init_err; } ... flow_init_err: iavf_disable_irq0(hw) if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) { ... } Btw, I saw missing error handling in iavf_ipsec_crypto_supported branch which should be fixed, if you are hesitating to apply above fix because this inconsistent, please ignore this. Thanks Qi > rte_free(vf->vf_res); > vf->vsi_res = NULL; > vf->vf_res = NULL; > -- > 2.41.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2] net/iavf: unregister intr handler before FD close 2023-09-06 5:24 ` Zhang, Qi Z @ 2023-09-06 11:47 ` Saurabh Singhal 2023-09-06 14:05 ` Stephen Hemminger 2023-09-07 3:15 ` [PATCH v4] " Saurabh Singhal 1 sibling, 1 reply; 7+ messages in thread From: Saurabh Singhal @ 2023-09-06 11:47 UTC (permalink / raw) To: Thomas Monjalon, Jingjing Wu, Beilei Xing; +Cc: dev, Saurabh Singhal Unregister VFIO interrupt handler before the interrupt fd gets closed in case iavf_dev_init() returns an error. dpdk creates a standalone thread named eal-intr-thread for processing interrupts for the PCI devices. The interrupt handler callbacks are registered by the VF driver(iavf, in this case). When we do a PCI probe of the network interfaces, we register an interrupt handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These interrupt sources are registered in a global linked list that the eal-intr-thread keeps iterating over for handling the interrupts. In our internal testing, we see eal-intr-thread crash in these two ways: Error adding fd 660 epoll_ctl, Operation not permitted or Error adding fd 660 epoll_ctl, Bad file descriptor epoll_ctl() returns EPERM if the target fd does not support poll. It returns EBADF when the epoll fd itself is closed or the target fd is closed. When the first type of crash happens, we see that the fd 660 is anon_inode:[vfio-device] which does not support poll. When the second type of crash happens, we could see from the fd map of the crashing process that the fd 660 was already closed. This means the said fd has been closed and in certain cases may have been reassigned to a different device by the operating system but the eal-intr-thread does not know about it. We observed that these crashes were always accompanied by an error in iavf_dev_init() after rte_intr_callback_register() and iavf_enable_irq0() have already happened. In the error path, the intr_handle_fd was being closed but the interrupt handler wasn't being unregistered. The fix is to unregister the interrupt handle in the iavf_dev_init() error path. Ensure proper cleanup if iavf_security_init() or iavf_security_ctx_create() fail. Earlier, we were leaking memory by simply returning from iavf_dev_init(). Signed-off-by: Saurabh Singhal <saurabhs@arista.com> --- .mailmap | 1 + drivers/net/iavf/iavf_ethdev.c | 22 ++++++++++++++++++++-- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/.mailmap b/.mailmap index 864d33ee46..4dac53011b 100644 --- a/.mailmap +++ b/.mailmap @@ -1227,6 +1227,7 @@ Satananda Burla <sburla@marvell.com> Satha Rao <skoteshwar@marvell.com> <skoteshwar@caviumnetworks.com> Satheesh Paul <psatheesh@marvell.com> Sathesh Edara <sedara@marvell.com> +Saurabh Singhal <saurabhs@arista.com> Savinay Dharmappa <savinay.dharmappa@intel.com> Scott Branden <scott.branden@broadcom.com> Scott Daniels <daniels@research.att.com> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index f2fc5a5621..b7ed22889a 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -133,6 +133,8 @@ static int iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id); static int iavf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id); +static void iavf_dev_interrupt_handler(void *param); +static inline void iavf_disable_irq0(struct iavf_hw *hw); static int iavf_dev_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops); static int iavf_set_mc_addr_list(struct rte_eth_dev *dev, @@ -2709,13 +2711,13 @@ iavf_dev_init(struct rte_eth_dev *eth_dev) ret = iavf_security_ctx_create(adapter); if (ret) { PMD_INIT_LOG(ERR, "failed to create ipsec crypto security instance"); - return ret; + goto flow_init_err; } ret = iavf_security_init(adapter); if (ret) { PMD_INIT_LOG(ERR, "failed to initialized ipsec crypto resources"); - return ret; + goto security_init_err; } } @@ -2728,7 +2730,23 @@ iavf_dev_init(struct rte_eth_dev *eth_dev) return 0; +security_init_err: + iavf_security_ctx_destroy(adapter); + flow_init_err: + iavf_disable_irq0(hw); + + if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) { + /* disable uio intr before callback unregiser */ + rte_intr_disable(intr_handle); + + /* unregister callback func from eal lib */ + rte_intr_callback_unregister(intr_handle, + iavf_dev_interrupt_handler, dev); + } else { + rte_eal_alarm_cancel(iavf_dev_alarm_handler, dev); + } + rte_free(eth_dev->data->mac_addrs); eth_dev->data->mac_addrs = NULL; -- 2.41.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] net/iavf: unregister intr handler before FD close 2023-09-06 11:47 ` [PATCH v2] " Saurabh Singhal @ 2023-09-06 14:05 ` Stephen Hemminger 2023-09-07 1:41 ` Zhang, Qi Z 0 siblings, 1 reply; 7+ messages in thread From: Stephen Hemminger @ 2023-09-06 14:05 UTC (permalink / raw) To: Saurabh Singhal; +Cc: Thomas Monjalon, Jingjing Wu, Beilei Xing, dev On Wed, 6 Sep 2023 04:47:49 -0700 Saurabh Singhal <saurabhs@arista.com> wrote: > +static inline void iavf_disable_irq0(struct iavf_hw *hw); inline on function prototype is meaningless and not needed. ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH v2] net/iavf: unregister intr handler before FD close 2023-09-06 14:05 ` Stephen Hemminger @ 2023-09-07 1:41 ` Zhang, Qi Z 0 siblings, 0 replies; 7+ messages in thread From: Zhang, Qi Z @ 2023-09-07 1:41 UTC (permalink / raw) To: Stephen Hemminger, Singhal, Saurabh Cc: Thomas Monjalon, Wu, Jingjing, Xing, Beilei, dev > -----Original Message----- > From: Stephen Hemminger <stephen@networkplumber.org> > Sent: Wednesday, September 6, 2023 10:05 PM > To: Singhal, Saurabh <saurabhs@arista.com> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing > <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; dev@dpdk.org > Subject: Re: [PATCH v2] net/iavf: unregister intr handler before FD close > > On Wed, 6 Sep 2023 04:47:49 -0700 > Saurabh Singhal <saurabhs@arista.com> wrote: > > > +static inline void iavf_disable_irq0(struct iavf_hw *hw); > > inline on function prototype is meaningless and not needed. Please also fix the compile error. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4] net/iavf: unregister intr handler before FD close 2023-09-06 5:24 ` Zhang, Qi Z 2023-09-06 11:47 ` [PATCH v2] " Saurabh Singhal @ 2023-09-07 3:15 ` Saurabh Singhal 2023-09-11 0:55 ` Zhang, Qi Z 1 sibling, 1 reply; 7+ messages in thread From: Saurabh Singhal @ 2023-09-07 3:15 UTC (permalink / raw) To: Thomas Monjalon, Jingjing Wu, Beilei Xing; +Cc: dev, Saurabh Singhal Unregister VFIO interrupt handler before the interrupt fd gets closed in case iavf_dev_init() returns an error. dpdk creates a standalone thread named eal-intr-thread for processing interrupts for the PCI devices. The interrupt handler callbacks are registered by the VF driver(iavf, in this case). When we do a PCI probe of the network interfaces, we register an interrupt handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These interrupt sources are registered in a global linked list that the eal-intr-thread keeps iterating over for handling the interrupts. In our internal testing, we see eal-intr-thread crash in these two ways: Error adding fd 660 epoll_ctl, Operation not permitted or Error adding fd 660 epoll_ctl, Bad file descriptor epoll_ctl() returns EPERM if the target fd does not support poll. It returns EBADF when the epoll fd itself is closed or the target fd is closed. When the first type of crash happens, we see that the fd 660 is anon_inode:[vfio-device] which does not support poll. When the second type of crash happens, we could see from the fd map of the crashing process that the fd 660 was already closed. This means the said fd has been closed and in certain cases may have been reassigned to a different device by the operating system but the eal-intr-thread does not know about it. We observed that these crashes were always accompanied by an error in iavf_dev_init() after rte_intr_callback_register() and iavf_enable_irq0() have already happened. In the error path, the intr_handle_fd was being closed but the interrupt handler wasn't being unregistered. The fix is to unregister the interrupt handle in the iavf_dev_init() error path. Ensure proper cleanup if iavf_security_init() or iavf_security_ctx_create() fail. Earlier, we were leaking memory by simply returning from iavf_dev_init(). Signed-off-by: Saurabh Singhal <saurabhs@arista.com> --- .mailmap | 1 + drivers/net/iavf/iavf_ethdev.c | 22 ++++++++++++++++++++-- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/.mailmap b/.mailmap index 864d33ee46..4dac53011b 100644 --- a/.mailmap +++ b/.mailmap @@ -1227,6 +1227,7 @@ Satananda Burla <sburla@marvell.com> Satha Rao <skoteshwar@marvell.com> <skoteshwar@caviumnetworks.com> Satheesh Paul <psatheesh@marvell.com> Sathesh Edara <sedara@marvell.com> +Saurabh Singhal <saurabhs@arista.com> Savinay Dharmappa <savinay.dharmappa@intel.com> Scott Branden <scott.branden@broadcom.com> Scott Daniels <daniels@research.att.com> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index f2fc5a5621..47c1399a52 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -133,6 +133,8 @@ static int iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id); static int iavf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id); +static void iavf_dev_interrupt_handler(void *param); +static void iavf_disable_irq0(struct iavf_hw *hw); static int iavf_dev_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops); static int iavf_set_mc_addr_list(struct rte_eth_dev *dev, @@ -2709,13 +2711,13 @@ iavf_dev_init(struct rte_eth_dev *eth_dev) ret = iavf_security_ctx_create(adapter); if (ret) { PMD_INIT_LOG(ERR, "failed to create ipsec crypto security instance"); - return ret; + goto flow_init_err; } ret = iavf_security_init(adapter); if (ret) { PMD_INIT_LOG(ERR, "failed to initialized ipsec crypto resources"); - return ret; + goto security_init_err; } } @@ -2728,7 +2730,23 @@ iavf_dev_init(struct rte_eth_dev *eth_dev) return 0; +security_init_err: + iavf_security_ctx_destroy(adapter); + flow_init_err: + iavf_disable_irq0(hw); + + if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) { + /* disable uio intr before callback unregiser */ + rte_intr_disable(pci_dev->intr_handle); + + /* unregister callback func from eal lib */ + rte_intr_callback_unregister(pci_dev->intr_handle, + iavf_dev_interrupt_handler, eth_dev); + } else { + rte_eal_alarm_cancel(iavf_dev_alarm_handler, eth_dev); + } + rte_free(eth_dev->data->mac_addrs); eth_dev->data->mac_addrs = NULL; -- 2.41.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH v4] net/iavf: unregister intr handler before FD close 2023-09-07 3:15 ` [PATCH v4] " Saurabh Singhal @ 2023-09-11 0:55 ` Zhang, Qi Z 0 siblings, 0 replies; 7+ messages in thread From: Zhang, Qi Z @ 2023-09-11 0:55 UTC (permalink / raw) To: Singhal, Saurabh, Thomas Monjalon, Wu, Jingjing, Xing, Beilei Cc: dev, Singhal, Saurabh > -----Original Message----- > From: Saurabh Singhal <saurabhs@arista.com> > Sent: Thursday, September 7, 2023 11:15 AM > To: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing > <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com> > Cc: dev@dpdk.org; Singhal, Saurabh <saurabhs@arista.com> > Subject: [PATCH v4] net/iavf: unregister intr handler before FD close > > Unregister VFIO interrupt handler before the interrupt fd gets closed in case > iavf_dev_init() returns an error. > > dpdk creates a standalone thread named eal-intr-thread for processing > interrupts for the PCI devices. The interrupt handler callbacks are registered > by the VF driver(iavf, in this case). > > When we do a PCI probe of the network interfaces, we register an interrupt > handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These > interrupt sources are registered in a global linked list that the eal-intr-thread > keeps iterating over for handling the interrupts. In our internal testing, we see > eal-intr-thread crash in these two ways: > > Error adding fd 660 epoll_ctl, Operation not permitted > > or > > Error adding fd 660 epoll_ctl, Bad file descriptor > > epoll_ctl() returns EPERM if the target fd does not support poll. > It returns EBADF when the epoll fd itself is closed or the target fd is closed. > > When the first type of crash happens, we see that the fd 660 is > anon_inode:[vfio-device] which does not support poll. > > When the second type of crash happens, we could see from the fd map of > the crashing process that the fd 660 was already closed. > > This means the said fd has been closed and in certain cases may have been > reassigned to a different device by the operating system but the eal-intr- > thread does not know about it. > > We observed that these crashes were always accompanied by an error in > iavf_dev_init() after rte_intr_callback_register() and > iavf_enable_irq0() have already happened. In the error path, the > intr_handle_fd was being closed but the interrupt handler wasn't being > unregistered. > > The fix is to unregister the interrupt handle in the > iavf_dev_init() error path. > > Ensure proper cleanup if iavf_security_init() or > iavf_security_ctx_create() fail. Earlier, we were leaking memory by simply > returning from iavf_dev_init(). Fixes: 22b123a36d07 ("net/avf: initialize PMD") Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto") Cc: stable@dpdk.org > > Signed-off-by: Saurabh Singhal <saurabhs@arista.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Applied to dpdk-next-net-intel. Thanks Qi ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-09-11 0:55 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-09-04 13:01 [PATCH] net/iavf: unregister intr handler before FD close Saurabh Singhal 2023-09-06 5:24 ` Zhang, Qi Z 2023-09-06 11:47 ` [PATCH v2] " Saurabh Singhal 2023-09-06 14:05 ` Stephen Hemminger 2023-09-07 1:41 ` Zhang, Qi Z 2023-09-07 3:15 ` [PATCH v4] " Saurabh Singhal 2023-09-11 0:55 ` Zhang, Qi Z
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).