DPDK patches and discussions
From: Renata Saiakhova <renata.saiakhova@ekinops.com>
To: Harman Kalra <hkalra@marvell.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>,
	Bruce Richardson <bruce.richardson@intel.com>,
	Ray Kinsella <mdr@ashroe.eu>, Neil Horman <nhorman@tuxdriver.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [EXT] [PATCH v2 1/1] librte_eal: rte_intr_callback_unregister_sync() - wrapper around rte_intr_callback_unregister().
Date: Mon, 30 Nov 2020 17:20:48 +0000
Message-ID: <MRXP264MB01206424EC6EF913742235C992F50@MRXP264MB0120.FRAP264.PROD.OUTLOOK.COM>
In-Reply-To: <20201028203632.GA137989@outlook.office365.com>

Hi Harman,

Sorry for the late reply.

Yes, indeed, this is a race between an application calling rte_dev_remove() and the kernel event that is sent as a result of unbinding the device from the vfio_pci driver.
(dpdk-devbind.py -u 0000:05:00.0)

rte_intr_callback_unregister() may fail and return -EAGAIN if the interrupt source (kernel) has active callbacks at that moment. Since the call is not retried, the callback (req notifier) is never unregistered,
and vfio_req_intr_handle.fd is never closed.
The kernel then keeps trying to notify user space through the req notifier, but as the device has already been removed, EAL cannot even find a bus for it. The log below illustrates this:
EAL: fail to unregister req notifier handler.
EAL: fail to disable req notifier.
dpdk_disconnect 1545: Device '0000:05:00.0' has been removed and detached
dpdk_disconnect 1557: All devices shared with device '0000:05:00.0' have been detached
EAL: Cannot find bus for device (05:00.0)
EAL: Cannot find bus for device (05:00.0)
EAL: Cannot find bus for device (05:00.0)
EAL: Cannot find bus for device (05:00.0)
EAL: Cannot find bus for device (05:00.0)
etc.

This continues indefinitely, and the application stops working properly.
So, at the very least, retry logic should be added somewhere to avoid this kind of race. Alternatively, bus->hot_unplug_handler(dev), called from pci_vfio_req_handler(), should do some work to release the above resources.

Regarding the tight polling loop in the patch and a fixed retry limit to avoid infinite looping: what would be a suitable option? The loop continues only in the -EAGAIN case, which means a kernel event is being processed; doesn't that guarantee that it won't last forever?
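
As a starting point for discussion, a bounded variant of the wrapper might look roughly like the sketch below. It is not part of the patch. To keep it self-contained, mock_unregister() is a hypothetical stand-in for rte_intr_callback_unregister(), and unregister_bounded() and max_retries are illustrative names rather than existing EAL API.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for rte_intr_callback_unregister(): reports
 * -EAGAIN while the callback is still "active", then returns 1
 * (one callback removed). Not a real EAL function.
 */
static int busy_polls_left = 3;

static int
mock_unregister(void)
{
	if (busy_polls_left > 0) {
		busy_polls_left--;
		return -EAGAIN;
	}
	return 1;
}

/*
 * Retry only while the call reports -EAGAIN, and give up after
 * max_retries attempts so that an unidentified corner case cannot
 * spin forever. Any other return value is passed through unchanged.
 */
static int
unregister_bounded(int (*unregister)(void), unsigned int max_retries)
{
	unsigned int i;
	int ret;

	for (i = 0; i < max_retries; i++) {
		ret = unregister();
		if (ret != -EAGAIN)
			return ret;
		/* In EAL code, rte_pause() would go here, as in the patch. */
	}

	return -EAGAIN;
}
```

With the mock starting at three busy polls, unregister_bounded(mock_unregister, 10) returns 1, whereas a limit smaller than the number of busy polls returns -EAGAIN and leaves the decision to the caller. The open question is what limit (or timeout) would be acceptable.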

Kind regards,
Renata

________________________________
From: Harman Kalra <hkalra@marvell.com>
Sent: Wednesday, October 28, 2020 9:36 PM
To: Renata Saiakhova <renata.saiakhova@ekinops.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>; Bruce Richardson <bruce.richardson@intel.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>; dev@dpdk.org <dev@dpdk.org>
Subject: Re: [EXT] [PATCH v2 1/1] librte_eal: rte_intr_callback_unregister_sync() - wrapper around rte_intr_callback_unregister().

On Mon, Aug 17, 2020 at 04:08:27PM +0200, Renata Saiakhova wrote:
> Avoid race with unregister interrupt handler if interrupt
> source has some active callbacks at the moment, use wrapper
> around rte_intr_callback_unregister() to check for -EAGAIN
> return value and to loop until rte_intr_callback_unregister()
> succeeds.
>

Hi Renata,

   Just trying to understand the scenario: as you mentioned "while
   removing the device by rte_dev_remove()", are you calling
   rte_eal_hotplug_remove, or has the kernel sent an event to remove
   the device? As far as I know, the vfio notifier mechanism is used by
   the kernel vfio driver to notify the user to release the resources,
   and since you are observing EAGAIN, the same callback must be
   executing.
   Regarding the tight polling loop in the patch, I think it's good to
   have fixed retry logic to avoid any unidentified corner case which
   might lead to infinite looping.

Thanks
Harman


> Signed-off-by: Renata Saiakhova <Renata.Saiakhova@ekinops.com>
> ---
>  drivers/bus/pci/linux/pci_vfio.c        |  2 +-
>  lib/librte_eal/freebsd/eal_interrupts.c | 12 ++++++++++++
>  lib/librte_eal/include/rte_interrupts.h | 25 +++++++++++++++++++++++++
>  lib/librte_eal/linux/eal_interrupts.c   | 12 ++++++++++++
>  lib/librte_eal/rte_eal_version.map      |  1 +
>  5 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index 07e072e13..a4bfdf553 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -415,7 +415,7 @@ pci_vfio_disable_notifier(struct rte_pci_device *dev)
>                return -1;
>        }
>
> -     ret = rte_intr_callback_unregister(&dev->vfio_req_intr_handle,
> +     ret = rte_intr_callback_unregister_sync(&dev->vfio_req_intr_handle,
>                                           pci_vfio_req_handler,
>                                           (void *)&dev->device);
>        if (ret < 0) {
> diff --git a/lib/librte_eal/freebsd/eal_interrupts.c b/lib/librte_eal/freebsd/eal_interrupts.c
> index 6d53d33c8..7d99bdaff 100644
> --- a/lib/librte_eal/freebsd/eal_interrupts.c
> +++ b/lib/librte_eal/freebsd/eal_interrupts.c
> @@ -345,6 +345,18 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
>        return ret;
>  }
>
> +int
> +rte_intr_callback_unregister_sync(const struct rte_intr_handle *intr_handle,
> +             rte_intr_callback_fn cb_fn, void *cb_arg)
> +{
> +     int ret = 0;
> +
> +     while ((ret = rte_intr_callback_unregister(intr_handle, cb_fn, cb_arg)) == -EAGAIN)
> +             rte_pause();
> +
> +     return ret;
> +}
> +
>  int
>  rte_intr_enable(const struct rte_intr_handle *intr_handle)
>  {
> diff --git a/lib/librte_eal/include/rte_interrupts.h b/lib/librte_eal/include/rte_interrupts.h
> index e3b406abc..cc3bf45d8 100644
> --- a/lib/librte_eal/include/rte_interrupts.h
> +++ b/lib/librte_eal/include/rte_interrupts.h
> @@ -94,6 +94,31 @@ rte_intr_callback_unregister_pending(const struct rte_intr_handle *intr_handle,
>                                rte_intr_callback_fn cb_fn, void *cb_arg,
>                                rte_intr_unregister_callback_fn ucb_fn);
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Loop until rte_intr_callback_unregister() succeeds.
> + * After a call to this function,
> + * the callback provided by the specified interrupt handle is unregistered.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + * @param cb
> + *  callback address.
> + * @param cb_arg
> + *  address of parameter for callback, (void *)-1 means to remove all
> + *  registered which has the same callback address.
> + *
> + * @return
> + *  - On success, return the number of callback entities removed.
> + *  - On failure, a negative value.
> + */
> +__rte_experimental
> +int
> +rte_intr_callback_unregister_sync(const struct rte_intr_handle *intr_handle,
> +                             rte_intr_callback_fn cb, void *cb_arg);
> +
>  /**
>   * It enables the interrupt for the specified handle.
>   *
> diff --git a/lib/librte_eal/linux/eal_interrupts.c b/lib/librte_eal/linux/eal_interrupts.c
> index 13db5c4e8..c99d5dbd4 100644
> --- a/lib/librte_eal/linux/eal_interrupts.c
> +++ b/lib/librte_eal/linux/eal_interrupts.c
> @@ -662,6 +662,18 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
>        return ret;
>  }
>
> +int
> +rte_intr_callback_unregister_sync(const struct rte_intr_handle *intr_handle,
> +                     rte_intr_callback_fn cb_fn, void *cb_arg)
> +{
> +     int ret = 0;
> +
> +     while ((ret = rte_intr_callback_unregister(intr_handle, cb_fn, cb_arg)) == -EAGAIN)
> +             rte_pause();
> +
> +     return ret;
> +}
> +
>  int
>  rte_intr_enable(const struct rte_intr_handle *intr_handle)
>  {
> diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
> index bf0c17c23..b1d824f59 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -325,6 +325,7 @@ EXPERIMENTAL {
>        rte_fbarray_find_rev_biggest_free;
>        rte_fbarray_find_rev_biggest_used;
>        rte_intr_callback_unregister_pending;
> +     rte_intr_callback_unregister_sync;
>        rte_realloc_socket;
>
>        # added in 19.08
> --
> 2.17.2
>

Thread overview: 7+ messages
2020-08-17 14:08 [dpdk-dev] [PATCH v2 0/1] pci_vfio_disable_notifier(): avoid race with unregister Renata Saiakhova
2020-08-17 14:08 ` [dpdk-dev] [PATCH v2 1/1] librte_eal: rte_intr_callback_unregister_sync() - wrapper around rte_intr_callback_unregister() Renata Saiakhova
2020-10-08  7:47   ` David Marchand
2020-10-20 13:40     ` David Marchand
2020-10-28 20:36   ` [dpdk-dev] [EXT] " Harman Kalra
2020-11-30 17:20     ` Renata Saiakhova [this message]
2021-02-11 10:48   ` [dpdk-dev] " Burakov, Anatoly
