DPDK patches and discussions
From: Gaëtan Rivet <grive@u256.net>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: Matan Azrad <matan@nvidia.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	dev@dpdk.org, Raslan Darawsheh <rasland@nvidia.com>,
	Long Li <longli@microsoft.com>
Subject: Re: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of associated pci device
Date: Tue, 20 Oct 2020 11:13:52 +0200
Message-ID: <20201020091352.kcmxkecj3vt77x7z@u256.net>
In-Reply-To: <7708623.dZStS9cY3P@thomas>

Hi Thomas,

This issue has already been fixed, see:
http://mails.dpdk.org/archives/dev/2020-October/185921.html

It has been integrated; Long was able to test it and confirmed that it fixes
this issue.

On 20/10/20 00:36 +0200, Thomas Monjalon wrote:
> Fixing Gaetan's address
> 
> 20/10/2020 00:33, Thomas Monjalon:
> > Gaetan, Matan,
> > Please could you check?
> > 
> > 
> > 25/09/2020 22:30, Long Li:
> > > Hi Matan,
> > > 
> > > While troubleshooting a DPDK failure on device removal, where the VF device briefly disappears and comes back, I noticed that the failsafe driver repeatedly tries to start a sub device (after this sub device has been successfully configured, but later hot-removed from the kernel). This is due to repeated alarms calling fs_dev_start(). I traced it to this commit:
> > > 
> > > ae80146 net/failsafe: fix removed device handling
> > > 
> > > The implementation of fs_err() is interesting:
> > > 
> > > +fs_err(struct sub_device *sdev, int err)
> > > +{
> > > +       /* A device removal shouldn't be reported as an error. */
> > > +       if (sdev->remove == 1 || err == -EIO)
> > > +               return rte_errno = 0;
> > > +       return err;
> > > +}
> > > 
> > > If I change this function to:
> > > @@ -497,7 +497,7 @@ int failsafe_eth_new_event_callback(uint16_t port_id
> > >  fs_err(struct sub_device *sdev, int err)
> > >  {
> > >         /* A device removal shouldn't be reported as an error. */
> > > -       if (sdev->remove == 1 || err == -EIO)
> > > +       if (sdev->remove == 1 && err == -EIO)
> > >                 return rte_errno = 0;
> > >         return err;
> > >  }
> > > 
> > > The hang goes away. I don't know why we use || in the if(). If a call to rte_eth_dev_start() returns EIO (as is the case in fs_dev_start), the best choice would be to bail out and fail this sub device.
> > > 
> > > Can you please take a look?
> > > 
> > > Thanks,
> > > Long
> > > 
> > > ________________________________
> > > From: Matan Azrad <matan@nvidia.com>
> > > Sent: Tuesday, September 15, 2020 12:00 AM
> > > To: Long Li <longli@microsoft.com>; Stephen Hemminger <stephen@networkplumber.org>
> > > Cc: matan@mellanox.com <matan@mellanox.com>; grive@u246.net <grive@u246.net>; dev@dpdk.org <dev@dpdk.org>; Raslan Darawsheh <rasland@nvidia.com>
> > > Subject: RE: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of associated pci device
> > > 
> > > Hi Li
> > > 
> > > From: Long Li <longli@microsoft.com>
> > > > >Subject: Re: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of
> > > > >associated pci device
> > > > >
> > > > >Hi Stephen
> > > > >
> > > > >From: Stephen Hemminger:
> > > > >> On Sun, 6 Sep 2020 12:38:18 +0000
> > > > >> Matan Azrad <matan@nvidia.com> wrote:
> > > > >>
> > > > >> > Hi Stephen
> > > > >> >
> > > > >> > From: Stephen Hemminger:
> > > > >> > > The vdev_netvsc was not detecting when the associated PCI device
> > > > > >> > > (SRIOV) was removed. Because of that it would keep feeding the same
> > > > > >> > > (removed) device to failsafe PMD which would then unsuccessfully
> > > > > >> > > try and probe for it.
> > > > >> > >
> > > > >> > > Change to use a mark/sweep method to detect that PCI device was
> > > > >> > > removed, and also only tell failsafe about new PCI devices.
> > > > >> > > Vdev_netvsc does not have to keep stuffing the pipe with the same
> > > > >> > > already existing PCI device.
> > > > >> >
> > > > > >> > As far as I know, the vdev_netvsc driver doesn't call failsafe if the
> > > > > >> > PCI device is not detected by the readlink command (considered as
> > > > > >> > removed)...
> > > > > >> > Am I missing something?
> > > > >>
> > > > >> The original code is broken because ctx_yield is not cleared, it
> > > > >> keeps sending the same value.
> > > > >
> > > > > >Looking at the code again, it looks like ctx->yield has no effect on
> > > > > >the next pipe write; it is just used for logging.
> > > > > >
> > > > > >After the PCI interface is matched to the netvsc interface, the pipe
> > > > > >write is triggered only if the readlink commands succeed in seeing the
> > > > > >plugged-in PCI device:
> > > > > >readlink /sys/class/net/[iface]/device/subsystem shows "pci"
> > > > > >readlink /sys/class/net/[iface]/device shows the PCI device ID.
> > > > > >
> > > > > >So the assumption is that when the above readlink fails on the interface,
> > > > > >the device has been removed (plugged out) and the fd write will not happen.
> > > > > >
> > > > > >The code keeps retrying the probe until it succeeds, but only for a
> > > > > >plugged-in PCI device that matches the netvsc device.
> > > >
> > > > Hi Matan,
> > > >
> > > > > The original code keeps writing to the pipe even if it's the same PCI device.
> > > 
> > > Yes, the vdev_netvsc writes any plugged-in device to the associated netvsc device fd.
> > > 
> > > > The
> > > > new code writes to pipe for a new device, only once. See the following code:
> > > >
> > > > +     /* Skip if this is same device already sent to failsafe */
> > > > +     if (strcmp(addr, ctx->yield) == 0)
> > > > +             return 0;
> > > >
> > > 
> > > > I understand that you want to optimize the pipe writes so that they happen only after a plug-in hot event.
> > > > 
> > > > The current solution suffers from a race: the PCI device may be plugged out and plugged back in within a time shorter than the driver alarm delay, in which case the plug-in of the PCI device will go undetected.
> > > > 
> > > > My suggestion:
> > > > Add a check that the plugged-in device is already probed and owned by failsafe (using the ownership API); if so, don't write to the pipe.
> > > 
> > > Matan
> > > 
> > > 
> > > 
> > > > > This patch also saves a lot of CPU since it no longer writes to the pipe all the time.
> > > > > You are correct that the code will continue to probe a new PCI device.
> > > > But someone has to do it to handle hot-add.
> > > >
> > > > Thanks,
> > > > Long
> > > >
> > > >
> > > > >
> > > > >> It looks like device removal and add was never tested.
> > > > >
> > > > > >This is a basic test we run for plug-in and plug-out, and it has passed
> > > > > >every day over the last years.
> > > > >
> > > > >Maybe something new and special in your setup?
> > > > >
> > > > >> If you test removal you will see that vdev_netvsc:
> > > > >>  1. Sends same PCI device repeatedly to failsafe (every alarm call)
> > > > >>     This is harmless, but useless.
> > > > >>  2. When device is removed, keeps doing #1
> > > 
> > 
> > 
> > 
> 
> 
> 
> 
> 

-- 
Gaëtan
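
Annotation on the fs_err() question quoted above: the difference between the `||` condition from commit ae80146 and the `&&` variant Long tried can be seen by evaluating both in isolation. The following is only a sketch under stated assumptions, not the failsafe driver's code: `struct sub_device` is reduced to a hypothetical one-field stand-in, the names `fs_err_or` and `fs_err_and` are invented for the comparison, and the `rte_errno = 0` assignment from the quoted code is left out so that no DPDK headers are needed.

```c
#include <errno.h>

/* Hypothetical stand-in: only the field used by the quoted condition. */
struct sub_device {
	int remove; /* 1 once the sub device has been flagged as removed */
};

/* Condition as quoted from commit ae80146: either test hides the error. */
static int fs_err_or(const struct sub_device *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO)
		return 0;
	return err;
}

/* Condition as changed in Long's experiment: both tests must hold. */
static int fs_err_and(const struct sub_device *sdev, int err)
{
	if (sdev->remove == 1 && err == -EIO)
		return 0;
	return err;
}
```

With `remove == 0` and `err == -EIO`, the `||` form returns 0 while the `&&` form returns `-EIO`. That is the combination Long describes: a start call failing with EIO on a sub device that is not yet flagged as removed. Whether masking that error is intended is the question the quoted message raises; the sketch does not answer it.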

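Annotation on the pipe-write discussion quoted above: the "skip if this is the same device" check and the race Matan points out can be modelled with a small polling function. This is a sketch under assumptions, not the vdev_netvsc code: only the `strcmp(addr, ctx->yield)` test comes from the thread, while `struct netvsc_ctx_sketch`, `should_send()`, the buffer size, and the idea of clearing `yield` when a poll sees no device are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define ADDR_LEN 64 /* assumed size, large enough for a PCI address string */

/* Hypothetical context: yield holds the last address sent to failsafe. */
struct netvsc_ctx_sketch {
	char yield[ADDR_LEN];
};

/*
 * Called once per alarm tick with the PCI address seen by that poll,
 * or NULL if no PCI device was visible.
 * Returns 1 if the address should be written to the pipe, 0 otherwise.
 */
static int should_send(struct netvsc_ctx_sketch *ctx, const char *addr)
{
	if (addr == NULL) {
		/* Assumed behavior: forget the device once a poll misses it. */
		ctx->yield[0] = '\0';
		return 0;
	}
	/* Skip if this is the same device already sent to failsafe. */
	if (strcmp(addr, ctx->yield) == 0)
		return 0;
	snprintf(ctx->yield, sizeof(ctx->yield), "%s", addr);
	return 1;
}
```

If a removal and re-plug both happen between two alarm ticks, no poll ever passes NULL, `yield` still matches, and the second `should_send()` returns 0. In this model that is the lost plug-in detection described in the quoted reply.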
Thread overview: 11+ messages
2020-08-19 17:53 Stephen Hemminger
2020-09-06  8:11 ` Long Li
2020-09-06 12:38 ` Matan Azrad
2020-09-06 18:33   ` Stephen Hemminger
2020-09-07  8:09     ` Matan Azrad
2020-09-15  4:53       ` Long Li
2020-09-15  7:00         ` Matan Azrad
2020-09-25 20:30           ` Long Li
2020-10-19 22:33             ` Thomas Monjalon
2020-10-19 22:36               ` Thomas Monjalon
2020-10-20  9:13                 ` Gaëtan Rivet [this message]
