patches for DPDK stable branches
 help / color / Atom feed
From: Gaëtan Rivet <grive@u256.net>
To: Long Li <longli@microsoft.com>
Cc: Long Li <longli@linuxonhyperv.com>, "dev@dpdk.org" <dev@dpdk.org>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] net/failsafe: check correct error code while handling sub-device add
Date: Mon, 12 Oct 2020 16:22:17 +0200
Message-ID: <20201012142217.4s3l6wrt2qyiezcu@u256.net> (raw)
In-Reply-To: <BN8PR21MB11556CB2041982D497B8A8CFCE080@BN8PR21MB1155.namprd21.prod.outlook.com>

On 09/10/20 20:30 +0000, Long Li wrote:
> >Subject: Re: [dpdk-dev] [PATCH] net/failsafe: check correct error code while
> >handling sub-device add
> >
> >On 05/10/20 11:42 +0200, Gaëtan Rivet wrote:
> >> Hi,
> >>
> >> On 02/10/20 17:01 -0700, Long Li wrote:
> >> > From: Long Li <longli@microsoft.com>
> >> >
> >> > When adding a sub-device, it's possible that the sub-device is
> >> > configured successfully but later fails to start. This error should not be
> >masked.
> >>
> >> Some of those errors are meant to be masked: -EIO, when the device is
> >> marked as removed at the ethdev level (see eth_err() in rte_ethdev.c:819).
> >>
> >> > The driver needs to check the error status to prevent endless loop
> >> > of trying to start the sub-device.
> >>
> >> If the ethdev layer error is due to the device being removed, and
> >> failsafe loops on trying to sync the eth device to its own state, then
> >> an RMV event should have been emitted but wasn't or it was missed by
> >> failsafe.
> >>
> >> If the ethdev layer error is *not* due to the device being removed,
> >> the error should be != -EIO, and sdev->remove should not be set, so
> >> fs_err() should not mask it and it should be seen by the app.
> >>
> >> Can you provide the following details:
> >>
> >>  * What is the return code of rte_eth_dev_start() that is masked in your
> >>    start loop?
> >>
> >>  * Is the device marked as removed in failsafe?
> >>
> >>  * Is the device marked as removed in ethdev?
> >>
> >>  * Was there an RMV event generated for the device? Whether yes or no,
> >>    is it correct?
> >>
> >> Thanks,
> >>
> >
> >Hello Li,
> >
> >I've found the previous mail thread [1] where you described how you got this
> >error. In your description, you say that you try unplug then quick replug,
> >before any event is processed?
> 
> Hi Gaëtan,
> 
> Sorry for getting back late. I had trouble with my email.
> 
> I think the issue is that: when the failsafe driver tries to start the sub device, it may fail. The failsafe driver needs to deal with the failure. Here is another log that I captured:
> 
> net_failsafe: Starting sub_device 0
> net_mlx4: error while attaching Rx queue 0x11da428c0: CQ creation failure: Cannot allocate memory
> net_mlx4: cannot initialize common RSS resources (queue 0): unable to create Rx queue resources: Cannot allocate memory
> net_mlx4: 0x1bf2280: cannot initialize RSS resources: Cannot allocate memory
> net_failsafe: Starting sub_device 0
> net_mlx4: error while attaching Rx queue 0x11da428c0: CQ creation failure: Cannot allocate memory
> net_mlx4: cannot initialize common RSS resources (queue 0): unable to create Rx queue resources: Cannot allocate memory
> net_mlx4: 0x1bf2280: cannot initialize RSS resources: Cannot allocate memory
> 
> It's called from fs_dev_start(). The MLX4 can't start, it's in a state being removed in ethdev. But this error is being masked by fs_err():
> 
> static inline int
> fs_err(struct sub_device *sdev, int err)
> {
>         /* A device removal shouldn't be reported as an error. */
>         if (sdev->remove == 1 || err == -EIO)
>                 return rte_errno = 0;
>         return err;
> } 
> 
> So fs_dev_start() always return a success, returned to failsafe_eth_dev_state_sync(), which is repeated called from the alarm fs_hotplug_alarm(). This goes to an endless loop.
> 
> I believe failsafe_eth_rmv_event_callback() has been called prior to this. It sets sdev->remove = 1, but it didn't help with exiting the loop. fs_err() always returns 0, and fs_dev_start() is called from fs_hotplug_alarm(), that keeps re-alarming itself.

Ok, the issue here is that even with sdev->remove == 1, the device is
not actually removed, but it should.

I think we discussed it with Stephen off-list not too long ago actually.
In his case the actual bug was in vdev_netvsc, but there was still this
issue in failsafe. It is happening there:

   void
   failsafe_dev_remove(struct rte_eth_dev *dev)
   {
           struct sub_device *sdev;
           uint8_t i;

           FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
                                         /* ^^^^^^^^^^^^ HERE is the bug */
                   if (sdev->remove && fs_rxtx_clean(sdev)) {
                           if (fs_lock(dev, 1) != 0)
                                   return;
                           fs_dev_stats_save(sdev);
                           fs_dev_remove(sdev);
                           fs_unlock(dev, 1);
                   }
   }

A device with the remove flag will be cleaned up only if it is already
in an active state. This is not correct, it should be removed in all
cases -- even if it stayed stuck in PROBED state.

Here is a fix I propose for this issue:
http://mails.dpdk.org/archives/dev/2020-October/185921.html

Could you please test it and verify it solves your issue?

> 
> Why do we check (sdev->remove == 1 || err == -EIO) in fs_err()? If sdev->remove is set, this will always return 0, regardless of err.
> 

if sdev->remove == 1, there should not even be a call to sub-device
start ops actually. The device is flagged for removal, next thing to
happen is to remove it, even if the device reappeared on the bus
in-between. The failsafe needs to consume the removed flag (close the device
then remove it), then it will be ready to detect the new device and
will probe it back from the beginning.

> It's hard to reproduce with gdb. Hopefully I can get a good trace in gdb.
> 
> Thanks,
> 
> Long
> 
> >
> >If that's the case, it seems a clear race condition, and an issue of missing the
> >removal event of the device. I would not say yet that the bug is in failsafe, but
> >it could be in ethdev.
> >
> >Can you please check whether the device removal event was properly
> >generated in rte_ethdev? Failsafe (and any other hotplug support layer
> >actually) will depend on it so it should be first checked to work.
> >
> >Thanks,
> >
> >[1]:
> >https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.
> >dpdk.org%2Farchives%2Fdev%2F2020-
> >September%2F182977.html&amp;data=02%7C01%7Clongli%40microsoft.com
> >%7C3e044201096c4f48508108d86c6f319d%7C72f988bf86f141af91ab2d7cd011d
> >b47%7C1%7C0%7C637378572568404734&amp;sdata=DAUVxBJQtJx8mUSWlPI
> >DtenhpMmdiMPpIczmxOdrCqE%3D&amp;reserved=0
> >
> >--
> >Gaëtan

Regards,
-- 
Gaëtan

  reply index

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-03  0:01 [dpdk-stable] " Long Li
2020-10-05  9:42 ` Gaëtan Rivet
2020-10-09 16:20   ` [dpdk-stable] [dpdk-dev] " Gaëtan Rivet
2020-10-09 20:30     ` Long Li
2020-10-12 14:22       ` Gaëtan Rivet [this message]
2020-10-13 17:14         ` Long Li
2020-10-15 10:11 ` [dpdk-stable] " Gaëtan Rivet

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201012142217.4s3l6wrt2qyiezcu@u256.net \
    --to=grive@u256.net \
    --cc=dev@dpdk.org \
    --cc=longli@linuxonhyperv.com \
    --cc=longli@microsoft.com \
    --cc=stable@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

patches for DPDK stable branches

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ http://inbox.dpdk.org/stable \
		stable@dpdk.org
	public-inbox-index stable


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.stable


AGPL code for this site: git clone https://public-inbox.org/ public-inbox