From: Matan Azrad <matan@nvidia.com>
To: NBU-Contact-longli <longli@microsoft.com>,
Long Li <longli@linuxonhyperv.com>,
Matan Azrad <matan@mellanox.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-stable] [PATCH] net/vdev_netvsc: Prevent alarm lost on failed device probe
Date: Tue, 20 Oct 2020 05:50:52 +0000 [thread overview]
Message-ID: <MW2PR12MB249216F2E81CCAAC7FC1B73DDF1F0@MW2PR12MB2492.namprd12.prod.outlook.com> (raw)
In-Reply-To: <BN8PR21MB11555C3B59269DE97085224DCE1E0@BN8PR21MB1155.namprd21.prod.outlook.com>
From: Long Li
> > Subject: RE: [PATCH] net/vdev_netvsc: Prevent alarm lost on failed
> > device probe
> >
> >
> >
> > From: Long Li
> > > If a device probe fails, the alarm is canceled and will no longer
> > > work for previously probed devices.
> > >
> > > Fix this by introducing a flag to track if alarm has been set.
> > > Because it's possible that an alarm is triggered while probing is in
> > > progress that may modify vdev_netvsc_ctx_list, introduce a lock to
> protect it.
> > >
> > > Cc: stable@dpdk.org
> > > Signed-off-by: Long Li <longli@microsoft.com>
> > > ---
> > > drivers/net/vdev_netvsc/vdev_netvsc.c | 41
> > > +++++++++++++++++++++++++----------
> > > 1 file changed, 30 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c
> > > b/drivers/net/vdev_netvsc/vdev_netvsc.c
> > > index be8f19c..bd7e308 100644
> > > --- a/drivers/net/vdev_netvsc/vdev_netvsc.c
> > > +++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
> > > @@ -76,6 +76,11 @@ struct vdev_netvsc_ctx {
> > > /** Context list is common to all driver instances. */ static
> > > LIST_HEAD(,
> > > vdev_netvsc_ctx) vdev_netvsc_ctx_list =
> > > LIST_HEAD_INITIALIZER(vdev_netvsc_ctx_list);
> > > +/* Lock to protect concurrent accesses to vdev_netvsc_ctx_list */
> > > +static rte_rwlock_t vdev_netvsc_ctx_list_lock;
> > > +
> > > +/* Flag to track if alarm has been set */ static int
> > > +vdev_netvsc_alarm_set;
> > >
> > > /** Number of entries in context list. */ static unsigned int
> > > vdev_netvsc_ctx_count; @@ -454,19 +459,26 @@ static LIST_HEAD(,
> > > vdev_netvsc_ctx) vdev_netvsc_ctx_list =
> > > struct vdev_netvsc_ctx *ctx;
> > > int ret;
> > >
> > > + rte_rwlock_write_lock(&vdev_netvsc_ctx_list_lock);
> > > LIST_FOREACH(ctx, &vdev_netvsc_ctx_list, entry) {
> > > ret = vdev_netvsc_foreach_iface(vdev_netvsc_device_probe, 0,
> > > ctx);
> > > if (ret < 0)
> > > break;
> > > }
> > > - if (!vdev_netvsc_ctx_count)
> > > + rte_rwlock_write_unlock(&vdev_netvsc_ctx_list_lock);
> > > +
> > > + if (!vdev_netvsc_ctx_count) {
> > > + vdev_netvsc_alarm_set = 0;
> > > return;
> > > + }
> > > +
> > > ret = rte_eal_alarm_set(VDEV_NETVSC_PROBE_MS * 1000,
> > > vdev_netvsc_alarm, NULL);
> > > if (ret < 0) {
> > > DRV_LOG(ERR, "unable to reschedule alarm callback: %s",
> > > rte_strerror(-ret));
> > > + vdev_netvsc_alarm_set = 0;
> > > }
> > > }
> > >
> > > @@ -698,34 +710,41 @@ static LIST_HEAD(, vdev_netvsc_ctx)
> > > vdev_netvsc_ctx_list =
> > > " device.");
> > > goto error;
> > > }
> > > - rte_eal_alarm_cancel(vdev_netvsc_alarm, NULL);
> > > +
> > > + rte_rwlock_write_lock(&vdev_netvsc_ctx_list_lock);
> > > /* Gather interfaces. */
> > > ret = vdev_netvsc_foreach_iface(vdev_netvsc_netvsc_probe, 1,
> > name,
> > > kvargs, specified, &matched);
> > > if (ret < 0)
> > > - goto error;
> > > + goto error_unlock;
> > > if (specified && matched < specified) {
> > > if (!force) {
> > > DRV_LOG(ERR, "Cannot find the specified netvsc device");
> > > - goto error;
> > > + goto error_unlock;
> > > }
> > > /* Try to force probing on non-netvsc specified device. */
> > > if
> > > (vdev_netvsc_foreach_iface(vdev_netvsc_netvsc_probe, 0,
> > name,
> > > kvargs, specified, &matched) < 0)
> > > - goto error;
> > > + goto error_unlock;
> > > if (matched < specified) {
> > > DRV_LOG(ERR, "Cannot find the specified device");
> > > - goto error;
> > > + goto error_unlock;
> > > }
> > > DRV_LOG(WARNING, "non-netvsc device was probed as
> netvsc");
> > > }
> > > - ret = rte_eal_alarm_set(VDEV_NETVSC_PROBE_MS * 1000,
> > > + if (!vdev_netvsc_alarm_set) {
> > > + ret = rte_eal_alarm_set(VDEV_NETVSC_PROBE_MS * 1000,
> > > vdev_netvsc_alarm, NULL);
> > > - if (ret < 0) {
> > > - DRV_LOG(ERR, "unable to schedule alarm callback: %s",
> > > - rte_strerror(-ret));
> > > - goto error;
> > > + if (ret < 0)
> > > + DRV_LOG(ERR, "unable to schedule alarm callback: %s",
> > > + rte_strerror(-ret));
> > > + else
> > > + vdev_netvsc_alarm_set = 1;
> > > }
> > > +
> > > +error_unlock:
> > > + rte_rwlock_write_unlock(&vdev_netvsc_ctx_list_lock);
> > > +
> > > error:
> > > if (kvargs)
> > > rte_kvargs_free(kvargs);
> > > --
> >
> > Hi
> >
> > Nice direction.
> >
> > As I understand from your patch it looks like you want to add support
> > for the vdev_netvsc driver itself to be hot-plug aware - so it can be
> > probed\removed in run-time (even after EAL initialization).
> > So I think the title and commit log should be converted to this:
> > net/vdev_netvsc: support driver instance hot-plug ...
>
> Hi Matan,
>
> Actually I was not trying to add support for hot-plug (although it made half
> way there).
>
> The problem I'm trying to solve is on a system with multiple vdev_netvsc
> interfaces. For example, if the system has eth1, eth2 .. eth10. What if the
> probing on eth1 to eth9 succeed, but on eth10 fails? Current behavior is that
> we can still have DPDK run on eth1 to eth9, but without any alarms. Because
> the alarm is canceled while trying to probing eth10. Then once eth1 to eth9
> lost VF, they won't get VF back because we don't have any alarms.
>
> Long
Ok, got you.
So, why to manage critical sections in the code, better, to stop alarm before adding a new device, and apply it back if the list of devices is not empty.
It will make it simpler, no?
> >
> > More things:
> > Look on convention of the file - blank line only after variables
> > declarations inside the blocks.
> >
> > Need to adjust also vdev_netvsc_vdev_remove to the feature.
> >
> > Matan
next prev parent reply other threads:[~2020-10-20 5:51 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-15 19:33 Long Li
2020-10-18 7:31 ` Matan Azrad
2020-10-19 20:38 ` Long Li
2020-10-20 5:50 ` Matan Azrad [this message]
2020-10-20 21:39 ` Long Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=MW2PR12MB249216F2E81CCAAC7FC1B73DDF1F0@MW2PR12MB2492.namprd12.prod.outlook.com \
--to=matan@nvidia.com \
--cc=dev@dpdk.org \
--cc=longli@linuxonhyperv.com \
--cc=longli@microsoft.com \
--cc=matan@mellanox.com \
--cc=stable@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).