From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f49.google.com (mail-wm0-f49.google.com [74.125.82.49]) by dpdk.org (Postfix) with ESMTP id B54AC1B33D for ; Mon, 23 Oct 2017 10:36:26 +0200 (CEST) Received: by mail-wm0-f49.google.com with SMTP id u138so8206451wmu.4 for ; Mon, 23 Oct 2017 01:36:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=xKG3EvW/9q93HJGXskIK/WkfL6h05Srfg+cMj+Bt3/A=; b=nkOmL/NUP3hwlU1YTpCTRJcvmhRVWYkckmGViN/v/+lVzXl1B7lisM7R/+jdinwO7s Zhlezp4RnWBWWQ8LsENWOJi6uUWH64lyKAdcMYsDylxGXVM8+Y+tvdgtcCnPu+HwCL+h ExlYcumwGWz59P0hFIcnMzZHFw4Rgr5qfyhl5iirY/F9Tz3blmRDlVwm9Oppwn1e+uXw 5/GNGMV1J2dIgSVNE3yWRIEa99zO6QPcVF4lLjYCEhA/qBsOjG99HlpB4JWncVIRvfCV HPDRonWIZUw7nBpm3qGKwYmvBMJK5LaboEzDp1Av76lftXjm8Ue08EbXhozWdLTprnu/ y6xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=xKG3EvW/9q93HJGXskIK/WkfL6h05Srfg+cMj+Bt3/A=; b=q/OJwC3GwZDgax3XWo050ksHDnK+B85yY3lzKZDoFN/Q2IugSLGpMB+7f+ysfA35ls 6zxlVDaBh1NSmXpBj7DSW9Geo6jnL6xLwSdv9l1+7apmLwtPihV9Z4YTh28e8A226cuN 1MAvuHxgpjtc+iOegwKnH6py11kkVrRHm4YJ04/Gd9Gaxe4lq8xxNtr907egV/OXUKDf LyN3nQHmcOXeEMrm3YaT9QTniZ21+gN32nVr9HI6fiYZ0YgSNnl05VvF/IJq7eafM3TO nWozuE9VH3BsrINj4Mcv0ZDCU/ANCi1ZtosdfFSShDXgjjHHh55az++MzXStED0Oz2bN Vhmw== X-Gm-Message-State: AMCzsaXAP7QgJWGw0K7wgFU9c7Ib8bc/S6HUh9OkOtvNeNiXN3PBHAVf 9ARv0KQwaObe1d0ufzq9iau6DQ== X-Google-Smtp-Source: ABhQp+SgJ8rn8PS5QM+sQBPIlV7cMJySr0/azdnO9gClIrPhyH/3sbA9m2iLr8Hg0bvVvRNsjRl+WQ== X-Received: by 10.80.140.36 with SMTP id p33mr15112401edp.130.1508747785285; Mon, 23 Oct 2017 01:36:25 -0700 (PDT) Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id h51sm6102288eda.56.2017.10.23.01.36.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 23 Oct 2017 01:36:24 -0700 (PDT) Date: Mon, 23 Oct 2017 10:36:12 +0200 From: =?iso-8859-1?Q?Ga=EBtan?= Rivet To: Ophir Munk Cc: "dev@dpdk.org" , Thomas Monjalon , Olga Shern , "stable@dpdk.org" , Matan Azrad Message-ID: <20171023083612.GK3596@bidouze.vm.6wind.com> References: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com> <1507243328-11287-1-git-send-email-ophirmu@mellanox.com> <20171020103518.GH3596@bidouze.vm.6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Subject: Re: [dpdk-dev] [PATCH v3] net/failsafe: fix calling device during RMV events X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Oct 2017 08:36:27 -0000 On Mon, Oct 23, 2017 at 07:17:41AM +0000, Ophir Munk wrote: > Hi Gaetan, > Thanks for your quick reply. Please see comments inline. > > Regards, > Ophir > > > -----Original Message----- > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com] > > Sent: Friday, October 20, 2017 1:35 PM > > To: Ophir Munk > > Cc: dev@dpdk.org; Thomas Monjalon ; Olga Shern > > ; stable@dpdk.org > > Subject: Re: [PATCH v3] net/failsafe: fix calling device during RMV events > > > > Hi Ophir, > > > > Sorry about the delay, > > I have a few remarks, I think this patch could be simpler. > > > > First, about the commit logline: > > "calling device" is not descriptive enough. I'd suggest > > > > net/failsafe: fix device configuration during RMV events > > > > But I'm not a native speaker either, so use it if you think it is better, or don't, > > it's only a suggestion :). > > > > On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote: > > > This commit prevents control path operations from failing after a sub > > > device removal. > > > > > > Following are the failure steps: > > > 1. The physical device is removed due to change in one of PF > > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3. > > > Within 2 seconds Interrupt thread initializes the actual device > > > removal, then every 2 seconds it tries to re-sync (plug in) the > > > device. The trials fail as long as VF parameter mismatches the PF > > parameter. > > > 4. A control thread initiates a control operation on failsafe which > > > initiates this operation on the device. > > > 5. A race condition occurs between the control thread and interrupt > > > thread when accessing the device data structures. > > > > > > This commit prevents the race condition in step 5. Before this commit > > > if a device was removed and then a control thread operation was > > > initiated on failsafe - in some cases failsafe called the sub device > > > operation instead of avoiding it. Such cases could lead to operations > > failures. > > > > > > > This is a nitpick, but as said earlier, this is not preventing the race condition. > > This race is still present and can still wreak havok on unsuspecting users. > > > > If an application has a weak threading model, it will be subject to this race > > condition still. It is possible to prevent it fully with proper care from the > > application standpoint, but this is not specific to fail-safe and does not > > concern us here. > > > > Anyway, it's really a nitpick, I just wanted to point it out. This is not too > > important for this patch. > > > > > This commit fixes failsafe criteria to determine when the device is > > > removed such that it will avoid calling the sub device operations > > > during that time and will only call them otherwise. > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > Cc: stable@dpdk.org > > > > > > Signed-off-by: Ophir Munk > > > --- > > > v3: > > > 1. Rebase v2 > > > > > > 2. Please ignore checkpatch checks on arguments re-usage - they are > > confirmed. > > > CHECK:MACRO_ARG_REUSE: Macro argument reuse ... possible side- > > effects? > > > #217: FILE: drivers/net/failsafe/failsafe_private.h:241: > > > > > > 3. Add rationales (copy from an email which accompanied v2): > > > > > > On Monday, September 11, 2017 11:31 AM, Gaetan Rivet wrote: > > > > > > > > Hi Ophir, > > > > > > > > On Sat, Sep 09, 2017 at 07:27:11PM +0000, Ophir Munk wrote: > > > > > This commit prevents control path operations from failing after a > > > > > sub device has informed failsafe it has been removed. > > > > > > > > > > Before this commit if a device was removed and then a control path > > > > > > > > Here are the steps if I understood correctly: > > > > > > > > 0. The physical device is removed > > > > 1. The interrupt thread flags the device 2. A control lcore > > > > initiates a control operation 3. The alarm triggers, waking up the eal-intr- > > thread, > > > > initiating the actual device removal. > > > > 4. Race condition occurs between control lcore and interrupt thread. > > > > > > > > "if a device was removed" is ambiguous I think (are we speaking > > > > about the physical port? Is it only flagged? Is it after the removal of the > > device itself?). > > > > From the context I gather that you mean the device is flagged to be > > > > removed, but it won't be as clear in a few month when we revisit this bug > > :) . > > > > > > > > Could you please rephrase this so that the whole context of the > > > > issue is available? > > > > > > > > > > Done. Commit message was rephrased based on your comments > > > > > > > > operations was initiated on failsafe - in some cases failsafe > > > > > called the sub device operation instead of avoiding it. Such cases > > > > > could lead to operations failures. > > > > > > > > > > This commit fixes failsafe criteria to determine when the device > > > > > is removed such that it will avoid calling the sub device > > > > > operations during that time and will only call them otherwise. > > > > > > > > > > > > > This commit mitigates the race condition, reducing the probability > > > > for it to have an effect. It does not, however, remove this race > > > > condition, which is inherent to the DPDK architecture at the moment. > > > > > > > > A proper fix, a more detailed workaround and additional > > > > documentation warning users writing applications to mind their threads > > could be interesting. > > > > > > > > > > The race condition occurs in the last step and may lead to > > > segmentation faults (accessing data structures of the same device by 2 > > > threads) The previous steps ("the physical device is removed", etc) were not > > recreated and tested but probably cannot lead to segmentation fault. > > > > > > > But let's focus on this patch for the time being. > > > > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > > > Cc: stable@dpdk.org > > > > > > > > > > Signed-off-by: Ophir Munk > > > > > --- > > > > > drivers/net/failsafe/failsafe_ether.c | 1 + > > > > > drivers/net/failsafe/failsafe_ops.c | 52 > > > > +++++++++++++++++++++++++++++------ > > > > > 2 files changed, 45 insertions(+), 8 deletions(-) > > > > > > > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > > > > b/drivers/net/failsafe/failsafe_ether.c > > > > > index a3a8cce..1def110 100644 > > > > > --- a/drivers/net/failsafe/failsafe_ether.c > > > > > +++ b/drivers/net/failsafe/failsafe_ether.c > > > > > @@ -378,6 +378,7 @@ > > > > > > > > Could you please generate your patches with the function name in the > > diff? > > > > > > Done > > > > > > > > > > > > i); > > > > > goto err_remove; > > > > > } > > > > > + sdev->remove = 0; > > > > > > > > You are adding this here, within failsafe_eth_dev_state_sync, and > > > > removing it from the dev_configure ops. > > > > > > > > 10 lines above, the call to dev_configure is done, meaning that the > > > > remove flag was resetted at this point. > > > > > > > > Can you explain why you prefer resetting the flag here? > > > > > > > > The position of this flag reset will be dependent upon my subsequent > > > > remarks anyway, so hold that thought :) . > > > > > > > > > > The motivation for resetting the "remove" flag within > > failsafe_eth_dev_state_sync is as follows: > > > Previously to this patch the "remove" flag was designed to signal the need > > to remove the sub device. > > > Once the sub device was removed and before being reconfigured the > > "remove" flag was reset. > > > > > > After this patch the scope of the "remove" flag was *extended* to > > > indicate the sub device status as being "plugged out" by resetting this flag > > only after a successful call to failsafe_eth_dev_state_sync(). > > > The "plug out" status could last a very long time (seconds, minutes, days, > > weeks, ...). > > > > > > Previously to this patch failsafe based the "plugged out" status on > > > the sub device state as being below ACTIVE however every 2 seconds > > > dev_configure() was called where the sub device was assigned sdev- > > > >state = DEV_ACTIVE; therefore the sub device state became ACTIVE for > > some time every 2 seconds. > > > This is where the race condition occurred: failsafe considered the sub > > > device as "Plugged in" for some time every 2 seconds (based on its ACTIVE > > state) while it was actually plugged out. > > > > > > After this patch the "Plugged out" status is based on the "remove" flag. > > > > > > > Sorry, I do not agree with this semantical change on the "remove" flag. > > You are essentially adding a new device state, which could be fine per se, but > > should not be done here. > > > > The enum dev_state is there for this purpose. > > > > The flag dev->remove, calls for an operation to be done upon the concerned > > device. It is not meant to become a new device state. > > > > A point about the work methodoly here: if you wanted to change this > > semantic, which could be legitimate and sometimes called for, you should > > have proposed it either during a discussion in a response to my previous > > email, or introducing the change as a separate patch. This point is important > > enough for it to have its own patch, meaning we would have a whole thread > > dedicated to it instead of having to interleave commentaries between > > related-but-separate diffs on the code. > > > > But anyway, if you think you need to express a PLUGOUT state, I'd suggest > > adding a state between DEV_UNDEFINED and DEV_PARSED. > > DEV_UNDEFINED means that the device is in limbo and has no existence per > > se (its parsing failed for example, it is not clear whether the parameters are > > correct, etc...). DEV_PLUGOUT could mean then that the device has been > > successfully probed at least once, meaning that it could possibly have > > residuals from this probing still there, or specific care to be taken when > > manipulating it. > > > > However, I'm not yet convinced that this new state is necessary. I think you > > can mitigate this race condition without having to add it. If you insist in > > introducing this state, please do so in a separate patch, with proper > > definition about the meaning of this state: > > > > + When it should be valid for a device to be in this state. > > + Which operation corresponds to getting into and out of this state. > > + Why this state is interesting and what could not be expressed before > > that is thus being fixed by introducing this state. > > > > But please verify twice whether you absolutely need to complexify the > > current fail-safe internals before going all in and basing your work upon it :) > > > > Indeed what I am currently missing in failsafe is knowing the device is in a PLUGOUT state. > Your suggestion to add a new state DEV_PLUGOUT cannot be used with the current implementation > as the device states are modified during an alarm hotplug handling every 2 seconds. > In fs_hotplug_alarm() we call failsafe_eth_dev_state_sync(dev) which eventually calls ->dev_configure(dev) > where we assign: sdev->state = DEV_ACTIVE; > then when sync fails fs_hotplug_alarm() calls failsafe_dev_remove(dev) which will call fs_dev_remove(sdev); where the sub devices > states are changed from ACTIVE down to DEV_UNDEFINED. > Having said that it means that during a PLUGOUT event - the states are modified with each invocation of the fs_hotplug_alarm > every 2 seconds. So even if we added DEV_PLUGOUT state - it will not remain fixed during the hotplug alarm handling. > I have also verified all of this with printouts. > When seeing a sub device in state "DEV_ACTIVE" - we cannot tell whether the device is currently in "PLUGOUT situation" or "PLUGIN situation" > This allows operations such as fs_mtu_set() on sub-devices which are in "PLUGOUT situation" while their state is > DEV_ACTIVE to be manipulated, which I think should have been avoided. > > Please note fs_mtu_set() implementation: > tatic int fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) > { > .. > FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { > // ***** We are here while the device can be in a "PLUGOUT situation" *** > > To summarize: > I am missing a way to know in failsafe that a sub-device is currently plugged out (sdev->state < DEV_ACTIVE && !sdev->remove) means that the device is plugged out. > 1. I suggested extending the "remove" flag scope for this purpose. It has minimal changes with current failsafe implementation. You prefer not using "remove". I prefer using it, but as a flag, not as a device state. > 2. You suggested adding a new state DEV_PLUGOUT. I don't think it will work with current implementation (as explained above) or may require a redesign of current implementation. > I do not suggest adding a new state DEV_PLUGOUT. I suggest using sdev->remove properly. > Can you suggest another way? > 0. In a separate commit, move the sdev->remove = 0; from fs_dev_configure, into the case DEV_UNDEFINED of the switch within fs_dev_remove. This is cleaner and more logical anyway. 1. Check that sdev->remove is not set in fs_find_next. If sdev->remove is set, then the device should be skipped. 2. In failsafe_dev_remove, do not use the FOREACH_SUBDEV_STATE iterator, but manually iterate over all sub-devices using the subs_tail and subs_head values. As the generic iterator would skip over devices which have sdev->remove set, this function would not work anymore. 3. Find the places I have missed that needs this manual iterator to be used instead of the FOREACH_SUBDEV{,_STATE} ones. I think there is at least other place that I cannot recall, there might be more. If you think this does not work, please tell me why, because then it means that I have misunderstood something about the race condition you are trying to fix. -- Gaëtan Rivet 6WIND