From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-VE1-obe.outbound.protection.outlook.com (mail-ve1eur01on0043.outbound.protection.outlook.com [104.47.1.43]) by dpdk.org (Postfix) with ESMTP id 257D01B343; Mon, 23 Oct 2017 09:17:43 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=l3g+5vAk+meNB+2RekRpoSj27PmxTfvdqe8WBpWKOY4=; b=v+xD/5FxWncYPxfimVuPVsilZtTUvpvzN9iCODGdhhkW4KSAUO2zTqoXUYCIvadjB5FBpR11NBMn0FqK+6/Mz5MbUBj2TirC9P9zUNtkbTHl3t5H/8OUTh4DLLfYbXFC2y9W+YegJO2HZCjhjfEms9Ez0PDPCBLlmWcrZgPA4HA= Received: from DB5PR05MB1254.eurprd05.prod.outlook.com (10.162.157.140) by DB5PR05MB1255.eurprd05.prod.outlook.com (10.162.157.141) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.156.4; Mon, 23 Oct 2017 07:17:41 +0000 Received: from DB5PR05MB1254.eurprd05.prod.outlook.com ([fe80::69a8:7948:e0cb:c16e]) by DB5PR05MB1254.eurprd05.prod.outlook.com ([fe80::69a8:7948:e0cb:c16e%13]) with mapi id 15.20.0156.007; Mon, 23 Oct 2017 07:17:41 +0000 From: Ophir Munk To: =?iso-8859-1?Q?Ga=EBtan_Rivet?= CC: "dev@dpdk.org" , Thomas Monjalon , "Olga Shern" , "stable@dpdk.org" , Ophir Munk , Matan Azrad Thread-Topic: [PATCH v3] net/failsafe: fix calling device during RMV events Thread-Index: AQHTPitOAnvbDeox8U+TVBXlDPhix6LsoegAgARi/KA= Date: Mon, 23 Oct 2017 07:17:41 +0000 Message-ID: References: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com> <1507243328-11287-1-git-send-email-ophirmu@mellanox.com> <20171020103518.GH3596@bidouze.vm.6wind.com> In-Reply-To: <20171020103518.GH3596@bidouze.vm.6wind.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=ophirmu@mellanox.com; x-originating-ip: [193.47.165.251] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB5PR05MB1255; 6:ja5cWLdxg33SxeVYZ7EG8EMMAgp3IVy7cxIfOecPu2yYwB0Sg0l5aF3771KqTBXpPWBtJ27tLZ72jaR/+eaFG4YfmdByWDhRl6xlQp9unlf3/z70N/9O4HzKkKtl0SUVQsvhBvlw8iRUG0YWqjnRyADTASYDIzqHHyouopGOrfqAdYVaUHi6g+CUEe9xcZVxl2FuqZSJ90r7HoazdjoyXV939x7lNwNVF0zH+jKJlDRAukZjmXqNMUyIXtJtX/NaROjkl01XCTjZ6vuCYPq6ND3Cpmrjjh8ZuQF+9QNHOLE54b9R6uz0X+MhQvpit6DoJggMQgRZN/OMli6/MkO/hA==; 5:/43fam2QzyCGyMWPSMQGXBP7SwCxHtL2rf3n0K5fV5p78uqTfZPwnnee5p/feMNS4QJNr01XxHVFtw+eceMvlnXBZDRUQJnEyZ+Xuz7NiKKnDjBLufAHNqnyK8skYygddhLftKxt6KmBo1ekYt3WTQ==; 24:rZv2GX1Kw2xvQQgFUYe3NJNA/5wibUu0ySMKdgPgDWyi27eFEWeXDsYHPTKCmnEPg2zU+2rn3DYbxiG43IlGdaTIRq1tABL2PrZFjV56c1g=; 7:9oDamlNqxKOivdPTDpu/DQxvCmG5WbZ8QaGkajwNNFRL0SDnY+8TlN5hzRjXy3hNC6lPQ9fKfUYYcixMu/lWpi16jBZwPgUClCRhChaTG0Wm+mZMbZEk1+Qs+Y37aB7fAOc/GUcWGUKbyMXESAvo4FZ2Pz5iRyCOJtfzofTdjmkt9XlTLzSG+Vr+8jlNvFVv+eL7eRiOVSxv1r8vyFQvZNnGrE5elHlt8CG/lnwFqUY= x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ms-office365-filtering-correlation-id: 5e3e8e19-c66f-4e38-fc14-08d519e62607 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(22001)(48565401081)(4534020)(4602075)(4627075)(201703031133081)(201702281549075)(2017052603199); SRVR:DB5PR05MB1255; x-ms-traffictypediagnostic: DB5PR05MB1255: x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtAddr x-exchange-antispam-report-test: UriScan:(278428928389397)(788757137089); x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(8121501046)(5005006)(93006095)(93001095)(100000703101)(100105400095)(10201501046)(3231020)(3002001)(6055026)(6041248)(20161123558100)(20161123560025)(20161123564025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123555025)(20161123562025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095); SRVR:DB5PR05MB1255; BCL:0; PCL:0; RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095); SRVR:DB5PR05MB1255; x-forefront-prvs: 046985391D x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(346002)(376002)(39860400002)(51694002)(24454002)(189002)(199003)(13464003)(52314003)(2950100002)(9686003)(53936002)(6916009)(99286003)(55016002)(76176999)(6116002)(229853002)(54356999)(6246003)(2906002)(3280700002)(50986999)(6436002)(5250100002)(106356001)(66066001)(81166006)(81156014)(6506006)(53946003)(3660700001)(68736007)(14454004)(105586002)(102836003)(3846002)(8936002)(8676002)(4326008)(74316002)(53546010)(33656002)(97736004)(54906003)(101416001)(2900100001)(478600001)(107886003)(305945005)(25786009)(189998001)(7696004)(7736002)(316002)(5660300001)(86362001); DIR:OUT; SFP:1101; SCL:1; SRVR:DB5PR05MB1255; H:DB5PR05MB1254.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; MX:1; A:1; LANG:en; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5e3e8e19-c66f-4e38-fc14-08d519e62607 X-MS-Exchange-CrossTenant-originalarrivaltime: 23 Oct 2017 07:17:41.2460 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5PR05MB1255 Subject: Re: [dpdk-dev] [PATCH v3] net/failsafe: fix calling device during RMV events X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Oct 2017 07:17:44 -0000 Hi Gaetan, Thanks for your quick reply. Please see comments inline. Regards, Ophir > -----Original Message----- > From: Ga=EBtan Rivet [mailto:gaetan.rivet@6wind.com] > Sent: Friday, October 20, 2017 1:35 PM > To: Ophir Munk > Cc: dev@dpdk.org; Thomas Monjalon ; Olga Shern > ; stable@dpdk.org > Subject: Re: [PATCH v3] net/failsafe: fix calling device during RMV event= s >=20 > Hi Ophir, >=20 > Sorry about the delay, > I have a few remarks, I think this patch could be simpler. >=20 > First, about the commit logline: > "calling device" is not descriptive enough. I'd suggest >=20 > net/failsafe: fix device configuration during RMV events >=20 > But I'm not a native speaker either, so use it if you think it is better,= or don't, > it's only a suggestion :). >=20 > On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote: > > This commit prevents control path operations from failing after a sub > > device removal. > > > > Following are the failure steps: > > 1. The physical device is removed due to change in one of PF > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3. > > Within 2 seconds Interrupt thread initializes the actual device > > removal, then every 2 seconds it tries to re-sync (plug in) the > > device. The trials fail as long as VF parameter mismatches the PF > parameter. > > 4. A control thread initiates a control operation on failsafe which > > initiates this operation on the device. > > 5. A race condition occurs between the control thread and interrupt > > thread when accessing the device data structures. > > > > This commit prevents the race condition in step 5. Before this commit > > if a device was removed and then a control thread operation was > > initiated on failsafe - in some cases failsafe called the sub device > > operation instead of avoiding it. Such cases could lead to operations > failures. > > >=20 > This is a nitpick, but as said earlier, this is not preventing the race c= ondition. > This race is still present and can still wreak havok on unsuspecting user= s. >=20 > If an application has a weak threading model, it will be subject to this = race > condition still. It is possible to prevent it fully with proper care from= the > application standpoint, but this is not specific to fail-safe and does no= t > concern us here. >=20 > Anyway, it's really a nitpick, I just wanted to point it out. This is not= too > important for this patch. >=20 > > This commit fixes failsafe criteria to determine when the device is > > removed such that it will avoid calling the sub device operations > > during that time and will only call them otherwise. > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > Cc: stable@dpdk.org > > > > Signed-off-by: Ophir Munk > > --- > > v3: > > 1. Rebase v2 > > > > 2. Please ignore checkpatch checks on arguments re-usage - they are > confirmed. > > CHECK:MACRO_ARG_REUSE: Macro argument reuse ... possible side- > effects? > > #217: FILE: drivers/net/failsafe/failsafe_private.h:241: > > > > 3. Add rationales (copy from an email which accompanied v2): > > > > On Monday, September 11, 2017 11:31 AM, Gaetan Rivet wrote: > > > > > > Hi Ophir, > > > > > > On Sat, Sep 09, 2017 at 07:27:11PM +0000, Ophir Munk wrote: > > > > This commit prevents control path operations from failing after a > > > > sub device has informed failsafe it has been removed. > > > > > > > > Before this commit if a device was removed and then a control path > > > > > > Here are the steps if I understood correctly: > > > > > > 0. The physical device is removed > > > 1. The interrupt thread flags the device 2. A control lcore > > > initiates a control operation 3. The alarm triggers, waking up the ea= l-intr- > thread, > > > initiating the actual device removal. > > > 4. Race condition occurs between control lcore and interrupt thread. > > > > > > "if a device was removed" is ambiguous I think (are we speaking > > > about the physical port? Is it only flagged? Is it after the removal = of the > device itself?). > > > From the context I gather that you mean the device is flagged to be > > > removed, but it won't be as clear in a few month when we revisit this= bug > :) . > > > > > > Could you please rephrase this so that the whole context of the > > > issue is available? > > > > > > > Done. Commit message was rephrased based on your comments > > > > > > operations was initiated on failsafe - in some cases failsafe > > > > called the sub device operation instead of avoiding it. Such cases > > > > could lead to operations failures. > > > > > > > > This commit fixes failsafe criteria to determine when the device > > > > is removed such that it will avoid calling the sub device > > > > operations during that time and will only call them otherwise. > > > > > > > > > > This commit mitigates the race condition, reducing the probability > > > for it to have an effect. It does not, however, remove this race > > > condition, which is inherent to the DPDK architecture at the moment. > > > > > > A proper fix, a more detailed workaround and additional > > > documentation warning users writing applications to mind their thread= s > could be interesting. > > > > > > > The race condition occurs in the last step and may lead to > > segmentation faults (accessing data structures of the same device by 2 > > threads) The previous steps ("the physical device is removed", etc) wer= e not > recreated and tested but probably cannot lead to segmentation fault. > > > > > But let's focus on this patch for the time being. > > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > > Cc: stable@dpdk.org > > > > > > > > Signed-off-by: Ophir Munk > > > > --- > > > > drivers/net/failsafe/failsafe_ether.c | 1 + > > > > drivers/net/failsafe/failsafe_ops.c | 52 > > > +++++++++++++++++++++++++++++------ > > > > 2 files changed, 45 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > > > b/drivers/net/failsafe/failsafe_ether.c > > > > index a3a8cce..1def110 100644 > > > > --- a/drivers/net/failsafe/failsafe_ether.c > > > > +++ b/drivers/net/failsafe/failsafe_ether.c > > > > @@ -378,6 +378,7 @@ > > > > > > Could you please generate your patches with the function name in the > diff? > > > > Done > > > > > > > > > i); > > > > goto err_remove; > > > > } > > > > + sdev->remove =3D 0; > > > > > > You are adding this here, within failsafe_eth_dev_state_sync, and > > > removing it from the dev_configure ops. > > > > > > 10 lines above, the call to dev_configure is done, meaning that the > > > remove flag was resetted at this point. > > > > > > Can you explain why you prefer resetting the flag here? > > > > > > The position of this flag reset will be dependent upon my subsequent > > > remarks anyway, so hold that thought :) . > > > > > > > The motivation for resetting the "remove" flag within > failsafe_eth_dev_state_sync is as follows: > > Previously to this patch the "remove" flag was designed to signal the n= eed > to remove the sub device. > > Once the sub device was removed and before being reconfigured the > "remove" flag was reset. > > > > After this patch the scope of the "remove" flag was *extended* to > > indicate the sub device status as being "plugged out" by resetting this= flag > only after a successful call to failsafe_eth_dev_state_sync(). > > The "plug out" status could last a very long time (seconds, minutes, da= ys, > weeks, ...). > > > > Previously to this patch failsafe based the "plugged out" status on > > the sub device state as being below ACTIVE however every 2 seconds > > dev_configure() was called where the sub device was assigned sdev- > > >state =3D DEV_ACTIVE; therefore the sub device state became ACTIVE for > some time every 2 seconds. > > This is where the race condition occurred: failsafe considered the sub > > device as "Plugged in" for some time every 2 seconds (based on its ACTI= VE > state) while it was actually plugged out. > > > > After this patch the "Plugged out" status is based on the "remove" flag= . > > >=20 > Sorry, I do not agree with this semantical change on the "remove" flag. > You are essentially adding a new device state, which could be fine per se= , but > should not be done here. >=20 > The enum dev_state is there for this purpose. >=20 > The flag dev->remove, calls for an operation to be done upon the concerne= d > device. It is not meant to become a new device state. >=20 > A point about the work methodoly here: if you wanted to change this > semantic, which could be legitimate and sometimes called for, you should > have proposed it either during a discussion in a response to my previous > email, or introducing the change as a separate patch. This point is impor= tant > enough for it to have its own patch, meaning we would have a whole thread > dedicated to it instead of having to interleave commentaries between > related-but-separate diffs on the code. >=20 > But anyway, if you think you need to express a PLUGOUT state, I'd suggest > adding a state between DEV_UNDEFINED and DEV_PARSED. > DEV_UNDEFINED means that the device is in limbo and has no existence per > se (its parsing failed for example, it is not clear whether the parameter= s are > correct, etc...). DEV_PLUGOUT could mean then that the device has been > successfully probed at least once, meaning that it could possibly have > residuals from this probing still there, or specific care to be taken whe= n > manipulating it. >=20 > However, I'm not yet convinced that this new state is necessary. I think = you > can mitigate this race condition without having to add it. If you insist = in > introducing this state, please do so in a separate patch, with proper > definition about the meaning of this state: >=20 > + When it should be valid for a device to be in this state. > + Which operation corresponds to getting into and out of this state. > + Why this state is interesting and what could not be expressed before > that is thus being fixed by introducing this state. >=20 > But please verify twice whether you absolutely need to complexify the > current fail-safe internals before going all in and basing your work upon= it :) >=20 Indeed what I am currently missing in failsafe is knowing the device is in = a PLUGOUT state. Your suggestion to add a new state DEV_PLUGOUT cannot be used with the curr= ent implementation as the device states are modified during an alarm hotplug handling every 2 = seconds. In fs_hotplug_alarm() we call failsafe_eth_dev_state_sync(dev) which eventu= ally calls ->dev_configure(dev)=20 where we assign: sdev->state =3D DEV_ACTIVE;=20 then when sync fails fs_hotplug_alarm() calls failsafe_dev_remove(dev) whic= h will call fs_dev_remove(sdev); where the sub devices states are changed from ACTIVE down to DEV_UNDEFINED. Having said that it means that during a PLUGOUT event - the states are modi= fied with each invocation of the fs_hotplug_alarm every 2 seconds. So even if we added DEV_PLUGOUT state - it will not remain= fixed during the hotplug alarm handling. I have also verified all of this with printouts. When seeing a sub device in state "DEV_ACTIVE" - we cannot tell whether the= device is currently in "PLUGOUT situation" or "PLUGIN situation" This allows operations such as fs_mtu_set() on sub-devices which are in "PL= UGOUT situation" while their state is=20 DEV_ACTIVE to be manipulated, which I think should have been avoided. Please note fs_mtu_set() implementation: tatic int fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) { .. FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { // ***** We are here while the device can be in a "PLUGOUT situation" *** To summarize: I am missing a way to know in failsafe that a sub-device is currently plugg= ed out 1. I suggested extending the "remove" flag scope for this purpose. It has m= inimal changes with current failsafe implementation. You prefer not using "= remove". 2. You suggested adding a new state DEV_PLUGOUT. I don't think it will work= with current implementation (as explained above) or may require a redesign= of current implementation. Can you suggest another way? > > > > } > > > > } > > > > /* > > > > diff --git a/drivers/net/failsafe/failsafe_ops.c > > > > b/drivers/net/failsafe/failsafe_ops.c > > > > index ff9ad15..314d53d 100644 > > > > --- a/drivers/net/failsafe/failsafe_ops.c > > > > +++ b/drivers/net/failsafe/failsafe_ops.c > > > > @@ -232,7 +232,6 @@ > > > > dev->data->dev_conf.intr_conf.lsc =3D 0; > > > > } > > > > DEBUG("Configuring sub-device %d", i); > > > > - sdev->remove =3D 0; > > > > ret =3D rte_eth_dev_configure(PORT_ID(sdev), > > > > dev->data->nb_rx_queues, > > > > dev->data->nb_tx_queues, > > > > @@ -311,6 +310,8 @@ > > > > int ret; > > > > > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > > > + if (sdev->remove) > > > > + continue; > > > > DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", > > > i); > > > > ret =3D rte_eth_dev_set_link_up(PORT_ID(sdev)); > > > > if (ret) { > > > > @@ -330,6 +331,8 @@ > > > > int ret; > > > > > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > > > + if (sdev->remove) > > > > + continue; > > > > > > For this change and all the others: > > > > > > I think it might be best to have this check added to fs_find_next dir= ectly. > > > > > > Most of the call to the iterators are done within dev_ops, so it > > > makes sense I think to have it there. > > > > > > But then there'd be an issue with the sub-EAL iterations done on > > > previously- removed ports, as the removed flag is precisely resetted > > > too late. The function failsafe_dev_remove would also need to have a > > > manual iteration upon the sub-devices instead of using the macro. > > > > > > I think you can actually reset this flag within fs_dev_remove, > > > instead of the next plug-in, then having this check within > > > fs_find_next > > > *should* not be a problem. > > > > > > > With the new scope of "remove" flag (remaining set to 1 as long as the = sub > device is "plugged out" > > which may last for a very long time) we cannot reset it in > > fs_dev_remove which is called every 2 seconds. > > >=20 > With the remove flag staying as it is, I think it should thus be resetted= within > fs_dev_remove. Actually I think it both helps you write you fix, and clar= ify the > meaning and intended purpose of this flag. >=20 > > > I think you should break up those changes in two: first move the > > > flag reset to fs_dev_remove instead of fs_dev_configure, then add > > > this check to the iterator. > > > >=20 > Please, do this fix this way. I think moving the dev->remove flag can hav= e > subtile consequences, and I'd like to have a specific commit to trace bac= k > which one is responsible. >=20 > > > This way, a git bisect should allow us to pinpoint more easily any > > > new bug as both changes have the potential to introduce subtle ones. > > > >=20 > Well, like I said :). >=20 > > > > I suggest defining a new macro > > > > FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { ... > > > > that will replace all cases of: > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > if (sdev->remove) > > continue; > > > > In order to support the new macro I added a "check_remove" flag to > > fs_find_next (which is based on your idea above: "I think it might be b= est to > have this check added to fs_find_next directly"). > > >=20 > I'd prefer avoiding multiplying the macros. I agree. Should be avoided. > There are already two iterators. You add one, which now means that there > are two ways of iterating upon active devices: using you new macro, and > using the old one. The difference between the two would be difficult to > know, without profound knowledge of the rest of the code: that in one > place the flag is checked, and in the other it is not. >=20 > As such, I suggest you check in all cases that the flag is not set. This > simplifies the use of these macros and the conditions in which their use > is correct. >=20 > This means that you have to manually iterate in places where this flag > should be ignored. I listed these places in my previous email, but I may > have missed some, please be careful. >=20 I already did it in v1 then changed it in V2/3 based on reviews (probably m= y misunderstanding) > Thanks, > -- > Ga=EBtan Rivet > 6WIND