From: Matan Azrad
To: Gaëtan Rivet
CC: Adrien Mazarguil, Thomas Monjalon, "dev@dpdk.org", "stable@dpdk.org"
Date: Thu, 14 Dec 2017 13:07:31 +0000
In-Reply-To: <20171214104856.d5qgnawuzb54l36z@bidouze.vm.6wind.com>
Subject: Re: [dpdk-dev] [PATCH v2 4/4] net/failsafe: fix removed device handling
List-Id: DPDK patches and discussions

Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, December 14, 2017 12:49 PM
> To: Matan Azrad
> Cc: Adrien Mazarguil; Thomas Monjalon; dev@dpdk.org; stable@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
>
> On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > > >
> > > > If you add this check in the iterator itself, you would skip
> > > > removed devices before attempting to operate on them, right?
> > > >
> > > > Then it should probably help with your issue, unless you tested it
> > > > and verified that it didn't?
> > > >
> > > > Something like this:
> > > >
> > > > ---8<---
> > > >
> > > > diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> > > > index d81cc3ca6..62ddc0689 100644
> > > > --- a/drivers/net/failsafe/failsafe_private.h
> > > > +++ b/drivers/net/failsafe/failsafe_private.h
> > > > @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
> > > >         subs = PRIV(dev)->subs;
> > > >         tail = PRIV(dev)->subs_tail;
> > > >         while (sid < tail) {
> > > > +               if (min_state > DEV_PROBED &&
> > > > +                   fs_is_removed(&subs[sid]))
> > > > +                       goto next;
> > > >                 if (subs[sid].state >= min_state)
> > > >                         break;
> > > > +next:
> > > >                 sid++;
> > > >         }
> > > >         *sid_out = sid;
> > > >
> > > > --->8---
> > > >
> > > > Only issue being that it is completely racy, but as this MT-unsafe
> > > > property is inescapable we might as well ignore it and go for KISS.
> > > >
> > > > If that's enough, I would prefer this to having the additional
> > > > check added to all rte_eth operations.
> > > >
> > >
> > > Ok, actually you were right here to do it this way.
> > > The "is_removed" check needs to happen after the operation attempt to
> > > effectively mitigate the possible race. Checking before attempting the
> > > call will be much less effective.
> > >
> > > That being said, would it be cleaner to have eth_dev ops return
> > > -ENODEV directly, and check against it within fail-safe?
> > >
> >
> > I think that, according to the "is_removed" semantics, we must return a
> > Boolean value (any value different from '0' means that the device is
> > removed), like other functions in the C library (for example isspace()).
> >
>
> Sure, I wasn't discussing the interface proposed by
> rte_eth_dev_is_removed().
>
> What I meant was to ask whether checking rte_eth_dev_is_removed()
> would be more interesting in the ethdev layer, making the eth_dev_ops
> return -ENODEV regardless of the previous error if this check is supported
> by the driver and signals that the port is removed.
>
> I think this information could be interesting to other systems, not just
> fail-safe.
>

Ok. Got you now.
Interesting approach - plan:
1. Update fs_link_update to use rte_eth* functions.
2. Maybe -EIO is preferred, because -ENODEV is already used for the "no port" error?
3. Update all relevant rte_eth* functions to use "is_removed" in error flows
   (one patch for the flow APIs and one for the others).
4. Change the fs checks in error flows to check the rte_eth* return values.
5. Remove CC stable from the commit message.

What do you think?

> --
> Gaëtan Rivet
> 6WIND
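
To make items 2 to 4 of the plan above concrete, here is a rough, self-contained sketch of the idea: the removal check runs only after the operation has failed, and the ethdev-level wrapper then reports -EIO. Everything in it is an assumption for illustration. `struct mock_port`, `mock_is_removed()`, `mock_eth_err()`, `mock_eth_link_update()` and `mock_fs_err_is_removal()` are hypothetical stand-ins, not the real ethdev or fail-safe code, and the choice of -EIO is only the option floated in item 2.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ethdev port; not the real struct rte_eth_dev. */
struct mock_port {
	bool removed;                               /* set once the device is unplugged */
	int (*link_update)(struct mock_port *port); /* driver op: 0 or negative errno */
};

/* Hypothetical "is_removed" query: any nonzero value means removed. */
static int
mock_is_removed(const struct mock_port *port)
{
	return port->removed;
}

/*
 * Error mapping done after the operation was attempted: success passes
 * through untouched, and any failure on a removed port is reported as
 * -EIO (assumed here, see item 2), regardless of the original error.
 */
static int
mock_eth_err(const struct mock_port *port, int ret)
{
	if (ret == 0)
		return 0;
	if (mock_is_removed(port))
		return -EIO;
	return ret;
}

/* Ethdev-layer wrapper around one driver op (item 3). */
static int
mock_eth_link_update(struct mock_port *port)
{
	if (port == NULL)
		return -ENODEV; /* "no port" stays distinct from removal */
	if (port->link_update == NULL)
		return -ENOTSUP;
	return mock_eth_err(port, port->link_update(port));
}

/* Fail-safe side (item 4): only the return value is inspected. */
static bool
mock_fs_err_is_removal(int ret)
{
	return ret == -EIO;
}

/* Sample driver ops used to exercise the wrapper. */
static int op_ok(struct mock_port *port) { (void)port; return 0; }
static int op_fail(struct mock_port *port) { (void)port; return -EINVAL; }
```

With this shape, a failing op on a present port keeps its own error code, the same failure on a removed port becomes -EIO, and fail-safe would no longer need to call the removal check itself in each error path.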