From: Matan Azrad
To: Adrien Mazarguil
CC: Shahaf Shuler, Mordechay Haimovsky, "dev@dpdk.org", "stable@dpdk.org"
Date: Wed, 31 Jan 2018 13:44:41 +0000
References: <1516357009-15463-1-git-send-email-motih@mellanox.com> <1517214877-126768-1-git-send-email-motih@mellanox.com> <20180130093958.GE4256@6wind.com> <20180131091513.GS4256@6wind.com> <20180131104352.GT4256@6wind.com>
In-Reply-To: <20180131104352.GT4256@6wind.com>
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH v3] net/mlx4: fix dev rmv not detected after port stop

Hi Adrien

From: Adrien Mazarguil, Sent: Wednesday, January 31, 2018 12:44 PM
> On Wed, Jan 31, 2018 at 10:08:06AM +0000, Matan Azrad wrote:
> > Hi all
> >
> > From: Adrien Mazarguil
> > > On Tue, Jan 30, 2018 at 08:37:06PM +0000, Shahaf Shuler wrote:
> > > > Tuesday, January 30, 2018 11:40 AM, Adrien Mazarguil:
> > > > > Unfortunately I didn't get a chance to review this patch before
> > > > > it was applied.
> > > > > I'm not sure a stopped port is supposed to report events
> > > > > (interrupts). Will applications expect them to occur at this
> > > > > point?
> > > >
> > > > Why not?
> > > >
> > > > Stopped port is still counted as attached.
> > > > The fact the application stopped the packet receive on it doesn't
> > > > mean it should not receive async events (such as the remove
> > > > event).
> > > > async events, by definition, are not related to traffic flowing
> > > > through the port.
> > >
> > > My comment is based on my understanding of rte_eth_dev_stop(), which
> > > is a device (or port) is completely stopped, in a suspended state
> > > and no interrupts shall occur, as a means for applications to
> > > temporarily not be bothered by them until restarted.
> >
> > Stopping traffic is not saying that the application is not interested
> > in the device, I think that you mean dev_close().
>
> No, dev_close() releases resources and destroys configuration. Good luck
> using dev_start() or any other devop after dev_close().

I'm just saying here that when the user calls dev_close(), it means he is
no longer interested in the device. This is not the case with dev_stop().

> > Any event may still be usable for the application between dev_stop()
> > and dev_start(); RMV or LSC especially can still be of interest.
>
> Possibly, but then, how come no PMD implements it that way?

Again, maybe other PMDs make mistakes.

> Neither did mlx4 before this patch got applied. At the very least it
> cannot be considered a "fix".

It fixes something.

> > > Think about it that way: applications do not want to get interrupts
> > > immediately after the device is initialized, because they might not
> > > be ready to process them at this point. An explicit call to
> > > rte_eth_dev_start() tells the PMD when it's OK to do so. The
> > > converse is rte_eth_dev_stop().
> >
> > So, they can delay the event registration until the time they are
> > interested in the events.
> > And use event unregister when they are not interested in them anymore.
>
> Of course you can ask application maintainers to adapt to the new
> behavior, or you know, leave things as they used to be.
>

I don't know what every application does, but to me it is a mistake to
stop all event processing in dev_stop(). Maybe other application
maintainers think so too.

> Setting up RMV/LSC callbacks is not the only configuration an application
> usually performs before calling dev_start(). Think about setting up flow
> rules, MAC addresses, VLANs, and so on, this on multiple ports before
> starting them up all at once. Previously it could be done in an
> unspecified order, now they have to take special care for RMV/LSC.

Or maybe their callback code is already safe for that.
Or they manage the unregister/register calls in the right places.

> Many devops are only safe when called while a device is stopped. It's
> even documented in rte_ethdev.h.
>

And?

> > > Stopping traffic can already be achieved by not polling from the
> > > application side, calling rte_eth_dev_[rt]x_queue_stop() and/or
> > > toggling RX/TX interrupts through rte_eth_dev_[rt]x_intr_enable().
> > > rte_eth_dev_stop() provides lower-level device control.
> >
> > I think it makes sense only for the Rx interrupt, which is traffic
> > oriented (like stop and start).
>
> OK, looks like we disagree :)
>
> > > Perhaps documentation is not clear, however that's how LSC seems
> > > implemented in all PMDs; it gets disabled after rte_eth_dev_stop()
> > > and one should explicitly use rte_eth_link_get() to retrieve link
> > > status afterward. I think RMV should behave similarly with
> > > rte_eth_dev_is_removed().
> > > Adapting fail-safe should be easier than modifying all the remaining
> > > PMDs.
> >
> > Or maybe PMDs which do it make mistakes.
>
> I'm not convinced mlx4 is the only PMD doing the right thing, we should
> ask other maintainers as well as application writers' opinion on the
> matter. If it's not a problem for RMV/LSC to occur while a device is
> stopped, then I'll agree with the approach. It still won't make it a fix
> regardless.

Let's think about the RMV event.
How can the application or other DPDK entities know that the device was
removed while it was in the stopped state?
Checking synchronously (with rte_eth_dev_is_removed()) can miss a removal
if the check happens just a moment after the device came back again;
errors will occur and no one will know why.

So, at least for the RMV event, we need the notification in the stopped
state as well.

> > > > > In my opinion it's not a fix, as in, it doesn't address an issue
> > > > > introduced by the mentioned patch whose behavior was correct.
> > > > >
> > > > > It's probably too late to change it now and it does address an
> > > > > issue seen with a use case involving this PMD, however I think
> > > > > the fail-safe PMD could as well poll using the recently-added
> > > > > rte_eth_dev_is_removed() when it's aware the underlying port is
> > > > > stopped instead of expecting interrupts.
>
> --
> Adrien Mazarguil
> 6WIND
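
To make the polling argument above concrete, here is a minimal sketch in plain C. It is only a model under stated assumptions, not DPDK code: `struct model_port`, `count_rmv()`, `unplug_then_replug()`, `poll_sees_removal()` and `events_seen_while_stopped()` are hypothetical names invented for illustration, and the model assumes the device is unplugged and plugged back before the next synchronous check. It also assumes that delivery of the removal notification depends only on whether a callback is registered, not on whether the port is started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hot-pluggable port; these are NOT DPDK types. */
struct model_port {
    bool present;               /* state a synchronous check would observe */
    bool started;               /* toggled by start/stop; ignored for RMV delivery */
    void (*rmv_cb)(void *arg);  /* NULL when the application is not interested */
    void *rmv_arg;              /* opaque argument handed back to the callback */
};

/* Example callback: count each removal notification. */
static void count_rmv(void *arg)
{
    unsigned int *count = arg;
    (*count)++;
}

/* Unplug the device and plug it back before anyone polls. */
static void unplug_then_replug(struct model_port *port)
{
    assert(port != NULL);
    port->present = false;
    /* Delivery depends on registration only, not on port->started. */
    if (port->rmv_cb != NULL)
        port->rmv_cb(port->rmv_arg);
    port->present = true;
}

/* Synchronous check taken after the cycle: true if the removal is visible. */
static bool poll_sees_removal(void)
{
    struct model_port port = { .present = true, .started = false };
    unplug_then_replug(&port);
    return !port.present;
}

/* Same cycle with a callback registered on a stopped port: events seen. */
static unsigned int events_seen_while_stopped(void)
{
    unsigned int count = 0;
    struct model_port port = {
        .present = true, .started = false,
        .rmv_cb = count_rmv, .rmv_arg = &count,
    };
    unplug_then_replug(&port);
    return count;
}
```

In this model the poll taken after the cycle finds the device present again and reports nothing, while the registered callback records exactly one removal even though the port is stopped. Whether a real PMD should behave this way is exactly what is under discussion in this thread.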