From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f67.google.com (mail-wm0-f67.google.com [74.125.82.67]) by dpdk.org (Postfix) with ESMTP id 170B41B7C2 for ; Wed, 31 Jan 2018 15:32:11 +0100 (CET) Received: by mail-wm0-f67.google.com with SMTP id j21so8732557wmh.1 for ; Wed, 31 Jan 2018 06:32:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=wsUMx7pt4gRxdQ2o7uhrqg6yMTAMzW57JFmlIbcscmg=; b=InvU/1Njd0pP9svjXyU1Hx/Q9Jum/gJnQNKXBHAIzFtUsrleK7KVb992cKzCDh3pHi bhxz+44HVNiFF9ZzSbRDKKiDJdOLN133VNlifz7rW6jYSQgxEorZvpY7WRQnatsg22D4 RTZWvvmq5S5Fr0O0iPikDb4Lso0CekGwMitfMAAzNWMeFi3n5oVM7jaz5v0tFkUgoIf9 QpjwtWZiwyohOxtANTcZUIwrh+xB51aNqICqgOiAphSy6d9upFlHjdj8Fp86mSqGM3qo zUNGfx0BdeJ4ID326Szwc6D5MfsOV9VVUyBIrwEIqTlkScDpaVvqD72xPb3E0cMI4fGj Zavg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=wsUMx7pt4gRxdQ2o7uhrqg6yMTAMzW57JFmlIbcscmg=; b=plBNlachab0X++5HsmjUnhI0kWeOy1KlKqZkLOClscJqOnDziouMltVfpwz4boqTzF WpdsFmMbSdLBUJa/P0cHx0X5qdD049yP4DxU5M5k0j2YPDGFX6mGT4TtNYQ8IXwS6BFI 33+eswsPROr1Xj9xFtE7/kIy2ZPhG/1wTAtAZKoH9bnh8aQsw8NXutCXUXpKkCwChcEi q9/P+Owrv9iGPWD/ZQM9BGlwH5ZkzQjnBuCzo30TgR/H2T0znUJdJkkp60wyYwA/q/DA A7HqvpJLzf5+5GK8shd+05/0NK1b/ERUWLqiHAH7jDJSaRiRazddgFmcTfClMeEQwAjV EUqg== X-Gm-Message-State: AKwxytdLXYqx2g/WG6+9wCPCdvOjGydaJr8ZhDSKXK2BbivfYHHxRMxo qKflotVeiJzoHvwSpO/9NfFSMw== X-Google-Smtp-Source: AH8x225riqI+v931Qzod2rNuJRabSb9SSCiwOo+xBUWu8N0Ch4C6Gt1CFhNXXU2wwMAActXYxRfdvw== X-Received: by 10.28.13.18 with SMTP id 18mr25290350wmn.112.1517409130717; Wed, 31 Jan 2018 06:32:10 -0800 (PST) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id w3sm19061128wrc.22.2018.01.31.06.32.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 31 Jan 2018 06:32:10 -0800 (PST) Date: Wed, 31 Jan 2018 15:31:57 +0100 From: Adrien Mazarguil To: Matan Azrad Cc: Shahaf Shuler , Mordechay Haimovsky , "dev@dpdk.org" , "stable@dpdk.org" Message-ID: <20180131143157.GV4256@6wind.com> References: <1516357009-15463-1-git-send-email-motih@mellanox.com> <1517214877-126768-1-git-send-email-motih@mellanox.com> <20180130093958.GE4256@6wind.com> <20180131091513.GS4256@6wind.com> <20180131104352.GT4256@6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH v3] net/mlx4: fix dev rmv not detected after port stop X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Jan 2018 14:32:11 -0000 Hi Matan, On Wed, Jan 31, 2018 at 01:44:41PM +0000, Matan Azrad wrote: > Hi Adrien > > From: Adrien Mazarguil, Sent: Wednesday, January 31, 2018 12:44 PM > > On Wed, Jan 31, 2018 at 10:08:06AM +0000, Matan Azrad wrote: > > > Hi all > > > > > > From: Adrien Mazarguil > > > > On Tue, Jan 30, 2018 at 08:37:06PM +0000, Shahaf Shuler wrote: > > > > > Tuesday, January 30, 2018 11:40 AM, Adrien Mazarguil: > > > > > > Unfortunately I didn't get a chance to review this patch before > > > > > > it was > > > > applied. > > > > > > I'm not sure a stopped port is supposed to report events > > > > > > (interrupts). Will applications expect them to occur at this point? > > > > > > > > > > Why not? > > > > > > > > > > Stopped port is still counted as attached. The fact the > > > > > application stopped > > > > the packet receive on it doesn't mean it should not receive a sync > > > > events (such as the remove event). > > > > > async events, by definition, are not related to traffic being > > > > > flows through > > > > the port. > > > > > > > > My comment is based on my understanding of rte_eth_dev_stop(), > > which > > > > is a device (or port) is completely stopped, in a suspended state > > > > and no interrupts shall occur, as a means for applications to > > > > temporarily not be bothered by them until restarted. > > > > > > Stopping traffic is not saying that the application is not interesting > > > in the device, I think that you mean to dev_close(). > > > > No, dev_close() releases resources and destroys configuration. Good luck > > using dev_start() or any other devop after dev_close(). > > I'm just saying here that when the user call dev_close() it means he is not interesting in the device any more, > This is not the case in dev_stop(). > > > > Any event may still be usable for application between dev_stop() to > > > dev_start(), especially RMV or LCS can still be interested. > > > > Possibly, but then, how come no PMD implements it that way? > > Again, maybe others PMDs make mistakes. It's basically the same as stating that for all these years, neither PMD nor application maintainers understood the true meaning of this API. Maybe you're right, maybe not. What's for sure is this patch unilaterally decided to modify an accepted behavior. Perhaps it's not a big deal but we must be cautious. > > Neither did mlx4 before this patch got applied. At the very least it cannot be considered a > > "fix". > > It fixes something. Technically every time a feature is added, its absence gets "fixed" :) > > > > Think about it that way: applications do not want to get interrupts > > > > immediately after the device is initialized, because they might not > > > > be ready to process them at this point. An explicit call to > > > > rte_eth_dev_start() tells the PMD when it's OK to do so. The converse is > > rte_eth_dev_stop(). > > > > > > So, they can delay the event registration to the time they interesting in the > > events. > > > And use event unregister when they are not interesting in it anymore. > > > > Of course you can ask application maintainers to adapt to the new behavior, > > or you know, leave things as they used to be. > > > > I don't know what any application does but for me it is a mistake to stop all event processes in dev_stop(), > Maybe for other application maintainers too. Just like you, I don't know either what all the applications ever written for DPDK expect out of dev_stop(). What's for sure is that currently, LSC/RMV don't occcur afterward, the same way these events do not occur before dev_start(). Any application possibly relying on this fact will break. In such a situation, a conservative approach is better. > > Setting up RMV/LSC callbacks is not the only configuration an application > > usually performs before calling dev_start(). Think about setting up flow rules, > > MAC addresses, VLANs, and so on, this on multiple ports before starting > > them up all at once. Previously it could be done in an unspecified order, now > > they have to take special care for RMV/LSC. > > Or maybe there callbacks code are already safe for it. > Or they manages the unregister\register calls in the right places. That's my point, these "maybes" don't argue in favor of changing things. > > Many devops are only safe when called while a device is stopped. It's even > > documented in rte_ethdev.h. > > > > And? ...And applications therefore often do all their configuration in an unspecified order while a port is stopped as a measure of safety. No extra care is taken for RMV/LSC. This uncertainty can be addressed by not modifying the current behavior. > > > > Stopping traffic can already be achieved by not polling from the > > > > application side, calling rte_eth_dev_[rt]x_queue_stop() and/or > > > > toggling RX/TX interrupts through rte_eth_dev_[rt]x_intr_enable(). > > > > rte_eth_dev_stop() provides lower-level device control. > > > > > > I think it makes sense only for Rx interrupt which is traffic oriented(like stop > > and start). > > > > OK, looks like we disagree :) > > > > > > Perhaps documentation is not clear, however that's how LSC seems > > > > implemented in all PMDs; it gets disabled after rte_eth_dev_stop() > > > > and one should explicitly use rte_eth_link_get() to retrieve link > > > > status afterward. I think RMV should behave similarly with > > rte_eth_dev_is_removed(). > > > > Adapting fail-safe should be easier than modifying all the remaining > > PMDs. > > > > > > Or maybe PMDs which do it make mistakes. > > > > I'm not convinced mlx4 is the only PMD doing the right thing, we should ask > > other maintainers as well as application writers' opinion on the matter. If it's > > not a problem for RMV/LSC to occur while a device is stopped, then I'll agree > > with the approach. It still won't make it a fix regardless. > > Let's think about RMV event, How can the application\other dpdk entities to know if the device was removed when it was in stopped state? > Checking it synchronically (by rte_eth_dev_is_removed()) can miss the removal just a moment after the device came back again, errors will occur and no one will know why. > > So, at least for RMV event, we need the notification also in stopped state. You sent the rte_eth_dev_is_removed() series. You're aware that PMDs implementing this call benefit from an automatic is_removed() check on all remaining devops whenever some error occur. In short, an application will get notified simply by getting dev_start() (or any other callback) return -EIO and not being able to use the device. PMDs that do not implement is_removed() (or in addition to it) could also artificially trigger a RMV event after dev_start() is called. As long as the PMD remains quiet while the device is stopped, it's fine. > > > > > > In my opinion it's not a fix, as in, it doesn't address an issue > > > > > > introduced by the mentioned patch whose behavior was correct. > > > > > > > > > > > > It's probably too late to change it now and it does address an > > > > > > issue seen with a use case involving this PMD, however I think > > > > > > the fail-safe PMD could as well poll using the recently-added > > > > > > rte_eth_dev_is_removed() when it's aware the underlying port is > > > > stopped instead of expecting interrupts. -- Adrien Mazarguil 6WIND