From: Gaëtan Rivet
To: Long Li
Cc: dev@dpdk.org, Long Li, stable@dpdk.org
Date: Mon, 5 Oct 2020 11:42:15 +0200
Subject: Re: [dpdk-dev] [PATCH] net/failsafe: check correct error code while handling sub-device add
Message-ID: <20201005094215.u4kt64ycbk35kbeg@u256.net>
In-Reply-To: <1601683308-18738-1-git-send-email-longli@linuxonhyperv.com>

Hi,

On 02/10/20 17:01 -0700, Long Li wrote:
> From: Long Li
>
> When adding a sub-device, it's possible that the sub-device is configured
> successfully but later fails to start. This error should not be masked.

Some of those errors are meant to be masked: -EIO, when the device is
marked as removed at the ethdev level (see eth_err() in rte_ethdev.c:819).

> The driver needs to check the error status to prevent endless loop of
> trying to start the sub-device.

If the ethdev layer error is due to the device being removed, and failsafe
loops on trying to sync the eth device to its own state, then an RMV event
should have been emitted but wasn't or it was missed by failsafe.

If the ethdev layer error is *not* due to the device being removed, the
error should be != -EIO, and sdev->remove should not be set, so fs_err()
should not mask it and it should be seen by the app.

Can you provide the following details:

  * What is the return code of rte_eth_dev_start() that is masked in your
    start loop?
  * Is the device marked as removed in failsafe?
  * Is the device marked as removed in ethdev?
  * Was there an RMV event generated for the device?
    Whether yes or no, is it correct?

Thanks,

>
> fixes (ae80146c5a1b net/failsafe: fix removed device handling)
>
> cc: stable@dpdk.org
> Signed-off-by: Long Li
> ---
>  drivers/net/failsafe/failsafe_private.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index 651578a..c58c0de 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -497,7 +497,7 @@ int failsafe_eth_new_event_callback(uint16_t port_id,
>  fs_err(struct sub_device *sdev, int err)
>  {
>      /* A device removal shouldn't be reported as an error. */
> -    if (sdev->remove == 1 || err == -EIO)
> +    if (sdev->remove == 1 && err == -EIO)
>          return rte_errno = 0;
>      return err;
>  }
> --
> 1.8.3.1
>

--
Gaëtan
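
For reference, the two predicates from the quoted diff can be compared case
by case. This is only a minimal sketch, not the driver code: the
struct sub_device_model type and the fs_err_current() / fs_err_proposed() /
fs_err_compare() names are made up for illustration, only the remove flag
is modeled, and the rte_errno assignment done by the real fs_err() is left
out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the failsafe sub-device: only the flag
 * discussed in this thread is modeled. */
struct sub_device_model {
	unsigned int remove; /* 1 once failsafe marked the sub-device removed */
};

/* Predicate from the '-' line of the quoted diff (current code). */
static int
fs_err_current(const struct sub_device_model *sdev, int err)
{
	if (sdev->remove == 1 || err == -EIO)
		return 0;
	return err;
}

/* Predicate from the '+' line of the quoted diff (proposed change). */
static int
fs_err_proposed(const struct sub_device_model *sdev, int err)
{
	if (sdev->remove == 1 && err == -EIO)
		return 0;
	return err;
}

/* Walk the four combinations of (remove flag, error code). */
static void
fs_err_compare(void)
{
	struct sub_device_model present = { .remove = 0 };
	struct sub_device_model removed = { .remove = 1 };

	/* -EIO on a sub-device not marked removed: masked today,
	 * reported with the proposed change. */
	assert(fs_err_current(&present, -EIO) == 0);
	assert(fs_err_proposed(&present, -EIO) == -EIO);

	/* -EIO on a sub-device marked removed: masked either way. */
	assert(fs_err_current(&removed, -EIO) == 0);
	assert(fs_err_proposed(&removed, -EIO) == 0);

	/* Any other error on a sub-device not marked removed:
	 * reported either way. */
	assert(fs_err_current(&present, -EINVAL) == -EINVAL);
	assert(fs_err_proposed(&present, -EINVAL) == -EINVAL);

	/* Any other error on a sub-device marked removed: masked today,
	 * reported with the proposed change. */
	assert(fs_err_current(&removed, -EINVAL) == 0);
	assert(fs_err_proposed(&removed, -EINVAL) == -EINVAL);
}
```

The two versions differ only when exactly one of the two conditions holds,
which is why the return code of rte_eth_dev_start() and the removal state in
failsafe and ethdev are the details asked for above.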