From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9A2E0A0A0A for ; Thu, 20 May 2021 11:55:28 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 94403410F7; Thu, 20 May 2021 11:55:28 +0200 (CEST) Received: from relay.smtp-ext.broadcom.com (lpdvacalvio01.broadcom.com [192.19.229.182]) by mails.dpdk.org (Postfix) with ESMTP id 0709C40041 for ; Thu, 20 May 2021 11:55:27 +0200 (CEST) Received: from dhcp-10-123-153-22.dhcp.broadcom.net (bgccx-dev-host-lnx2.bec.broadcom.net [10.123.153.22]) by relay.smtp-ext.broadcom.com (Postfix) with ESMTP id B44C724710 for ; Thu, 20 May 2021 02:55:25 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 relay.smtp-ext.broadcom.com B44C724710 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=broadcom.com; s=dkimrelay; t=1621504526; bh=+0u3KV8K1fyYiMZ394P1GpGoSHa5JqUKgQ70yd9AR2s=; h=From:To:Subject:Date:In-Reply-To:References:From; b=v6zUs9beWrRfR8Vm+dPZVQ7MnWiBsSP4QQcmnVUZg0dN3IAFbYMlJUlakQFE+cSJz prcuHw/ChIap6/TnSAyzw4M2g3wMM5ello24JN8pqYIESRgvH1MFee1ORDeeW9MRUk u+bRiy42yA/m8C7pVUmo1eYkmkDAxfh8EWxsLqjo= From: Kalesh A P To: stable@dpdk.org Date: Thu, 20 May 2021 15:46:49 +0530 Message-Id: <20210520101654.3214-5-kalesh-anakkur.purayil@broadcom.com> X-Mailer: git-send-email 2.10.1 In-Reply-To: <20210520101654.3214-1-kalesh-anakkur.purayil@broadcom.com> References: <20210520101654.3214-1-kalesh-anakkur.purayil@broadcom.com> Subject: [dpdk-stable] [PATCH 19.11 4/9] net/bnxt: fix firmware fatal error handling X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org Sender: "stable" From: Kalesh AP [upstream commit 94131e4ab74728da994d5179052bb30c3c76910c ] During some fatal firmware error conditions, the PCI config space register 0x2e which normally contains the subsystem ID will become 0xffff. This register will revert back to the normal value after the chip has completed core reset. If we detect this condition, we can poll this config register immediately for the value to revert. Because we use config read cycles to poll this register, there is no possibility of Master Abort if we happen to read it during core reset. This speeds up recovery significantly as we don't have to wait for the conservative min_time before polling to see if the firmware has come out of reset. As soon as this register changes value we can proceed to re-initialize the device. Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW") Signed-off-by: Kalesh AP Reviewed-by: Somnath Kotur Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 55 ++++++++++++++++++++++++++++++++++++++++-- drivers/net/bnxt/bnxt_util.h | 2 ++ 2 files changed, 55 insertions(+), 2 deletions(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index bcdf1fc..32c4eed 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -4009,6 +4009,32 @@ static void bnxt_dev_cleanup(struct bnxt *bp) bnxt_uninit_resources(bp, true); } +static int +bnxt_check_fw_reset_done(struct bnxt *bp) +{ + int timeout = bp->fw_reset_max_msecs; + uint16_t val = 0; + int rc; + + do { + rc = rte_pci_read_config(bp->pdev, &val, sizeof(val), PCI_SUBSYSTEM_ID_OFFSET); + if (rc < 0) { + PMD_DRV_LOG(ERR, "Failed to read PCI offset 0x%x", PCI_SUBSYSTEM_ID_OFFSET); + return rc; + } + if (val != 0xffff) + break; + rte_delay_ms(1); + } while (timeout--); + + if (val == 0xffff) { + PMD_DRV_LOG(ERR, "Firmware reset aborted, PCI config space invalid\n"); + return -1; + } + + return 0; +} + static int bnxt_restore_vlan_filters(struct bnxt *bp) { struct rte_eth_dev *dev = bp->eth_dev; @@ -4123,6 +4149,12 @@ static void bnxt_dev_recover(void *arg) struct bnxt *bp = arg; int rc = 0; + if (!bp->fw_reset_min_msecs) { + rc = bnxt_check_fw_reset_done(bp); + if (rc) + goto err; + } + /* Clear Error flag so that device re-init should happen */ bp->flags &= ~BNXT_FLAG_FATAL_ERROR; @@ -4162,14 +4194,33 @@ static void bnxt_dev_recover(void *arg) void bnxt_dev_reset_and_resume(void *arg) { struct bnxt *bp = arg; + uint32_t us = US_PER_MS * bp->fw_reset_min_msecs; + uint16_t val = 0; int rc; bnxt_dev_cleanup(bp); bnxt_wait_for_device_shutdown(bp); - rc = rte_eal_alarm_set(US_PER_MS * bp->fw_reset_min_msecs, - bnxt_dev_recover, (void *)bp); + /* During some fatal firmware error conditions, the PCI config space + * register 0x2e which normally contains the subsystem ID will become + * 0xffff. This register will revert back to the normal value after + * the chip has completed core reset. If we detect this condition, + * we can poll this config register immediately for the value to revert. + */ + if (bp->flags & BNXT_FLAG_FATAL_ERROR) { + rc = rte_pci_read_config(bp->pdev, &val, sizeof(val), PCI_SUBSYSTEM_ID_OFFSET); + if (rc < 0) { + PMD_DRV_LOG(ERR, "Failed to read PCI offset 0x%x", PCI_SUBSYSTEM_ID_OFFSET); + return; + } + if (val == 0xffff) { + bp->fw_reset_min_msecs = 0; + us = 1; + } + } + + rc = rte_eal_alarm_set(us, bnxt_dev_recover, (void *)bp); if (rc) PMD_DRV_LOG(ERR, "Error setting recovery alarm"); } diff --git a/drivers/net/bnxt/bnxt_util.h b/drivers/net/bnxt/bnxt_util.h index a15b3a1..68665196 100644 --- a/drivers/net/bnxt/bnxt_util.h +++ b/drivers/net/bnxt/bnxt_util.h @@ -10,6 +10,8 @@ #define BIT(n) (1UL << (n)) #endif /* BIT */ +#define PCI_SUBSYSTEM_ID_OFFSET 0x2e + int bnxt_check_zero_bytes(const uint8_t *bytes, int len); void bnxt_eth_hw_addr_random(uint8_t *mac_addr); -- 2.10.1