From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 0D61EA04B1;
 Wed, 30 Sep 2020 08:50:42 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 2CFAA1DAF5;
 Wed, 30 Sep 2020 08:48:54 +0200 (CEST)
Received: from relay.smtp-ext.broadcom.com (lpdvacalvio01.broadcom.com
 [192.19.229.182]) by dpdk.org (Postfix) with ESMTP id 005511DAF1
 for ; Wed, 30 Sep 2020 08:48:50 +0200 (CEST)
Received: from dhcp-10-123-153-22.dhcp.broadcom.net
 (bgccx-dev-host-lnx2.bec.broadcom.net [10.123.153.22])
 by relay.smtp-ext.broadcom.com (Postfix) with ESMTP id 038363FE1D;
 Tue, 29 Sep 2020 23:48:48 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 relay.smtp-ext.broadcom.com 038363FE1D
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=broadcom.com;
 s=dkimrelay; t=1601448530;
 bh=kTn64/5UtHqrDRW0QLJoKb7gW72ixO87u5jcBAIswhg=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
 b=GHa3ddBy3YC5AjbEoNXvVNaThkkpGroG2FlaSuE8sbhI+Pt9bc46M1/m3CuWnuD9n
 w3nzvHcZ3ywOJv/r1GOe9TzcDP7Qm5w0BiVE2tDFbtsptU+Vv2NR7cHxahDFSi/Ws2
 a8v0Uho3yggYOLGLrvg+oIiveol+c5JIzsAmlYHU=
From: Kalesh A P
To: dev@dpdk.org
Cc: thomas@monjalon.net, ferruh.yigit@intel.com, ajit.khaparde@broadcom.com
Date: Wed, 30 Sep 2020 12:33:23 +0530
Message-Id: <20200930070326.20133-1-kalesh-anakkur.purayil@broadcom.com>
X-Mailer: git-send-email 2.10.1
In-Reply-To: <20200122101654.20824-1-kalesh-anakkur.purayil@broadcom.com>
References: <20200122101654.20824-1-kalesh-anakkur.purayil@broadcom.com>
Subject: [dpdk-dev] [RFC V2 0/3] librte_ethdev: error recovery support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: Kalesh AP

The error recovery solution is a protocol implemented between firmware
and bnxt PMD to recover from
fatal errors without a system reboot. An alarm thread constantly
monitors the health of the firmware and initiates a recovery when
needed. There are two scenarios:

1. Hardware or firmware encountered an error that the firmware
   detected. The firmware is still operational. In this case, the
   firmware can reset the chip and notify the driver about the reset.

2. Hardware or firmware encountered an error, but the firmware is dead
   or hung and is no longer operational. In this case, the only way to
   recover the adapter is through the host driver (bnxt PMD).

In both cases, the bnxt PMD reinitializes with the FW after the reset.
During the recovery process, the data path is halted and any control
path operation will fail. The PMD therefore has to notify the
application about the reset/error event, so that the application does
not issue any operations while the PMD is recovering from the error.

This patch set adds support for the reset and recovery events in the
rte_eth_event framework. FW error and FW reset conditions are managed
by the PMD. The driver uses the RTE_ETH_EVENT_RESET event to notify
applications about a FW reset or error. In such cases, the PMD uses the
RTE_ETH_EVENT_RECOVERED event to notify applications once it has
recovered from the FW reset or FW error.

V2: Added a new event RTE_ETH_EVENT_RESET instead of using
RTE_ETH_EVENT_INTR_RESET to notify applications about device reset.

Kalesh AP (3):
  ethdev: support device reset and recovery events
  net/bnxt: notify applications about device reset/recovery
  app/testpmd: handle device recovery event

 app/test-pmd/testpmd.c         | 6 +++++-
 drivers/net/bnxt/bnxt_cpr.c    | 3 +++
 drivers/net/bnxt/bnxt_ethdev.c | 9 +++++++++
 lib/librte_ethdev/rte_ethdev.h | 2 ++
 4 files changed, 19 insertions(+), 1 deletion(-)

-- 
2.10.1
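
For illustration only, the RESET/RECOVERED handshake described in this
cover letter could look roughly like the following from the application
side. This is a minimal standalone sketch, not code from the series: the
enum port_event values are local stand-ins for the proposed
RTE_ETH_EVENT_RESET and RTE_ETH_EVENT_RECOVERED, and MAX_PORTS,
port_in_recovery, port_event_cb() and port_is_usable() are hypothetical
names. The callback only mirrors the argument shape of rte_eth_dev_cb_fn;
in a real application it would take enum rte_eth_event_type and be
registered once per event with rte_eth_dev_callback_register().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the two events proposed in this series. The real
 * values would be members of enum rte_eth_event_type in rte_ethdev.h. */
enum port_event {
	PORT_EVENT_RESET,     /* stands in for RTE_ETH_EVENT_RESET */
	PORT_EVENT_RECOVERED, /* stands in for RTE_ETH_EVENT_RECOVERED */
};

#define MAX_PORTS 32 /* hypothetical limit, chosen for this sketch */

/* Per-port flag, true while the PMD is recovering. Atomic because the
 * event may be delivered from a different thread than the one that
 * issues control path calls. */
static atomic_bool port_in_recovery[MAX_PORTS];

/* Event callback with the same argument shape as rte_eth_dev_cb_fn:
 * (port_id, event, cb_arg, ret_param), returning 0 on success. */
static int
port_event_cb(uint16_t port_id, enum port_event event,
	      void *cb_arg, void *ret_param)
{
	(void)cb_arg;
	(void)ret_param;

	if (port_id >= MAX_PORTS)
		return -1;

	switch (event) {
	case PORT_EVENT_RESET:
		/* PMD started recovery: stop touching this port. */
		atomic_store(&port_in_recovery[port_id], true);
		break;
	case PORT_EVENT_RECOVERED:
		/* PMD finished recovery: the port may be used again. */
		atomic_store(&port_in_recovery[port_id], false);
		break;
	}
	return 0;
}

/* Gate to check before issuing any control path operation on a port. */
static bool
port_is_usable(uint16_t port_id)
{
	return port_id < MAX_PORTS &&
	       !atomic_load(&port_in_recovery[port_id]);
}
```

With this shape, the application checks port_is_usable() before each
control path call and skips or defers the call while it returns false,
instead of letting the call fail inside the PMD during recovery.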