From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 918ACCCC6 for ; Thu, 30 Apr 2015 17:05:31 +0200 (CEST) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga101.fm.intel.com with ESMTP; 30 Apr 2015 08:04:10 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.11,677,1422950400"; d="scan'208";a="721665869" Received: from shvmail01.sh.intel.com ([10.239.29.42]) by orsmga002.jf.intel.com with ESMTP; 30 Apr 2015 08:04:10 -0700 Received: from shecgisg004.sh.intel.com (shecgisg004.sh.intel.com [10.239.29.89]) by shvmail01.sh.intel.com with ESMTP id t3UF46No028593; Thu, 30 Apr 2015 23:04:06 +0800 Received: from shecgisg004.sh.intel.com (localhost [127.0.0.1]) by shecgisg004.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id t3UF42dF024006; Thu, 30 Apr 2015 23:04:05 +0800 Received: (from hzhan75@localhost) by shecgisg004.sh.intel.com (8.13.6/8.13.6/Submit) id t3UF42lH024002; Thu, 30 Apr 2015 23:04:02 +0800 From: Helin Zhang To: dev@dpdk.org Date: Thu, 30 Apr 2015 23:03:16 +0800 Message-Id: <1430406219-23901-11-git-send-email-helin.zhang@intel.com> X-Mailer: git-send-email 1.7.4.1 In-Reply-To: <1430406219-23901-1-git-send-email-helin.zhang@intel.com> References: <1429518150-28098-1-git-send-email-helin.zhang@intel.com> <1430406219-23901-1-git-send-email-helin.zhang@intel.com> Cc: monica.kenguva@intel.com, steven.j.murray@intel.com, shannon.nelson@intel.com Subject: [dpdk-dev] [PATCH v2 10/33] i40e/base: catch NVM write semaphore timeout and retry X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Apr 2015 15:05:32 -0000 In some circumstances, a multi-write transaction takes longer than the default 3 minutes timeout on the write semaphore. If the write failed with an EBUSY status, this is likely the problem. So here it tries to reacquire the semaphore and then retry the write. Signed-off-by: Helin Zhang --- lib/librte_pmd_i40e/i40e/i40e_nvm.c | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/lib/librte_pmd_i40e/i40e/i40e_nvm.c b/lib/librte_pmd_i40e/i40e/i40e_nvm.c index c63f5ba..e82f2a4 100644 --- a/lib/librte_pmd_i40e/i40e/i40e_nvm.c +++ b/lib/librte_pmd_i40e/i40e/i40e_nvm.c @@ -895,11 +895,13 @@ STATIC enum i40e_status_code i40e_nvmupd_state_writing(struct i40e_hw *hw, { enum i40e_status_code status; enum i40e_nvmupd_cmd upd_cmd; + bool retry_attempt = false; DEBUGFUNC("i40e_nvmupd_state_writing"); upd_cmd = i40e_nvmupd_validate_command(hw, cmd, perrno); +retry: switch (upd_cmd) { case I40E_NVMUPD_WRITE_CON: status = i40e_nvmupd_nvm_write(hw, cmd, bytes, perrno); @@ -938,6 +940,39 @@ STATIC enum i40e_status_code i40e_nvmupd_state_writing(struct i40e_hw *hw, *perrno = -ESRCH; break; } + + /* In some circumstances, a multi-write transaction takes longer + * than the default 3 minute timeout on the write semaphore. If + * the write failed with an EBUSY status, this is likely the problem, + * so here we try to reacquire the semaphore then retry the write. + * We only do one retry, then give up. + */ + if (status && (hw->aq.asq_last_status == I40E_AQ_RC_EBUSY) && + !retry_attempt) { + enum i40e_status_code old_status = status; + u32 old_asq_status = hw->aq.asq_last_status; + u32 gtime; + + gtime = rd32(hw, I40E_GLVFGEN_TIMER); + if (gtime >= hw->nvm.hw_semaphore_timeout) { + i40e_debug(hw, I40E_DEBUG_ALL, + "NVMUPD: write semaphore expired (%d >= %lld), retrying\n", + gtime, hw->nvm.hw_semaphore_timeout); + i40e_release_nvm(hw); + status = i40e_acquire_nvm(hw, I40E_RESOURCE_WRITE); + if (status) { + i40e_debug(hw, I40E_DEBUG_ALL, + "NVMUPD: write semaphore reacquire failed aq_err = %d\n", + hw->aq.asq_last_status); + status = old_status; + hw->aq.asq_last_status = old_asq_status; + } else { + retry_attempt = true; + goto retry; + } + } + } + return status; } -- 1.8.1.4