From: Nobin Mathew
Date: Wed, 20 Jul 2022 10:26:09 +0530
Subject: Re: VF is still resetting
To: users@dpdk.org
List-Id: DPDK usage discussions

Any pointers? I was asked on the i40e driver forum to post the question here.

The in-kernel iavf VF driver has a larger timeout,

#define IAVF_RESET_WAIT_COMPLETE_COUNT 2000

and also a timeout for the reset detection loop:

#define IAVF_RESET_WAIT_DETECTED_COUNT 500

We even tried

+#define IAVF_RESET_WAIT_CNT 3000

but no luck....

-Nobin

On Tue, Jul 19, 2022 at 9:48 PM Nobin Mathew wrote:
>
> Hi,
>
> We are running a DPDK app inside a pod, and orchestrating the app very
> frequently (test app).
>
> About 1 in 100 times we get an error:
>
> 2022-07-17T22:34:24.620291289+03:00 iavf_check_vf_reset_done(): reset VFR value: 3
> 2022-07-17T22:34:24.620310455+03:00 iavf_init_vf(): VF is still resetting
> 2022-07-17T22:34:24.620339697+03:00 iavf_dev_init(): Init vf failed
> 2022-07-17T22:34:24.620390802+03:00 EAL: Releasing PCI mapped resource for 0000:3b:0f.5
> 2022-07-17T22:34:24.620397381+03:00 EAL: Calling pci_unmap_resource for 0000:3b:0f.5 at 0x2101000000
> 2022-07-17T22:34:24.620442514+03:00 EAL: Calling pci_unmap_resource for 0000:3b:0f.5 at 0x2101010000
> 2022-07-17T22:34:24.729012277+03:00 EAL: Requested device 0000:3b:0f.5 cannot be used
> 2022-07-17T22:34:24.729028758+03:00 EAL: Bus (pci) probe failed.
>
> We added one log in the DPDK lib to print the VFGEN_RSTAT register of the
> VF.
> In problematic cases, we are seeing the value 3, which maps to
> 0xDEADBEEF.
>
> /* VF reset states - these are written into the RSTAT register:
>  * VFGEN_RSTAT on the VF
>  * When the PF initiates a reset, it writes 0
>  * When the reset is complete, it writes 1
>  * When the PF detects that the VF has recovered, it writes 2
>  * VF checks this register periodically to determine if a reset has occurred,
>  * then polls it to know when the reset is complete.
>  * If either the PF or VF reads the register while the hardware
>  * is in a reset state, it will return DEADBEEF, which, when masked
>  * will result in 3.
>  */
> enum virtchnl_vfr_states {
>         VIRTCHNL_VFR_INPROGRESS = 0,
>         VIRTCHNL_VFR_COMPLETED,
>         VIRTCHNL_VFR_VFACTIVE,
> };
>
> We also tried this patch, which increases the poll time, but it did not help.
> https://github.com/DPDK/dpdk/commit/be7226980c9ad4963b92b489c8afb17f08899953
>
> Details of the setup:
>
> DPDK library version
> 21.11
> VF Driver:-
> intel-iavf version 4.0.1-3.2
> PF driver:-
> sudo ethtool -i enp94s0f1
> driver: i40e
> version: 2.14.13
> firmware-version: 8.15 0x800096ca 20.0.17
>
> Since we are seeing 0xDEADBEEF, I am assuming the VF-PF reset mailbox msg
> is received by the PF, and the PF initiated the RESET sequence by writing
> VFSWR to the VPGEN_VFRTRIG register.
>
> I am not seeing
> " dev_err(&pf->pdev->dev, "VF reset check timeout on VF %d\n", "
> anywhere in syslog.
>
> Any pointers? Why does this happen (why is the VF reset not completing)?...
>
> One more question, what is the sequence of calls in the reset path?
> i40e_vc_process_vf_msg() -> VIRTCHNL_OP_RESET_VF i40e_vc_reset_vf() ->
> i40e_reset_vf() -> i40e_trigger_vf_reset() & i40e_cleanup_reset_vf()
>
> this one?
>
> -Nobin
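
To make the quoted VFGEN_RSTAT comment concrete, here is a minimal sketch of
the state decoding and of a bounded poll loop. It is not the iavf PMD code.
The mask value (a 2-bit state field in the lowest bits), every name below,
the fake register, and the choice to treat both COMPLETED and VFACTIVE as
"done" are assumptions made for illustration. The mask is only chosen to be
consistent with the comment's statement that a masked 0xDEADBEEF gives 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the VFR state is the two lowest bits of VFGEN_RSTAT.
 * Illustrative name and value, not copied from any driver header. */
#define VFR_STATE_MASK 0x3u

enum vfr_state {
    VFR_INPROGRESS  = 0, /* PF has initiated the reset */
    VFR_COMPLETED   = 1, /* reset is complete */
    VFR_VFACTIVE    = 2, /* PF has seen the VF recover */
    VFR_HW_IN_RESET = 3, /* masked 0xDEADBEEF: hardware still in reset */
};

/* Decode the state field from a raw register value. */
enum vfr_state vfr_state_from_rstat(uint32_t rstat)
{
    return (enum vfr_state)(rstat & VFR_STATE_MASK);
}

/* Hypothetical register accessor, so the loop can run without hardware. */
typedef uint32_t (*rstat_read_fn)(void *ctx);

/* Poll at most max_polls times. Returns the number of reads used on
 * success, or -1 if the state never became COMPLETED or VFACTIVE.
 * The last raw value read is stored in *last_rstat when it is non-NULL. */
int wait_vf_reset_done(rstat_read_fn read_rstat, void *ctx,
                       int max_polls, uint32_t *last_rstat)
{
    assert(read_rstat != NULL);

    for (int i = 0; i < max_polls; i++) {
        uint32_t raw = read_rstat(ctx);
        enum vfr_state state = vfr_state_from_rstat(raw);

        if (last_rstat != NULL)
            *last_rstat = raw;
        if (state == VFR_COMPLETED || state == VFR_VFACTIVE)
            return i + 1;
        /* A real driver would delay here between polls. */
    }
    return -1;
}

/* Fake register for experiments: returns 0xDEADBEEF for a number of
 * reads, then 1 (COMPLETED). A negative count means "never leaves reset". */
struct fake_rstat {
    int reads_left_in_reset;
};

uint32_t fake_read_rstat(void *ctx)
{
    struct fake_rstat *fake = ctx;

    if (fake->reads_left_in_reset < 0)
        return 0xDEADBEEFu;
    if (fake->reads_left_in_reset > 0) {
        fake->reads_left_in_reset--;
        return 0xDEADBEEFu;
    }
    return 1u;
}
```

With this model, a register that keeps returning 0xDEADBEEF fails the loop
for any poll count, so a larger count can only help if the hardware
eventually leaves the reset state. Whether that is what happens on the real
device is exactly the open question above.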