From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id EE6B5F1F; Fri, 20 Oct 2017 03:15:16 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 19 Oct 2017 18:15:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.43,404,1503385200"; d="scan'208";a="911745065" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.241.226.10]) ([10.241.226.10]) by FMSMGA003.fm.intel.com with ESMTP; 19 Oct 2017 18:15:15 -0700 To: "Patil, Harish" , Thomas Monjalon Cc: "dev@dpdk.org" , Jianfeng Tan , Jingjing Wu , "Thotton, Shijith" , Gregory Etelson , George Prekas , "stable@dpdk.org" , Sergio Gonzalez Monroy , "Glynn, Michael J" References: <20171017201436.65270-1-ferruh.yigit@intel.com> <2156540.XK3Rr5JXAh@xps> From: Ferruh Yigit Message-ID: Date: Thu, 19 Oct 2017 18:15:15 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-stable] [PATCH] igb_uio: revert open and release operations X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 Oct 2017 01:15:17 -0000 On 10/19/2017 3:43 PM, Patil, Harish wrote: > -----Original Message----- > From: Harish Patil > Date: Tuesday, October 17, 2017 at 9:50 PM > To: Thomas Monjalon , Ferruh Yigit > > Cc: "dev@dpdk.org" , Jianfeng Tan , > Jingjing Wu , "Thotton, Shijith" > , Gregory Etelson , George > Prekas , "stable@dpdk.org" > Subject: Re: [PATCH] igb_uio: revert open and release operations >> >> >> -----Original Message----- >> From: Thomas Monjalon >> Date: Tuesday, October 17, 2017 at 1:33 PM >> To: Ferruh Yigit , Harish Patil >> >> Cc: "dev@dpdk.org" , Jianfeng Tan , >> Jingjing Wu , "Thotton, Shijith" >> , Gregory Etelson , George >> Prekas , "stable@dpdk.org" >> Subject: Re: [PATCH] igb_uio: revert open and release operations >> >>> 17/10/2017 22:14, Ferruh Yigit: >>>> There were bug reports about terminated application may leave device in >>>> undesired state: >>>> http://dpdk.org/ml/archives/dev/2016-November/049745.html >>>> http://dpdk.org/ml/archives/dev/2016-November/050932.html >>>> >>>> And a proposal to fix: >>>> http://dpdk.org/ml/archives/dev/2016-December/051844.html >>>> >>>> Later another proposal triggered the discussion: >>>> http://dpdk.org/ml/archives/dev/2017-May/066317.html >>>> >>>> Finally a fix patch pushed into v17.08: >>>> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of >>>> device file") >>>> >>>> Later a regression report sent related to the pushed patch: >>>> http://dpdk.org/ml/archives/dev/2017-September/075236.html >>>> >>>> And a fix for regression integrated into v17.11-rc1: >>>> http://dpdk.org/ml/archives/dev/2017-October/079166.html >>>> Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in >>>> VM") >>>> Commit: 6b9ed026a870 ("igb_uio: fix build with kernel <= 3.17") >>>> >>>> Even after the fix qede PMD reported to be broken: >>>> http://dpdk.org/ml/archives/dev/2017-October/079359.html >>>> >>>> So this patch reverts original fix and related commits. The related >>>> igb_uio code part turns back to v17.05 base. >>> [...] >>>> --- >>>> It would be nice to solve this issue in LTS release, but being close to >>>> the release and the error report without details makes it hard to work >>>> more on this issue. >>> >>> With this revert, we are back to the original issue. >>> We must really try the proposed solution. >>> Harish, please describe your issue and think how it could be fixed. >>> Jingjing made it work for i40e. >>> I know it is less effort to request a simple revert. >>> Please let's try to fix it once for all. >> >> Hi Ferruh/Thomas, >> I’m discussing it internally, so please hold on committing this patch till >> I revert back to you. >> >> Thanks. > >> [Harish] > >> 1) With the introduction of: >> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of >> device file”) > We saw failures with qede PF & SR-IOV VF initialization in PCI passthru > mode. > >> >> PF PCI passthru mode initialization failure was resolved by: >> “Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in >> VM") Thank you for the update. > >> SR-IOV VF PCI passthru mode initialization issue is that PCI FLR and >> related device cleanup is not completed by the time VF driver starts >> loading. It results in the mbox command failure sent over the HW channel >> between VF and PF. This seems same reason why i40e added a check and wait loop. >> >> Even though pci_reset_function() waits for the stipulated amount of time >> per standards, VF FLR takes longer than that and pci_reset_function() & >> igb_uio_open() call returns before FLR completes and VF PMD driver tries >> to load before FLR completes leading to VF PMD initialization failure. >> >> >> We can work around this problem by adding driver delay/retry logic since >> there is no deterministic way of detecting FLR completion. But this is >> going to increase the driver load time. >> > >> 2) With the above patch ("igb_uio: issue FLR during open and release of >> device file), FLR is going to be issued unconditionally on all devices >> during igb_uio_open. We think it’s an over kill. FLR is required only for >> previous abnormal app termination. We already handle the abnormal app >> termination by doing necessary cleanup in the driver during load. This >> cleanup is more efficient as it is done only when required. So we feel >> that the drivers/devices needing such cleanup (the two cases listed >> below) should do it conditionally when required rather than >> igb_uio_open() unconditionally performing FLR all the time. >> >> e.g., > - cdb166963cae ("net/liquidio: add API for VF FLR”) Both 1) and 2) related to the pci_reset during open(). But the main functionality we are looking for is the pci_reset in release(). So we can remove reset during open() [1]. Will disabling pci_reset in open() [2] solve your problems? Jingjing, Jianfeng, What do you think? [1] Perhaps will need to revert or partially revert liquidio patch after this. [2] disable line "pci_reset_function(dev);" in the igbuio_pci_open() http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/igb_uio/igb_uio.c#n339 >> >> >> - http://dpdk.org/ml/archives/dev/2017-May/066317.html > > Thanks, > Harish >> >