* [dpdk-dev] i40e: pci probe fails when using one bogus sfp @ 2017-06-08 9:29 Olivier Matz 2017-06-08 10:01 ` Bruce Richardson 0 siblings, 1 reply; 14+ messages in thread From: Olivier Matz @ 2017-06-08 9:29 UTC (permalink / raw) To: helin.zhang, jingjing.wu, dev Hi, One of our customers encounters an issue with dpdk when there is a bogus SFP on one of the ports. The following message is reported: PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 (note: 95 is EOPNOTSUPP/ENOTSUP) Unfortunately I cannot reproduce the issue to give more details, but the hypothesis is that it fails in i40e_dev_sync_phy_type(). It could be related to that patch: http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf To me, the expected behavior should be: - pci probe is succesful - the initialization of the port with faulty SFP fails - the initialization of the other ports is succesful Do you have any comment or idea to fix this issue? Thanks, Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-08 9:29 [dpdk-dev] i40e: pci probe fails when using one bogus sfp Olivier Matz @ 2017-06-08 10:01 ` Bruce Richardson 2017-06-08 10:13 ` Olivier Matz 0 siblings, 1 reply; 14+ messages in thread From: Bruce Richardson @ 2017-06-08 10:01 UTC (permalink / raw) To: Olivier Matz; +Cc: helin.zhang, jingjing.wu, dev On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > Hi, > > One of our customers encounters an issue with dpdk when there > is a bogus SFP on one of the ports. The following message is > reported: > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > Unfortunately I cannot reproduce the issue to give more details, > but the hypothesis is that it fails in i40e_dev_sync_phy_type(). > It could be related to that patch: > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > To me, the expected behavior should be: > - pci probe is succesful > - the initialization of the port with faulty SFP fails > - the initialization of the other ports is succesful > > Do you have any comment or idea to fix this issue? > And what is the current behaviour you are seeing? The whole PCI probe is terminating after the failure on the error port? Can that one problem port not just be blacklisted, since it's presumably unusable anyway? /Bruce ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-08 10:01 ` Bruce Richardson @ 2017-06-08 10:13 ` Olivier Matz 2017-06-12 8:45 ` Xing, Beilei 0 siblings, 1 reply; 14+ messages in thread From: Olivier Matz @ 2017-06-08 10:13 UTC (permalink / raw) To: Bruce Richardson; +Cc: helin.zhang, jingjing.wu, dev On Thu, 8 Jun 2017 11:01:54 +0100, Bruce Richardson <bruce.richardson@intel.com> wrote: > On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > > Hi, > > > > One of our customers encounters an issue with dpdk when there > > is a bogus SFP on one of the ports. The following message is > > reported: > > > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > > > Unfortunately I cannot reproduce the issue to give more details, > > but the hypothesis is that it fails in i40e_dev_sync_phy_type(). > > It could be related to that patch: > > > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > > > To me, the expected behavior should be: > > - pci probe is succesful > > - the initialization of the port with faulty SFP fails > > - the initialization of the other ports is succesful > > > > Do you have any comment or idea to fix this issue? > > > And what is the current behaviour you are seeing? The whole PCI probe is > terminating after the failure on the error port? Yes, the probe is terminating > Can that one problem > port not just be blacklisted, since it's presumably unusable anyway? This would imply the user (or the manager program) that launches the application to parse the logs to detect which port fails and update the configuration accordingly. I think it would be better to return an error at port initialization, so that the application can takes its dispositions directly. We could even imagine that the port could be reenabled later once the SFP is changed, without restarting the application. Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-08 10:13 ` Olivier Matz @ 2017-06-12 8:45 ` Xing, Beilei 2017-06-12 9:45 ` Olivier Matz 0 siblings, 1 reply; 14+ messages in thread From: Xing, Beilei @ 2017-06-12 8:45 UTC (permalink / raw) To: Olivier Matz, Richardson, Bruce; +Cc: Zhang, Helin, Wu, Jingjing, dev Hi Olivier, > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz > Sent: Thursday, June 8, 2017 6:14 PM > To: Richardson, Bruce <bruce.richardson@intel.com> > Cc: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing > <jingjing.wu@intel.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > On Thu, 8 Jun 2017 11:01:54 +0100, Bruce Richardson > <bruce.richardson@intel.com> wrote: > > On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > > > Hi, > > > > > > One of our customers encounters an issue with dpdk when there is a > > > bogus SFP on one of the ports. The following message is > > > reported: > > > > > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > > > > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > > > > > Unfortunately I cannot reproduce the issue to give more details, but > > > the hypothesis is that it fails in i40e_dev_sync_phy_type(). > > > It could be related to that patch: > > > > > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > > > > > To me, the expected behavior should be: > > > - pci probe is succesful > > > - the initialization of the port with faulty SFP fails > > > - the initialization of the other ports is succesful > > > > > > Do you have any comment or idea to fix this issue? > > > > > And what is the current behaviour you are seeing? The whole PCI probe > > is terminating after the failure on the error port? > > Yes, the probe is terminating Sorry I'm not very clear about the termination of PCI probe you mentioned. I did some test in current code base: there're two ports (87:00.0 and 87:00.2)bound to igb_uio, and force the first port to fail to initialize, I find that the second port still can finish initialization successfully. I thought it has met your request. Please correct me if I'm wrong. EAL: PCI device 0000:87:00.0 on NUMA socket -1 EAL: probe driver: 8086:1572 net_i40e ~failed eth_i40e_dev_init(): Failed to sync phy type: 0 EAL: PCI device 0000:87:00.1 on NUMA socket -1 EAL: probe driver: 8086:1572 net_i40e EAL: PCI device 0000:87:00.2 on NUMA socket -1 EAL: probe driver: 8086:1572 net_i40e ~succeed Beilei > > > Can that one problem > > port not just be blacklisted, since it's presumably unusable anyway? > > This would imply the user (or the manager program) that launches the > application to parse the logs to detect which port fails and update the > configuration accordingly. > > I think it would be better to return an error at port initialization, so that the > application can takes its dispositions directly. We could even imagine that the > port could be reenabled later once the SFP is changed, without restarting the > application. > > > Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-12 8:45 ` Xing, Beilei @ 2017-06-12 9:45 ` Olivier Matz 2017-06-12 15:53 ` Wu, Jingjing 2017-06-12 16:23 ` Wu, Jingjing 0 siblings, 2 replies; 14+ messages in thread From: Olivier Matz @ 2017-06-12 9:45 UTC (permalink / raw) To: Xing, Beilei; +Cc: Richardson, Bruce, Zhang, Helin, Wu, Jingjing, dev Hi Beilei, On Mon, 12 Jun 2017 08:45:43 +0000, "Xing, Beilei" <beilei.xing@intel.com> wrote: > Hi Olivier, > > > -----Original Message----- > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz > > Sent: Thursday, June 8, 2017 6:14 PM > > To: Richardson, Bruce <bruce.richardson@intel.com> > > Cc: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing > > <jingjing.wu@intel.com>; dev@dpdk.org > > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > On Thu, 8 Jun 2017 11:01:54 +0100, Bruce Richardson > > <bruce.richardson@intel.com> wrote: > > > On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > > > > Hi, > > > > > > > > One of our customers encounters an issue with dpdk when there is a > > > > bogus SFP on one of the ports. The following message is > > > > reported: > > > > > > > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > > > > > > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > > > > > > > Unfortunately I cannot reproduce the issue to give more details, but > > > > the hypothesis is that it fails in i40e_dev_sync_phy_type(). > > > > It could be related to that patch: > > > > > > > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > > > > > > > To me, the expected behavior should be: > > > > - pci probe is succesful > > > > - the initialization of the port with faulty SFP fails > > > > - the initialization of the other ports is succesful > > > > > > > > Do you have any comment or idea to fix this issue? > > > > > > > And what is the current behaviour you are seeing? The whole PCI probe > > > is terminating after the failure on the error port? > > > > Yes, the probe is terminating > > Sorry I'm not very clear about the termination of PCI probe you mentioned. > I did some test in current code base: there're two ports (87:00.0 and 87:00.2)bound to igb_uio, and force the first port to fail to initialize, I find that the second port still can finish initialization successfully. I thought it has met your request. Please correct me if I'm wrong. > > EAL: PCI device 0000:87:00.0 on NUMA socket -1 > EAL: probe driver: 8086:1572 net_i40e > ~failed > eth_i40e_dev_init(): Failed to sync phy type: 0 > EAL: PCI device 0000:87:00.1 on NUMA socket -1 > EAL: probe driver: 8086:1572 net_i40e > EAL: PCI device 0000:87:00.2 on NUMA socket -1 > EAL: probe driver: 8086:1572 net_i40e > ~succeed Thank you for your quick answer. Yes, the pci probing continues for the other ports even if one port failed (since v17.05, commit 10f6c93cea). But I find a bit strange to have this check about the SFP in the PCI probing function instead of having it the port initialization function. My understanding is the SFP check is not related to PCI probing. Is it consistent with other drivers? In case of failure, it shifts the port ids of next ports, making it harder to recognize them in the application. With current code, after a failure, if the user replaces the faulty SFP after the application is started, it requires the application to support hotplug to ask to probe the PCI again to make the port appear again. If the failure is moved in the port start function, it would just require the application to start the port again. Regards Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-12 9:45 ` Olivier Matz @ 2017-06-12 15:53 ` Wu, Jingjing 2017-06-12 16:25 ` Wu, Jingjing 2017-06-12 16:23 ` Wu, Jingjing 1 sibling, 1 reply; 14+ messages in thread From: Wu, Jingjing @ 2017-06-12 15:53 UTC (permalink / raw) To: Olivier Matz, Xing, Beilei; +Cc: Richardson, Bruce, Zhang, Helin, dev > -----Original Message----- > From: Olivier Matz [mailto:olivier.matz@6wind.com] > Sent: Monday, June 12, 2017 5:46 PM > To: Xing, Beilei <beilei.xing@intel.com> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>; > Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > Hi Beilei, > > On Mon, 12 Jun 2017 08:45:43 +0000, "Xing, Beilei" <beilei.xing@intel.com> wrote: > > Hi Olivier, > > > > > -----Original Message----- > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz > > > Sent: Thursday, June 8, 2017 6:14 PM > > > To: Richardson, Bruce <bruce.richardson@intel.com> > > > Cc: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing > > > <jingjing.wu@intel.com>; dev@dpdk.org > > > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > > > On Thu, 8 Jun 2017 11:01:54 +0100, Bruce Richardson > > > <bruce.richardson@intel.com> wrote: > > > > On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > > > > > Hi, > > > > > > > > > > One of our customers encounters an issue with dpdk when there is a > > > > > bogus SFP on one of the ports. The following message is > > > > > reported: > > > > > > > > > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > > > > > > > > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > > > > > > > > > Unfortunately I cannot reproduce the issue to give more details, but > > > > > the hypothesis is that it fails in i40e_dev_sync_phy_type(). > > > > > It could be related to that patch: > > > > > > > > > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > > > > > > > > > To me, the expected behavior should be: > > > > > - pci probe is succesful > > > > > - the initialization of the port with faulty SFP fails > > > > > - the initialization of the other ports is succesful > > > > > > > > > > Do you have any comment or idea to fix this issue? > > > > > > > > > And what is the current behaviour you are seeing? The whole PCI probe > > > > is terminating after the failure on the error port? > > > > > > Yes, the probe is terminating > > > > Sorry I'm not very clear about the termination of PCI probe you mentioned. > > I did some test in current code base: there're two ports (87:00.0 and 87:00.2)bound to > igb_uio, and force the first port to fail to initialize, I find that the second port still can finish > initialization successfully. I thought it has met your request. Please correct me if I'm wrong. > > > > EAL: PCI device 0000:87:00.0 on NUMA socket -1 > > EAL: probe driver: 8086:1572 net_i40e > > ~failed > > eth_i40e_dev_init(): Failed to sync phy type: 0 > > EAL: PCI device 0000:87:00.1 on NUMA socket -1 > > EAL: probe driver: 8086:1572 net_i40e > > EAL: PCI device 0000:87:00.2 on NUMA socket -1 > > EAL: probe driver: 8086:1572 net_i40e > > ~succeed > > > Thank you for your quick answer. > > Yes, the pci probing continues for the other ports even if one port > failed (since v17.05, commit 10f6c93cea). > > But I find a bit strange to have this check about the SFP in the > PCI probing function instead of having it the port initialization > function. My understanding is the SFP check is not related to PCI > probing. Is it consistent with other drivers? > > In case of failure, it shifts the port ids of next ports, making it > harder to recognize them in the application. > > With current code, after a failure, if the user replaces the faulty SFP > after the application is started, it requires the application to support > hotplug to ask to probe the PCI again to make the port appear again. > > If the failure is moved in the port start function, it would just > require the application to start the port again. > > > Regards > Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-12 15:53 ` Wu, Jingjing @ 2017-06-12 16:25 ` Wu, Jingjing 0 siblings, 0 replies; 14+ messages in thread From: Wu, Jingjing @ 2017-06-12 16:25 UTC (permalink / raw) To: Wu, Jingjing, Olivier Matz, Xing, Beilei Cc: Richardson, Bruce, Zhang, Helin, dev Please ignore the empty mail. > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Wu, Jingjing > Sent: Monday, June 12, 2017 11:54 PM > To: Olivier Matz <olivier.matz@6wind.com>; Xing, Beilei <beilei.xing@intel.com> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Zhang, Helin <helin.zhang@intel.com>; > dev@dpdk.org > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > > -----Original Message----- > > From: Olivier Matz [mailto:olivier.matz@6wind.com] > > Sent: Monday, June 12, 2017 5:46 PM > > To: Xing, Beilei <beilei.xing@intel.com> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Zhang, Helin > <helin.zhang@intel.com>; > > Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org > > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > Hi Beilei, > > > > On Mon, 12 Jun 2017 08:45:43 +0000, "Xing, Beilei" <beilei.xing@intel.com> wrote: > > > Hi Olivier, > > > > > > > -----Original Message----- > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz > > > > Sent: Thursday, June 8, 2017 6:14 PM > > > > To: Richardson, Bruce <bruce.richardson@intel.com> > > > > Cc: Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing > > > > <jingjing.wu@intel.com>; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > > > > > On Thu, 8 Jun 2017 11:01:54 +0100, Bruce Richardson > > > > <bruce.richardson@intel.com> wrote: > > > > > On Thu, Jun 08, 2017 at 11:29:17AM +0200, Olivier Matz wrote: > > > > > > Hi, > > > > > > > > > > > > One of our customers encounters an issue with dpdk when there is a > > > > > > bogus SFP on one of the ports. The following message is > > > > > > reported: > > > > > > > > > > > > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95 > > > > > > > > > > > > (note: 95 is EOPNOTSUPP/ENOTSUP) > > > > > > > > > > > > Unfortunately I cannot reproduce the issue to give more details, but > > > > > > the hypothesis is that it fails in i40e_dev_sync_phy_type(). > > > > > > It could be related to that patch: > > > > > > > > > > > > http://dpdk.org/browse/dpdk/commit/?id=edfb226f69bf > > > > > > > > > > > > To me, the expected behavior should be: > > > > > > - pci probe is succesful > > > > > > - the initialization of the port with faulty SFP fails > > > > > > - the initialization of the other ports is succesful > > > > > > > > > > > > Do you have any comment or idea to fix this issue? > > > > > > > > > > > And what is the current behaviour you are seeing? The whole PCI probe > > > > > is terminating after the failure on the error port? > > > > > > > > Yes, the probe is terminating > > > > > > Sorry I'm not very clear about the termination of PCI probe you mentioned. > > > I did some test in current code base: there're two ports (87:00.0 and 87:00.2)bound to > > igb_uio, and force the first port to fail to initialize, I find that the second port still can finish > > initialization successfully. I thought it has met your request. Please correct me if I'm wrong. > > > > > > EAL: PCI device 0000:87:00.0 on NUMA socket -1 > > > EAL: probe driver: 8086:1572 net_i40e > > > ~failed > > > eth_i40e_dev_init(): Failed to sync phy type: 0 > > > EAL: PCI device 0000:87:00.1 on NUMA socket -1 > > > EAL: probe driver: 8086:1572 net_i40e > > > EAL: PCI device 0000:87:00.2 on NUMA socket -1 > > > EAL: probe driver: 8086:1572 net_i40e > > > ~succeed > > > > > > Thank you for your quick answer. > > > > Yes, the pci probing continues for the other ports even if one port > > failed (since v17.05, commit 10f6c93cea). > > > > But I find a bit strange to have this check about the SFP in the > > PCI probing function instead of having it the port initialization > > function. My understanding is the SFP check is not related to PCI > > probing. Is it consistent with other drivers? > > > > In case of failure, it shifts the port ids of next ports, making it > > harder to recognize them in the application. > > > > With current code, after a failure, if the user replaces the faulty SFP > > after the application is started, it requires the application to support > > hotplug to ask to probe the PCI again to make the port appear again. > > > > If the failure is moved in the port start function, it would just > > require the application to start the port again. > > > > > > Regards > > Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-12 9:45 ` Olivier Matz 2017-06-12 15:53 ` Wu, Jingjing @ 2017-06-12 16:23 ` Wu, Jingjing 2017-06-13 8:27 ` Olivier MATZ 1 sibling, 1 reply; 14+ messages in thread From: Wu, Jingjing @ 2017-06-12 16:23 UTC (permalink / raw) To: Olivier Matz, Xing, Beilei; +Cc: Richardson, Bruce, Zhang, Helin, dev HI, Olivier > Thank you for your quick answer. > > Yes, the pci probing continues for the other ports even if one port > failed (since v17.05, commit 10f6c93cea). > > But I find a bit strange to have this check about the SFP in the > PCI probing function instead of having it the port initialization > function. My understanding is the SFP check is not related to PCI > probing. Is it consistent with other drivers? > Could your customer help to check what is the exactly error code is by Checking the "hw->aq.asq_last_status" when eth_i40e_dev_init() fails. Yes, it seems better PHY init fails doesn't block PCI probe. Just compared with i40e Kernel version, PHY init fails doesn't block CPI probe. And there is watchdog task to Check the PHY status. But DPDK is polling mode, If PCI probe fails, PCI probe continues, then application need poll PHY status to support SFP change. And I also checked ixgbe driver, it seems phy init is done at probe time. In my opinion, dev_start and dev_stop is meaning ready for receiving and transmitting packets, it may not be suitable to put it in the start/stop phase. > > Regards > Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-12 16:23 ` Wu, Jingjing @ 2017-06-13 8:27 ` Olivier MATZ 2017-06-13 14:14 ` Wu, Jingjing 0 siblings, 1 reply; 14+ messages in thread From: Olivier MATZ @ 2017-06-13 8:27 UTC (permalink / raw) To: Wu, Jingjing; +Cc: Xing, Beilei, Richardson, Bruce, Zhang, Helin, dev Hi Jingjing, On Mon, 12 Jun 2017 16:23:47 +0000, "Wu, Jingjing" <jingjing.wu@intel.com> wrote: > HI, Olivier > > > Thank you for your quick answer. > > > > Yes, the pci probing continues for the other ports even if one port > > failed (since v17.05, commit 10f6c93cea). > > > > But I find a bit strange to have this check about the SFP in the > > PCI probing function instead of having it the port initialization > > function. My understanding is the SFP check is not related to PCI > > probing. Is it consistent with other drivers? > > > Could your customer help to check what is the exactly error code is by > Checking the "hw->aq.asq_last_status" when eth_i40e_dev_init() fails. I'm afraid it won't be possible, since it's a random issue that is not reproducible. What do you think about adding a log in i40e_dev_sync_phy_type() to display the status value in case of failure? It would help for next times. I can submit a patch for that if you want. > Yes, it seems better PHY init fails doesn't block PCI probe. Just compared with i40e > Kernel version, PHY init fails doesn't block CPI probe. And there is watchdog task to > Check the PHY status. But DPDK is polling mode, If PCI probe fails, PCI probe continues, > then application need poll PHY status to support SFP change. From what I understand, i40e_dev_sync_phy_type() was added to know the PHY capabilities, which useful for instance for devinfo(). Indeed, devinfo() can be call before the port is started, so we need to have get the PHY capabilities before starting the port. I've done a quick patch that: - keeps the call to i40e_dev_sync_phy_type() in pci probing but continue in case of failure - add another call to i40e_dev_sync_phy_type() in port start function I think it would solve the issue without impacting the result of the devinfo() function. What would you think of a patch like this? > And I also checked ixgbe driver, it seems phy init is done at probe time. > In my opinion, dev_start and dev_stop is meaning ready for receiving and transmitting > packets, it may not be suitable to put it in the start/stop phase. I'm wondering: what would/should occur if the SFP is unplugged and replugged while the port is running? I suppose we don't have any PCI notification when unplugging/plugging the SFP, so I'm not sure we should have this check at the PCI level, because the application does not know if the bus has to be probed again. Thanks Olivier ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-13 8:27 ` Olivier MATZ @ 2017-06-13 14:14 ` Wu, Jingjing 2017-06-15 9:03 ` Olivier MATZ 0 siblings, 1 reply; 14+ messages in thread From: Wu, Jingjing @ 2017-06-13 14:14 UTC (permalink / raw) To: Olivier MATZ; +Cc: Xing, Beilei, Richardson, Bruce, Zhang, Helin, dev > -----Original Message----- > From: Olivier MATZ [mailto:olivier.matz@6wind.com] > Sent: Tuesday, June 13, 2017 4:28 PM > To: Wu, Jingjing <jingjing.wu@intel.com> > Cc: Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; > Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > Hi Jingjing, > > On Mon, 12 Jun 2017 16:23:47 +0000, "Wu, Jingjing" <jingjing.wu@intel.com> wrote: > > HI, Olivier > > > > > Thank you for your quick answer. > > > > > > Yes, the pci probing continues for the other ports even if one port > > > failed (since v17.05, commit 10f6c93cea). > > > > > > But I find a bit strange to have this check about the SFP in the > > > PCI probing function instead of having it the port initialization > > > function. My understanding is the SFP check is not related to PCI > > > probing. Is it consistent with other drivers? > > > > > Could your customer help to check what is the exactly error code is by > > Checking the "hw->aq.asq_last_status" when eth_i40e_dev_init() fails. > > I'm afraid it won't be possible, since it's a random issue that is > not reproducible. > > What do you think about adding a log in i40e_dev_sync_phy_type() to > display the status value in case of failure? It would help for next > times. I can submit a patch for that if you want. > Yes, It makes sense. Thanks for doing that. > > Yes, it seems better PHY init fails doesn't block PCI probe. Just compared with i40e > > Kernel version, PHY init fails doesn't block CPI probe. And there is watchdog task to > > Check the PHY status. But DPDK is polling mode, If PCI probe fails, PCI probe continues, > > then application need poll PHY status to support SFP change. > > From what I understand, i40e_dev_sync_phy_type() was added > to know the PHY capabilities, which useful for instance for devinfo(). > Indeed, devinfo() can be call before the port is started, so > we need to have get the PHY capabilities before starting the port. > > I've done a quick patch that: > - keeps the call to i40e_dev_sync_phy_type() in pci probing but > continue in case of failure > - add another call to i40e_dev_sync_phy_type() in port start > function > I think this workaround would not introduce issue if SFP is ready on NIC before Probe. But I'm not sure if it can resolve the issue, because during probe, dozens of initialization work Is ongoing. I think we need to verify it PHY init during start phase works or not. Maybe we Need to do more than that? Not sure. > I think it would solve the issue without impacting the result > of the devinfo() function. What would you think of a patch like > this? > > > And I also checked ixgbe driver, it seems phy init is done at probe time. > > In my opinion, dev_start and dev_stop is meaning ready for receiving and transmitting > > packets, it may not be suitable to put it in the start/stop phase. > > I'm wondering: what would/should occur if the SFP is unplugged and > replugged while the port is running? > > I suppose we don't have any PCI notification when unplugging/plugging > the SFP, so I'm not sure we should have this check at the PCI level, > because the application does not know if the bus has to be probed again. > We can consider about unplugging/plugging as link event in DPDK. Now we have LSC or user can polling link status. Maybe that is a way to go. Thanks Jingjing ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp 2017-06-13 14:14 ` Wu, Jingjing @ 2017-06-15 9:03 ` Olivier MATZ 2017-06-15 9:08 ` [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using " Olivier Matz 0 siblings, 1 reply; 14+ messages in thread From: Olivier MATZ @ 2017-06-15 9:03 UTC (permalink / raw) To: Wu, Jingjing; +Cc: Xing, Beilei, Richardson, Bruce, Zhang, Helin, dev Hi, On Tue, 13 Jun 2017 14:14:43 +0000, "Wu, Jingjing" <jingjing.wu@intel.com> wrote: > > -----Original Message----- > > From: Olivier MATZ [mailto:olivier.matz@6wind.com] > > Sent: Tuesday, June 13, 2017 4:28 PM > > To: Wu, Jingjing <jingjing.wu@intel.com> > > Cc: Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; > > Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org > > Subject: Re: [dpdk-dev] i40e: pci probe fails when using one bogus sfp > > > > Hi Jingjing, > > > > On Mon, 12 Jun 2017 16:23:47 +0000, "Wu, Jingjing" <jingjing.wu@intel.com> wrote: > > > HI, Olivier > > > > > > > Thank you for your quick answer. > > > > > > > > Yes, the pci probing continues for the other ports even if one port > > > > failed (since v17.05, commit 10f6c93cea). > > > > > > > > But I find a bit strange to have this check about the SFP in the > > > > PCI probing function instead of having it the port initialization > > > > function. My understanding is the SFP check is not related to PCI > > > > probing. Is it consistent with other drivers? > > > > > > > Could your customer help to check what is the exactly error code is by > > > Checking the "hw->aq.asq_last_status" when eth_i40e_dev_init() fails. > > > > I'm afraid it won't be possible, since it's a random issue that is > > not reproducible. > > > > What do you think about adding a log in i40e_dev_sync_phy_type() to > > display the status value in case of failure? It would help for next > > times. I can submit a patch for that if you want. > > > Yes, It makes sense. Thanks for doing that. > > > > Yes, it seems better PHY init fails doesn't block PCI probe. Just compared with i40e > > > Kernel version, PHY init fails doesn't block CPI probe. And there is watchdog task to > > > Check the PHY status. But DPDK is polling mode, If PCI probe fails, PCI probe continues, > > > then application need poll PHY status to support SFP change. > > > > From what I understand, i40e_dev_sync_phy_type() was added > > to know the PHY capabilities, which useful for instance for devinfo(). > > Indeed, devinfo() can be call before the port is started, so > > we need to have get the PHY capabilities before starting the port. > > > > I've done a quick patch that: > > - keeps the call to i40e_dev_sync_phy_type() in pci probing but > > continue in case of failure > > - add another call to i40e_dev_sync_phy_type() in port start > > function > > > I think this workaround would not introduce issue if SFP is ready on NIC before Probe. > But I'm not sure if it can resolve the issue, because during probe, dozens of initialization work > Is ongoing. I think we need to verify it PHY init during start phase works or not. Maybe we > Need to do more than that? Not sure. After discussion with Helin, I moved the call to i40e_dev_sync_phy_type() in dev_configure(). I'm sending it as a reply to this mail. I see if I can get my patch tested by the customer. > > > I think it would solve the issue without impacting the result > > of the devinfo() function. What would you think of a patch like > > this? > > > > > And I also checked ixgbe driver, it seems phy init is done at probe time. > > > In my opinion, dev_start and dev_stop is meaning ready for receiving and transmitting > > > packets, it may not be suitable to put it in the start/stop phase. > > > > I'm wondering: what would/should occur if the SFP is unplugged and > > replugged while the port is running? > > > > I suppose we don't have any PCI notification when unplugging/plugging > > the SFP, so I'm not sure we should have this check at the PCI level, > > because the application does not know if the bus has to be probed again. > > > We can consider about unplugging/plugging as link event in DPDK. Now we > have LSC or user can polling link status. Maybe that is a way to go. Yes, I agree. Thanks for the confirmation. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using bogus sfp 2017-06-15 9:03 ` Olivier MATZ @ 2017-06-15 9:08 ` Olivier Matz 2017-06-23 9:25 ` Wu, Jingjing 0 siblings, 1 reply; 14+ messages in thread From: Olivier Matz @ 2017-06-15 9:08 UTC (permalink / raw) To: dev, jingjing.wu, helin.zhang; +Cc: beilei.xing, bruce.richardson When a port is using a bogus SFP, the PCI probing returns an error, preventing to register a portid. To give a better chance to the applications to retry after the SFP is changed, move this check in eth_i40e_dev_configure(), so that only a port reconfiguration is needed to retry. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> --- drivers/net/i40e/i40e_ethdev.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index d3428b9c0..1f256d767 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -1142,11 +1142,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) config_floating_veb(dev); /* Clear PXE mode */ i40e_clear_pxe_mode(hw); - ret = i40e_dev_sync_phy_type(hw); - if (ret) { - PMD_INIT_LOG(ERR, "Failed to sync phy type: %d", ret); - goto err_sync_phy_type; - } + i40e_dev_sync_phy_type(hw); + /* * On X710, performance number is far from the expectation on recent * firmware versions. The fix for this issue may not be integrated in @@ -1331,7 +1328,6 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) err_qp_pool_init: err_parameter_init: err_get_capabilities: -err_sync_phy_type: (void)i40e_shutdown_adminq(hw); return ret; @@ -1470,9 +1466,14 @@ i40e_dev_configure(struct rte_eth_dev *dev) struct i40e_adapter *ad = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); + struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private); enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode; int i, ret; + ret = i40e_dev_sync_phy_type(hw); + if (ret) + return ret; + /* Initialize to TRUE. If any of Rx queues doesn't meet the * bulk allocation or vector Rx preconditions we will reset it. */ @@ -9176,8 +9177,11 @@ i40e_dev_sync_phy_type(struct i40e_hw *hw) status = i40e_aq_get_phy_capabilities(hw, false, true, &phy_ab, NULL); - if (status) + if (status) { + PMD_INIT_LOG(WARNING, "Failed to sync phy type: status=%d", + status); return ret; + } return 0; } -- 2.11.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using bogus sfp 2017-06-15 9:08 ` [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using " Olivier Matz @ 2017-06-23 9:25 ` Wu, Jingjing 2017-06-23 10:11 ` Ferruh Yigit 0 siblings, 1 reply; 14+ messages in thread From: Wu, Jingjing @ 2017-06-23 9:25 UTC (permalink / raw) To: Olivier Matz, dev, Zhang, Helin; +Cc: Xing, Beilei, Richardson, Bruce > -----Original Message----- > From: Olivier Matz [mailto:olivier.matz@6wind.com] > Sent: Thursday, June 15, 2017 5:09 PM > To: dev@dpdk.org; Wu, Jingjing <jingjing.wu@intel.com>; Zhang, Helin > <helin.zhang@intel.com> > Cc: Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce <bruce.richardson@intel.com> > Subject: [PATCH] net/i40e: avoid PCI probing failure when using bogus sfp > > When a port is using a bogus SFP, the PCI probing returns an error, > preventing to register a portid. > > To give a better chance to the applications to retry after the SFP is > changed, move this check in eth_i40e_dev_configure(), so that only a > port reconfiguration is needed to retry. > > Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Jingjing Wu <jingjing.wu@intel.com> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using bogus sfp 2017-06-23 9:25 ` Wu, Jingjing @ 2017-06-23 10:11 ` Ferruh Yigit 0 siblings, 0 replies; 14+ messages in thread From: Ferruh Yigit @ 2017-06-23 10:11 UTC (permalink / raw) To: Wu, Jingjing, Olivier Matz, dev, Zhang, Helin Cc: Xing, Beilei, Richardson, Bruce On 6/23/2017 10:25 AM, Wu, Jingjing wrote: > > >> -----Original Message----- >> From: Olivier Matz [mailto:olivier.matz@6wind.com] >> Sent: Thursday, June 15, 2017 5:09 PM >> To: dev@dpdk.org; Wu, Jingjing <jingjing.wu@intel.com>; Zhang, Helin >> <helin.zhang@intel.com> >> Cc: Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce <bruce.richardson@intel.com> >> Subject: [PATCH] net/i40e: avoid PCI probing failure when using bogus sfp >> >> When a port is using a bogus SFP, the PCI probing returns an error, >> preventing to register a portid. >> >> To give a better chance to the applications to retry after the SFP is >> changed, move this check in eth_i40e_dev_configure(), so that only a >> port reconfiguration is needed to retry. >> >> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> > Acked-by: Jingjing Wu <jingjing.wu@intel.com> Applied to dpdk-next-net/master, thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-06-23 10:11 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-06-08 9:29 [dpdk-dev] i40e: pci probe fails when using one bogus sfp Olivier Matz 2017-06-08 10:01 ` Bruce Richardson 2017-06-08 10:13 ` Olivier Matz 2017-06-12 8:45 ` Xing, Beilei 2017-06-12 9:45 ` Olivier Matz 2017-06-12 15:53 ` Wu, Jingjing 2017-06-12 16:25 ` Wu, Jingjing 2017-06-12 16:23 ` Wu, Jingjing 2017-06-13 8:27 ` Olivier MATZ 2017-06-13 14:14 ` Wu, Jingjing 2017-06-15 9:03 ` Olivier MATZ 2017-06-15 9:08 ` [dpdk-dev] [PATCH] net/i40e: avoid PCI probing failure when using " Olivier Matz 2017-06-23 9:25 ` Wu, Jingjing 2017-06-23 10:11 ` Ferruh Yigit
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).