DPDK patches and discussions
* [dpdk-dev] Question about unsupported transceivers
@ 2015-10-13 17:22 Alex Forster
  2015-10-13 18:57 ` Alex Forster
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Forster @ 2015-10-13 17:22 UTC (permalink / raw)
  To: dev

Hi everybody, apologies for coming to this list with a tech support question.

I'm completely stumped about using non-Intel transceivers with DPDK. testpmd is bailing here:

  PMD: eth_ixgbe_dev_init(): Unsupported SFP+ Module
  PMD: eth_ixgbe_dev_init(): Hardware Initialization Failure: -19

My box is an x64 server running Debian 8 (Jessie) with two X520-Q1 cards using Finisar QSFP transceivers. Here are the things that I've tried so far, without success:

  *   Added CONFIG_RTE_LIBRTE_IXGBE_ALLOW_UNSUPPORTED_SFP=y to config/defconfig_x86_64-native-linuxapp-gcc and rebuilt/reinstalled/rebooted
  *   Tried various incantations of modprobe/insmod with allow_unsupported_sfp=1 appended
  *   Added options ixgbe allow_unsupported_sfp=1 to /etc/modprobe.d/dpdk.conf and rebuilt the initrd

Can anybody point me in the right direction here? It seems like a lot of the information floating around about this issue may be out of date.

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-13 17:22 [dpdk-dev] Question about unsupported transceivers Alex Forster
@ 2015-10-13 18:57 ` Alex Forster
  2015-10-13 20:34   ` Alexander Duyck
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Forster @ 2015-10-13 18:57 UTC (permalink / raw)
  To: dev

I believe I've discovered my problem: https://gist.github.com/AlexForster/0fb4699bcdf196cf5462

As mentioned previously, I have two X520-Q1 cards installed. It appears that initialization of the first card obeys allow_unsupported_sfp=1, but initialization of the second card does not.

Is this a bug, or is there a way to work around this that I'm not aware of?

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-13 18:57 ` Alex Forster
@ 2015-10-13 20:34   ` Alexander Duyck
  2015-10-15 14:46     ` Alex Forster
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2015-10-13 20:34 UTC (permalink / raw)
  To: Alex Forster, dev

On 10/13/2015 11:57 AM, Alex Forster wrote:
> I believe I've discovered my problem: https://gist.github.com/AlexForster/0fb4699bcdf196cf5462
>
> As mentioned previously, I have two X520-Q1 cards installed. It appears that initialization of the first card obeys allow_unsupported_sfp=1, but initialization of the second card does not.
>
> Is this a bug, or is there a way to work around this that I'm not aware of?
>
> Alex Forster

If you are using Intel's out-of-tree ixgbe driver I believe the module 
parameters are comma separated with one index per port.  So if you have 
two ports you should be passing "allow_unsupported_sfp=1,1", and for 4 
you would need four '1's.
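
As a rough illustration only (this is not driver code, and the helper name
build_sfp_param is made up), the value can be generated for any port count
along these lines:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write "allow_unsupported_sfp=1,1,...,1" into buf,
 * with one '1' per port. Returns 0 on success, -1 if nports is not
 * positive or buf is too small.
 */
int build_sfp_param(char *buf, size_t len, int nports)
{
	static const char prefix[] = "allow_unsupported_sfp=";
	size_t need, pos;
	int i;

	if (nports <= 0)
		return -1;

	/* prefix + nports ones + (nports - 1) commas + terminating NUL */
	need = strlen(prefix) + (size_t)nports * 2;
	if (len < need)
		return -1;

	strcpy(buf, prefix);
	pos = strlen(prefix);
	for (i = 0; i < nports; i++) {
		if (i > 0)
			buf[pos++] = ',';
		buf[pos++] = '1';
	}
	buf[pos] = '\0';
	return 0;
}
```

For two ports this yields "allow_unsupported_sfp=1,1"; prefixed with
"options ixgbe " it is the kind of line that would go in a modprobe.d file.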

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-13 20:34   ` Alexander Duyck
@ 2015-10-15 14:46     ` Alex Forster
  2015-10-15 15:30       ` Alexander Duyck
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Forster @ 2015-10-15 14:46 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:

>If you are using Intel's out-of-tree ixgbe driver I believe the module
>parameters are comma separated with one index per port.  So if you have
>two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>you would need four '1's.

This seemed very promising. I compiled and installed the out of tree ixgbe
driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
eight "allow_unsupported_sfp enabled" messages but the last four ports
still error out with the unsupported SFP message when running the tests.

Before I start arbitrarily trying to patch out parts of the SFP
verification code in ixgbe, are there any other tips I should know?

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 14:46     ` Alex Forster
@ 2015-10-15 15:30       ` Alexander Duyck
  2015-10-15 15:33         ` Alex Forster
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2015-10-15 15:30 UTC (permalink / raw)
  To: Alex Forster; +Cc: dev

On 10/15/2015 07:46 AM, Alex Forster wrote:
> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>> parameters are comma separated with one index per port.  So if you have
>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>> you would need four '1's.
>
> This seemed very promising. I compiled and installed the out of tree ixgbe
> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
> eight "allow_unsupported_sfp enabled" messages but the last four ports
> still error out with the unsupported SFP message when running the tests.
>
> Before I start arbitrarily trying to patch out parts of the SFP
> verification code in ixgbe, are there any other tips I should know?

Can you send me the command you used to load the module, and the exact 
number of ixgbe ports you have in the system?  With that I could then 
verify that the command was entered correctly as it is possible there 
could still be an issue in the way the command was entered.

One other possibility is that when the driver loads each load counts as 
an instance in the module parameter array.  So if for example you unbind 
the driver on one port and then later rebind it you will have consumed 
one of the values in the array.  Do it enough times and you exceed the 
bounds of the array as you entered it and it will simply use the default 
value of 0.
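
A small model of that behavior may help. It is only a sketch of the
description above, not the driver's code; the struct and function names are
hypothetical, and the limit of 32 is the array size quoted elsewhere in this
thread:

```c
#define MAX_NIC 32  /* array limit quoted in this thread */

/* Hypothetical model of a per-port module parameter array. */
struct param_array {
	int values[MAX_NIC];
	int num_given;   /* number of values supplied by the user */
	int next_index;  /* advanced on every bind; never rewound on unbind */
};

/* Fill the first num_given slots with 1, as "=1,1,...,1" would. */
void param_array_init(struct param_array *p, int num_given)
{
	int i;

	if (num_given < 0)
		num_given = 0;
	if (num_given > MAX_NIC)
		num_given = MAX_NIC;
	for (i = 0; i < MAX_NIC; i++)
		p->values[i] = (i < num_given) ? 1 : 0;
	p->num_given = num_given;
	p->next_index = 0;
}

/* Each bind consumes the next slot; past the supplied values, the
 * default is used instead. */
int param_array_bind(struct param_array *p, int default_value)
{
	int idx = p->next_index++;

	if (idx < p->num_given)
		return p->values[idx];
	return default_value;
}
```

With eight values supplied, the first eight binds return 1; a ninth bind (for
example after an unbind and rebind) falls back to the default of 0.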

Also the output of "ethtool -i <ethX>" would be useful to verify that 
you have the out-of-tree driver loaded and not the in-kernel one.

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 15:30       ` Alexander Duyck
@ 2015-10-15 15:33         ` Alex Forster
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Forster @ 2015-10-15 15:33 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:


>On 10/15/2015 07:46 AM, Alex Forster wrote:
>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>wrote:
>>
>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>> parameters are comma separated with one index per port.  So if you have
>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>> you would need four '1's.
>>
>> This seemed very promising. I compiled and installed the out of tree
>>ixgbe
>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>> still error out with the unsupported SFP message when running the tests.
>>
>> Before I start arbitrarily trying to patch out parts of the SFP
>> verification code in ixgbe, are there any other tips I should know?
>
>Can you send me the command you used to load the module, and the exact
>number of ixgbe ports you have in the system?  With that I could then
>verify that the command was entered correctly as it is possible there
>could still be an issue in the way the command was entered.
>
>One other possibility is that when the driver loads each load counts as
>an instance in the module parameter array.  So if for example you unbind
>the driver on one port and then later rebind it you will have consumed
>one of the values in the array.  Do it enough times and you exceed the
>bounds of the array as you entered it and it will simply use the default
>value of 0.
>
>Also the output of "ethtool -i <ethX>" would be useful to verify that
>you have the out-of-tree driver loaded and not the in-kernel one.
>
>- Alex
>



Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-19  1:06       ` Alex Forster
@ 2015-10-19 15:08         ` Alexander Duyck
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Duyck @ 2015-10-19 15:08 UTC (permalink / raw)
  To: Alex Forster; +Cc: dev

On 10/18/2015 06:06 PM, Alex Forster wrote:
> On 10/15/15, 3:53 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>
>>>> It looks like you are probably seeing interfaces be unbound and then
>>>> rebound.  As such you are likely pushing things outside of the array
>>>> boundary.  One solution might just be to add more ",1"s if you are only
>>>> going to be doing this kind of thing at boot up.  The upper limit for
>>>> the array is 32 entries so as long as you only are setting this up once
>>>> you could probably get away with that.
>>>>
>>>> An alternative would be to modify the definition of the parameter in
>>>> ixgbe_param.c.  If you look through the file you should find several
>>>> lines like below:
>>>> 	struct ixgbe_option opt = {
>>>> 			.type = enable_option,
>>>> 			.name = "allow_unsupported_sfp",
>>>> 			.err  = "defaulting to Disabled",
>>>> 			.def  = OPTION_DISABLED
>>>> 		};
>>>>
>>>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>>>> reinstall your driver you should be able to have it install without any
>>>> issues.
>>>>
>>>> - Alex
>>>>
>>> Yeah, I've had roughly the same thought process since you mentioned the
>>> args array. My first idea was "maybe the driver can't fit all of my 1's"
>>> but I saw it was defined at 32. Then I decided to just patch the whole
>>> allow_unsupported_sfp option out
>>> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
>>> failing.
>>>
>>> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>>>
>>> /* reset_hw fills in the perm_addr as well */
>>> hw->phy.reset_if_overtemp = true;
>>> err = hw->mac.ops.reset_hw(hw);
>>> hw->phy.reset_if_overtemp = false;
>>> if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
>>> 	err = IXGBE_SUCCESS;
>>> } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
>>> 	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
>>> 		  "module type was detected.\n");
>>> 	e_dev_err("Reload the driver after installing a supported "
>>> 		  "module.\n");
>>> 	goto err_sw_init;
>>> } else if (err) {
>>> 	e_dev_err("HW Init failed: %d\n", err);
>>> 	goto err_sw_init;
>>> }
>>>
>>>
>>> I've attempted a hand-stacktrace and came up with the following...
>>>
>>> ixgbe_82599.c@1016
>>>    * ixgbe_reset_hw_82599() is defined
>>>    * calls phy->ops.init() which potentially returns
>>> IXGBE_ERR_SFP_NOT_SUPPORTED
>>>
>>> ixgbe_82599.c@102
>>>    * ixgbe_init_phy_ops_82599() is defined
>>>    * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
>>> phy->ops.identify()
>>>
>>> ixgbe_82599.c@2085
>>>    * ixgbe_identify_phy_82599() is defined
>>>    * calls ixgbe_identify_module_generic()
>>>
>>> ixgbe_phy.c@1281
>>>    * ixgbe_identify_module_generic() is defined
>>>    * calls ixgbe_identify_qsfp_module_generic()
>>>
>>> ixgbe_phy.c@1663
>>>    * ixgbe_identify_qsfp_module_generic() is defined
>>>    * We fail somewhere before the ending call to ixgbe_get_device_caps()
>>> which does take allow_unsupported_sfp into account
>>>
>>>    * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
>>> &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>>>    * Possibility: active_cable != true
>>>
>>> And then I'm over my head. Should I assume from here that the most
>>> likely
>>> explanation is a bad transceiver or bad fiber?
>>>
>>> Alex Forster
>>
>> Are you able to swap transceiver or fiber between the 4 ports that work
>> and the 4 that don't?  If you could do that then you should be able to
>> tell if the issue is following the NIC ports, or if it is an issue with
>> the external connections.  If the issue is following the transceiver
>> or fiber then it is probably what is causing the issue.
>>
>> The other thing you could try doing is adding a printk to the spots
>> where the status is set to SFP_NOT_SUPPORTED so that you could figure
>> out exactly which spot is triggering the rejection of the module.
>>
>> - Alex
>
> I had remote hands swap fibers on the QSFP side and the issue moved to the
> first card, so I'm going to have the fibers cleaned and tested. This
> appears to be my issue.
>
> I'd like to submit a patch for ixgbe_identify_qsfp_module_generic() to
> print more helpful errors in the two cases mentioned above, so that
> hopefully nobody ever has to deal with this again. Would I be wasting my
> time, or does something like this have a likelihood of being accepted?
>
> Thank you for all of your help! I wouldn't have figured this out nearly as
> quickly without it.
>
> Alex Forster

I suspect there would be some value to such a patch, just make sure to 
explain the reason for needing it in the patch description.  My advice
would be to base such a patch on what is in Jeff Kirsher's next queue
(https://git.kernel.org/cgit/linux/kernel/git/jkirsher/next-queue.git/).
The mailing lists for submitting patches are
intel-wired-lan@lists.osuosl.org and netdev@vger.kernel.org.

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 19:53     ` Alexander Duyck
@ 2015-10-19  1:06       ` Alex Forster
  2015-10-19 15:08         ` Alexander Duyck
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Forster @ 2015-10-19  1:06 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/15/15, 3:53 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:


>>> It looks like you are probably seeing interfaces be unbound and then
>>> rebound.  As such you are likely pushing things outside of the array
>>> boundary.  One solution might just be to add more ",1"s if you are only
>>> going to be doing this kind of thing at boot up.  The upper limit for
>>> the array is 32 entries so as long as you only are setting this up once
>>> you could probably get away with that.
>>>
>>> An alternative would be to modify the definition of the parameter in
>>> ixgbe_param.c.  If you look through the file you should find several
>>> lines like below:
>>> 	struct ixgbe_option opt = {
>>> 			.type = enable_option,
>>> 			.name = "allow_unsupported_sfp",
>>> 			.err  = "defaulting to Disabled",
>>> 			.def  = OPTION_DISABLED
>>> 		};
>>>
>>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>>> reinstall your driver you should be able to have it install without any
>>> issues.
>>>
>>> - Alex
>>>
>> Yeah, I've had roughly the same thought process since you mentioned the
>> args array. My first idea was "maybe the driver can't fit all of my 1's"
>> but I saw it was defined at 32. Then I decided to just patch the whole
>> allow_unsupported_sfp option out
>> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
>> failing.
>>
>> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>>
>> /* reset_hw fills in the perm_addr as well */
>> hw->phy.reset_if_overtemp = true;
>> err = hw->mac.ops.reset_hw(hw);
>> hw->phy.reset_if_overtemp = false;
>> if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
>> 	err = IXGBE_SUCCESS;
>> } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
>> 	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
>> 		  "module type was detected.\n");
>> 	e_dev_err("Reload the driver after installing a supported "
>> 		  "module.\n");
>> 	goto err_sw_init;
>> } else if (err) {
>> 	e_dev_err("HW Init failed: %d\n", err);
>> 	goto err_sw_init;
>> }
>>
>>
>> I've attempted a hand-stacktrace and came up with the following...
>>
>> ixgbe_82599.c@1016
>>   * ixgbe_reset_hw_82599() is defined
>>   * calls phy->ops.init() which potentially returns
>> IXGBE_ERR_SFP_NOT_SUPPORTED
>>
>> ixgbe_82599.c@102
>>   * ixgbe_init_phy_ops_82599() is defined
>>   * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
>> phy->ops.identify()
>>
>> ixgbe_82599.c@2085
>>   * ixgbe_identify_phy_82599() is defined
>>   * calls ixgbe_identify_module_generic()
>>
>> ixgbe_phy.c@1281
>>   * ixgbe_identify_module_generic() is defined
>>   * calls ixgbe_identify_qsfp_module_generic()
>>
>> ixgbe_phy.c@1663
>>   * ixgbe_identify_qsfp_module_generic() is defined
>>   * We fail somewhere before the ending call to ixgbe_get_device_caps()
>> which does take allow_unsupported_sfp into account
>>
>>   * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
>> &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>>   * Possibility: active_cable != true
>>
>> And then I'm over my head. Should I assume from here that the most
>>likely
>> explanation is a bad transceiver or bad fiber?
>>
>> Alex Forster
>
>Are you able to swap transceiver or fiber between the 4 ports that work
>and the 4 that don't?  If you could do that then you should be able to
>tell if the issue is following the NIC ports, or if it is an issue with
>the external connections.  If the issue is following the transceiver
>or fiber then it is probably what is causing the issue.
>
>The other thing you could try doing is adding a printk to the spots
>where the status is set to SFP_NOT_SUPPORTED so that you could figure
>out exactly which spot is triggering the rejection of the module.
>
>- Alex

I had remote hands swap fibers on the QSFP side and the issue moved to the
first card, so I'm going to have the fibers cleaned and tested. This
appears to be my issue.

I'd like to submit a patch for ixgbe_identify_qsfp_module_generic() to
print more helpful errors in the two cases mentioned above, so that
hopefully nobody ever has to deal with this again. Would I be wasting my
time, or does something like this have a likelihood of being accepted?

Thank you for all of your help! I wouldn't have figured this out nearly as
quickly without it.

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 17:13   ` Alex Forster
  2015-10-15 18:00     ` Alexander Duyck
@ 2015-10-15 19:53     ` Alexander Duyck
  2015-10-19  1:06       ` Alex Forster
  1 sibling, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2015-10-15 19:53 UTC (permalink / raw)
  To: Alex Forster; +Cc: dev

On 10/15/2015 10:13 AM, Alex Forster wrote:
> On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>
>> On 10/15/2015 08:43 AM, Alex Forster wrote:
>>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>> wrote:
>>>
>>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the
>>>>>> module
>>>>>> parameters are comma separated with one index per port.  So if you
>>>>>> have
>>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for
>>>>>> 4
>>>>>> you would need four '1's.
>>>>> This seemed very promising. I compiled and installed the out of tree
>>>>> ixgbe
>>>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows
>>>>> all
>>>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>>>> still error out with the unsupported SFP message when running the
>>>>> tests.
>>>>>
>>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>>> verification code in ixgbe, are there any other tips I should know?
>>>> Can you send me the command you used to load the module, and the exact
>>>> number of ixgbe ports you have in the system?  With that I could then
>>>> verify that the command was entered correctly as it is possible there
>>>> could still be an issue in the way the command was entered.
>>>>
>>>> One other possibility is that when the driver loads each load counts as
>>>> an instance in the module parameter array.  So if for example you
>>>> unbind
>>>> the driver on one port and then later rebind it you will have consumed
>>>> one of the values in the array.  Do it enough times and you exceed the
>>>> bounds of the array as you entered it and it will simply use the
>>>> default
>>>> value of 0.
>>>>
>>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>>> you have the out-of-tree driver loaded and not the in-kernel one.
>>>>
>>>> - Alex
>>>>
>>> Er, let me try that again.
>>>
>>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>>
>>>
>>> Alex Forster
>>>
>>>
>> It looks like you are probably seeing interfaces be unbound and then
>> rebound.  As such you are likely pushing things outside of the array
>> boundary.  One solution might just be to add more ",1"s if you are only
>> going to be doing this kind of thing at boot up.  The upper limit for
>> the array is 32 entries so as long as you only are setting this up once
>> you could probably get away with that.
>>
>> An alternative would be to modify the definition of the parameter in
>> ixgbe_param.c.  If you look through the file you should find several
>> lines like below:
>> 	struct ixgbe_option opt = {
>> 			.type = enable_option,
>> 			.name = "allow_unsupported_sfp",
>> 			.err  = "defaulting to Disabled",
>> 			.def  = OPTION_DISABLED
>> 		};
>>
>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>> reinstall your driver you should be able to have it install without any
>> issues.
>>
>> - Alex
>>
> Yeah, I've had roughly the same thought process since you mentioned the
> args array. My first idea was "maybe the driver can't fit all of my 1's"
> but I saw it was defined at 32. Then I decided to just patch the whole
> allow_unsupported_sfp option out
> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
> failing.
>
> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>
> /* reset_hw fills in the perm_addr as well */
> hw->phy.reset_if_overtemp = true;
> err = hw->mac.ops.reset_hw(hw);
> hw->phy.reset_if_overtemp = false;
> if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
> 	err = IXGBE_SUCCESS;
> } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
> 	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
> 		  "module type was detected.\n");
> 	e_dev_err("Reload the driver after installing a supported "
> 		  "module.\n");
> 	goto err_sw_init;
> } else if (err) {
> 	e_dev_err("HW Init failed: %d\n", err);
> 	goto err_sw_init;
> }
>
>
> I've attempted a hand-stacktrace and came up with the following...
>
> ixgbe_82599.c@1016
>   * ixgbe_reset_hw_82599() is defined
>   * calls phy->ops.init() which potentially returns
> IXGBE_ERR_SFP_NOT_SUPPORTED
>
> ixgbe_82599.c@102
>   * ixgbe_init_phy_ops_82599() is defined
>   * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
> phy->ops.identify()
>
> ixgbe_82599.c@2085
>   * ixgbe_identify_phy_82599() is defined
>   * calls ixgbe_identify_module_generic()
>
> ixgbe_phy.c@1281
>   * ixgbe_identify_module_generic() is defined
>   * calls ixgbe_identify_qsfp_module_generic()
>
> ixgbe_phy.c@1663
>   * ixgbe_identify_qsfp_module_generic() is defined
>   * We fail somewhere before the ending call to ixgbe_get_device_caps()
> which does take allow_unsupported_sfp into account
>
>   * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
> &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>   * Possibility: active_cable != true
>
> And then I'm over my head. Should I assume from here that the most likely
> explanation is a bad transceiver or bad fiber?
>
> Alex Forster

Are you able to swap transceiver or fiber between the 4 ports that work 
and the 4 that don't?  If you could do that then you should be able to 
tell if the issue is following the NIC ports, or if it is an issue with 
the external connections.  If the issue is following the transceiver
or fiber then it is probably what is causing the issue.

The other thing you could try doing is adding a printk to the spots 
where the status is set to SFP_NOT_SUPPORTED so that you could figure 
out exactly which spot is triggering the rejection of the module.
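
To sketch what that instrumentation could look like: the function below is a
heavily simplified, hypothetical stand-in for the QSFP identify path, written
for userspace with fprintf() where the driver would use printk(). The function
name, its parameters, and the REJECT macro are made up for illustration; the
error value -19 is assumed from the testpmd log earlier in the thread.

```c
#include <stdbool.h>
#include <stdio.h>

#define SFF_ID_QSFP_PLUS       0x0D   /* SFF-8024 identifier value for QSFP+ */
#define ERR_SFP_NOT_SUPPORTED  (-19)  /* assumed; matches the -19 in the log */

/*
 * Log a distinct reason plus file/line at every rejection point, then
 * evaluate to the error code. In a kernel driver the fprintf() would be
 * a printk(KERN_ERR ...) call instead.
 */
#define REJECT(reason) \
	(fprintf(stderr, "%s:%d: module rejected: %s\n", \
		 __FILE__, __LINE__, (reason)), ERR_SFP_NOT_SUPPORTED)

/* Hypothetical, simplified identify routine with three rejection points. */
int identify_qsfp_sketch(unsigned char identifier, bool active_cable,
			 bool vendor_supported, bool allow_unsupported_sfp)
{
	if (identifier != SFF_ID_QSFP_PLUS)
		return REJECT("identifier byte is not QSFP+");

	if (!active_cable)
		return REJECT("module is not an active cable");

	/* Only this last check consults allow_unsupported_sfp. */
	if (!vendor_supported && !allow_unsupported_sfp)
		return REJECT("vendor is not on the supported list");

	return 0;
}
```

The point of the sketch is that the first two rejections return the same error
code as the third without ever looking at allow_unsupported_sfp, so only the
log line tells them apart.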

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 18:00     ` Alexander Duyck
@ 2015-10-15 18:29       ` Alex Forster
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Forster @ 2015-10-15 18:29 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/15/15, 2:00 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:

>
>Your changes are a bit overkill and actually take things in the wrong
>direction.  By commenting out the whole allow_unsupported_sfp block you
>are disabling it by default.  Remember, the module parameter is what
>allows it; with it removed there is no way to enable the feature.
>
>Like I mentioned in my previous email just take a look at replacing the
>"OPTION_DISABLED" value with "OPTION_ENABLED" in the .def part of the
>structure.  After that you won't need to pass the module parameter as it
>will always be enabled by default.
>
>- Alex

It's hard to see in the patch, but I basically replaced that whole option
check block with:

{
	/*
	 * allow_unsupported_sfp - Enable/Disable support for unsupported
	 * and untested SFP+ modules.
	 */
	adapter->hw.allow_unsupported_sfp = true;
}

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 17:13   ` Alex Forster
@ 2015-10-15 18:00     ` Alexander Duyck
  2015-10-15 18:29       ` Alex Forster
  2015-10-15 19:53     ` Alexander Duyck
  1 sibling, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2015-10-15 18:00 UTC (permalink / raw)
  To: Alex Forster; +Cc: dev

On 10/15/2015 10:13 AM, Alex Forster wrote:
> On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>
>> On 10/15/2015 08:43 AM, Alex Forster wrote:
>>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>> wrote:
>>>
>>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the
>>>>>> module
>>>>>> parameters are comma separated with one index per port.  So if you
>>>>>> have
>>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for
>>>>>> 4
>>>>>> you would need four '1's.
>>>>> This seemed very promising. I compiled and installed the out of tree
>>>>> ixgbe
>>>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows
>>>>> all
>>>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>>>> still error out with the unsupported SFP message when running the
>>>>> tests.
>>>>>
>>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>>> verification code in ixgbe, are there any other tips I should know?
>>>> Can you send me the command you used to load the module, and the exact
>>>> number of ixgbe ports you have in the system?  With that I could then
>>>> verify that the command was entered correctly as it is possible there
>>>> could still be an issue in the way the command was entered.
>>>>
>>>> One other possibility is that when the driver loads each load counts as
>>>> an instance in the module parameter array.  So if for example you
>>>> unbind
>>>> the driver on one port and then later rebind it you will have consumed
>>>> one of the values in the array.  Do it enough times and you exceed the
>>>> bounds of the array as you entered it and it will simply use the
>>>> default
>>>> value of 0.
>>>>
>>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>>> you have the out-of-tree driver loaded and not the in-kernel one.
>>>>
>>>> - Alex
>>>>
>>> Er, let me try that again.
>>>
>>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>>
>>>
>>> Alex Forster
>>>
>>>
>> It looks like you are probably seeing interfaces be unbound and then
>> rebound.  As such you are likely pushing things outside of the array
>> boundary.  One solution might just be to add more ",1"s if you are only
>> going to be doing this kind of thing at boot up.  The upper limit for
>> the array is 32 entries so as long as you only are setting this up once
>> you could probably get away with that.
>>
>> An alternative would be to modify the definition of the parameter in
>> ixgbe_param.c.  If you look through the file you should find several
>> lines like below:
>> 	struct ixgbe_option opt = {
>> 			.type = enable_option,
>> 			.name = "allow_unsupported_sfp",
>> 			.err  = "defaulting to Disabled",
>> 			.def  = OPTION_DISABLED
>> 		};
>>
>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>> reinstall your driver you should be able to have it install without any
>> issues.
>>
>> - Alex
>>
> Yeah, I've had roughly the same thought process since you mentioned the
> args array. My first idea was "maybe the driver can't fit all of my 1's"
> but I saw it was defined at 32. Then I decided to just patch the whole
> allow_unsupported_sfp option out
> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
> failing.

Your changes are a bit overkill and actually take things in the wrong
direction.  By commenting out the whole allow_unsupported_sfp block you
are disabling it by default.  Remember, the module parameter is what
allows it; with it removed there is no way to enable the feature.

Like I mentioned in my previous email just take a look at replacing the 
"OPTION_DISABLED" value with "OPTION_ENABLED" in the .def part of the 
structure.  After that you won't need to pass the module parameter as it 
will always be enabled by default.
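
A self-contained sketch of the idea follows. The field names come from the
snippet quoted above, but the struct definition, the constants' values, and
the resolve_option() helper are stand-ins invented so the example compiles on
its own; they are not the driver's definitions.

```c
/* Stand-in definitions; the real ones live in the driver sources. */
enum option_type { enable_option, range_option, list_option };

#define OPTION_DISABLED 0
#define OPTION_ENABLED  1

struct ixgbe_option_sketch {
	enum option_type type;
	const char *name;
	const char *err;
	int def;  /* value used when no valid parameter was supplied */
};

/* The only functional change from the quoted snippet is the .def value. */
static const struct ixgbe_option_sketch allow_unsupported_sfp_opt = {
	.type = enable_option,
	.name = "allow_unsupported_sfp",
	.err  = "defaulting to Enabled",
	.def  = OPTION_ENABLED,
};

/* Hypothetical resolver: honor a valid user value, otherwise use .def. */
int resolve_option(const struct ixgbe_option_sketch *opt,
		   int have_user_value, int user_value)
{
	if (have_user_value &&
	    (user_value == OPTION_ENABLED || user_value == OPTION_DISABLED))
		return user_value;
	return opt->def;
}
```

With the default flipped, a port that gets no value from the parameter array
still resolves to enabled, while an explicit 0 can still turn it off.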

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 16:17 ` Alexander Duyck
@ 2015-10-15 17:13   ` Alex Forster
  2015-10-15 18:00     ` Alexander Duyck
  2015-10-15 19:53     ` Alexander Duyck
  0 siblings, 2 replies; 14+ messages in thread
From: Alex Forster @ 2015-10-15 17:13 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:


>On 10/15/2015 08:43 AM, Alex Forster wrote:
>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>wrote:
>>
>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>>> wrote:
>>>>
>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the
>>>>>module
>>>>> parameters are comma separated with one index per port.  So if you
>>>>>have
>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for
>>>>>4
>>>>> you would need four '1's.
>>>>
>>>> This seemed very promising. I compiled and installed the out of tree
>>>> ixgbe
>>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows
>>>>all
>>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>>> still error out with the unsupported SFP message when running the
>>>>tests.
>>>>
>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>> verification code in ixgbe, are there any other tips I should know?
>>>
>>> Can you send me the command you used to load the module, and the exact
>>> number of ixgbe ports you have in the system?  With that I could then
>>> verify that the command was entered correctly as it is possible there
>>> could still be an issue in the way the command was entered.
>>>
>>> One other possibility is that when the driver loads each load counts as
>>> an instance in the module parameter array.  So if for example you
>>>unbind
>>> the driver on one port and then later rebind it you will have consumed
>>> one of the values in the array.  Do it enough times and you exceed the
>>> bounds of the array as you entered it and it will simply use the
>>>default
>>> value of 0.
>>>
>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>> you have the out-of-tree driver loaded and not the in-kernel one.
>>>
>>> - Alex
>>>
>>
>> Er, let me try that again.
>>
>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>
>>
>> Alex Forster
>>
>>
>
>It looks like you are probably seeing interfaces be unbound and then
>rebound.  As such you are likely pushing things outside of the array
>boundary.  One solution might just be to add more ",1"s if you are only
>going to be doing this kind of thing at boot up.  The upper limit for
>the array is 32 entries so as long as you only are setting this up once
>you could probably get away with that.
>
>An alternative would be to modify the definition of the parameter in
>ixgbe_param.c.  If you look through the file you should find several
>lines like below:
>	struct ixgbe_option opt = {
>			.type = enable_option,
>			.name = "allow_unsupported_sfp",
>			.err  = "defaulting to Disabled",
>			.def  = OPTION_DISABLED
>		};
>
>If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>reinstall your driver you should be able to have it install without any
>issues.
>
>- Alex
>

Yeah, I've had roughly the same thought process since you mentioned the
args array. My first idea was "maybe the driver can't fit all of my 1's"
but I saw it was defined at 32. Then I decided to just patch the whole
allow_unsupported_sfp option out
(https://gist.github.com/AlexForster/112fd822704caf804849) but I'm still
failing.

I've been digging a bit, and I'm failing here in ixgbe_main.c...

/* reset_hw fills in the perm_addr as well */
hw->phy.reset_if_overtemp = true;
err = hw->mac.ops.reset_hw(hw);
hw->phy.reset_if_overtemp = false;
if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
	err = IXGBE_SUCCESS;
} else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
		  "module type was detected.\n");
	e_dev_err("Reload the driver after installing a supported "
		  "module.\n");
	goto err_sw_init;
} else if (err) {
	e_dev_err("HW Init failed: %d\n", err);
	goto err_sw_init;
}


I've attempted a hand-stacktrace and came up with the following...

ixgbe_82599.c@1016
 * ixgbe_reset_hw_82599() is defined
 * calls phy->ops.init() which potentially returns
   IXGBE_ERR_SFP_NOT_SUPPORTED

ixgbe_82599.c@102
 * ixgbe_init_phy_ops_82599() is defined
 * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
   phy->ops.identify()

ixgbe_82599.c@2085
 * ixgbe_identify_phy_82599() is defined
 * calls ixgbe_identify_module_generic()

ixgbe_phy.c@1281
 * ixgbe_identify_module_generic() is defined
 * calls ixgbe_identify_qsfp_module_generic()

ixgbe_phy.c@1663
 * ixgbe_identify_qsfp_module_generic() is defined
 * We fail somewhere before the ending call to ixgbe_get_device_caps(),
   which does take allow_unsupported_sfp into account

 * Possibility: the identifier read via
   hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER, &identifier)
   is not IXGBE_SFF_IDENTIFIER_QSFP_PLUS
 * Possibility: active_cable != true
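
The call chain traced above can be modeled with a small ops-table sketch.
This is only an illustration of the hypothesis in this message, not the
driver's code: every sketch_* name and the two boolean "check" fields are
made up for the example, and the value -19 simply mirrors the error code
printed in the testpmd log at the top of the thread.

```c
#include <stdbool.h>

#define SKETCH_SUCCESS               0
#define SKETCH_ERR_SFP_NOT_SUPPORTED (-19) /* assumed to mirror the logged -19 */

struct sketch_hw;

/* Hypothetical ops table: init() calls identify(), as in the trace above. */
struct sketch_phy_ops {
	int (*identify)(struct sketch_hw *hw);
	int (*init)(struct sketch_hw *hw);
};

struct sketch_hw {
	struct sketch_phy_ops phy_ops;
	bool allow_unsupported_sfp; /* the module option */
	bool module_recognized;     /* hypothetical: identifier/cable checks pass */
	bool vendor_supported;      /* hypothetical: final vendor check passes */
};

/* Early checks can fail before the option is ever consulted. */
int sketch_identify(struct sketch_hw *hw)
{
	if (!hw->module_recognized)
		return SKETCH_ERR_SFP_NOT_SUPPORTED;
	/* Only this last check takes allow_unsupported_sfp into account. */
	if (!hw->vendor_supported && !hw->allow_unsupported_sfp)
		return SKETCH_ERR_SFP_NOT_SUPPORTED;
	return SKETCH_SUCCESS;
}

/* init() just forwards whatever identify() reports. */
int sketch_init(struct sketch_hw *hw)
{
	return hw->phy_ops.identify(hw);
}

/* reset_hw() propagates the error up to the probe path. */
int sketch_reset_hw(struct sketch_hw *hw)
{
	return hw->phy_ops.init(hw);
}

/* Wire up the ops table and set the three flags. */
void sketch_hw_setup(struct sketch_hw *hw, bool allow, bool recognized,
		     bool vendor_ok)
{
	hw->phy_ops.identify = sketch_identify;
	hw->phy_ops.init = sketch_init;
	hw->allow_unsupported_sfp = allow;
	hw->module_recognized = recognized;
	hw->vendor_supported = vendor_ok;
}
```

With allow_unsupported_sfp set but module_recognized false,
sketch_reset_hw() still returns the not-supported error, which is the
behaviour described above: the option cannot help if an earlier check
rejects the module first.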

And then I'm over my head. Should I assume from here that the most likely
explanation is a bad transceiver or bad fiber?

Alex Forster


* Re: [dpdk-dev] Question about unsupported transceivers
  2015-10-15 15:43 Alex Forster
@ 2015-10-15 16:17 ` Alexander Duyck
  2015-10-15 17:13   ` Alex Forster
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2015-10-15 16:17 UTC (permalink / raw)
  To: Alex Forster; +Cc: dev

On 10/15/2015 08:43 AM, Alex Forster wrote:
> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>> wrote:
>>>
>>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>>> parameters are comma separated with one index per port.  So if you have
>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>>> you would need four '1's.
>>>
>>> This seemed very promising. I compiled and installed the out of tree
>>> ixgbe
>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>> still error out with the unsupported SFP message when running the tests.
>>>
>>> Before I start arbitrarily trying to patch out parts of the SFP
>>> verification code in ixgbe, are there any other tips I should know?
>>
>> Can you send me the command you used to load the module, and the exact
>> number of ixgbe ports you have in the system?  With that I could then
>> verify that the command was entered correctly as it is possible there
>> could still be an issue in the way the command was entered.
>>
>> One other possibility is that when the driver loads each load counts as
>> an instance in the module parameter array.  So if for example you unbind
>> the driver on one port and then later rebind it you will have consumed
>> one of the values in the array.  Do it enough times and you exceed the
>> bounds of the array as you entered it and it will simply use the default
>> value of 0.
>>
>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>> you have the out-of-tree driver loaded and not the in-kernel one.
>>
>> - Alex
>>
>
> Er, let me try that again.
>
> https://gist.github.com/AlexForster/f5372c5b60153d278089
>
>
> Alex Forster
>
>

It looks like you are probably seeing interfaces be unbound and then 
rebound.  As such you are likely pushing things outside of the array 
boundary.  One solution might just be to add more ",1"s if you are only
going to be doing this kind of thing at boot up.  The upper limit for 
the array is 32 entries so as long as you only are setting this up once 
you could probably get away with that.
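
The effect described above can be sketched in a few lines of C. This is a
simplified model, not the driver's code: struct param_array and both
functions are hypothetical names, and the only details taken from this
thread are the 32-entry limit and the fallback to the default value once
the supplied entries run out.

```c
#include <stddef.h>

#define MAX_NIC      32   /* upper limit for the array, as noted above */
#define OPTION_UNSET (-1) /* hypothetical marker for "no value supplied" */

/* Hypothetical stand-in for a per-instance module parameter array. */
struct param_array {
	int values[MAX_NIC]; /* parsed from "allow_unsupported_sfp=1,1,..." */
	size_t supplied;     /* how many values the user actually passed */
	size_t next;         /* slot handed to the next probe (bind) */
};

/* Copy the user's values in; unused slots stay unset. */
void param_array_init(struct param_array *p, const int *vals, size_t n)
{
	size_t i;

	if (n > MAX_NIC)
		n = MAX_NIC;
	for (i = 0; i < MAX_NIC; i++)
		p->values[i] = (i < n) ? vals[i] : OPTION_UNSET;
	p->supplied = n;
	p->next = 0;
}

/*
 * Each probe consumes one slot and never gives it back, so an
 * unbind/rebind cycle uses up a fresh entry. Past the supplied
 * entries the default applies.
 */
int probe_allow_unsupported_sfp(struct param_array *p, int default_value)
{
	size_t idx = p->next++;

	if (idx >= p->supplied)
		return default_value;
	return p->values[idx];
}
```

In this model, eight supplied 1s cover the first eight probes; a ninth
probe (for example a rebind of one port) gets the default of 0, which is
why padding the list with extra ",1" entries works as long as the total
number of binds stays within what was supplied.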

An alternative would be to modify the definition of the parameter in 
ixgbe_param.c.  If you look through the file you should find several
lines like below:
	struct ixgbe_option opt = {
			.type = enable_option,
			.name = "allow_unsupported_sfp",
			.err  = "defaulting to Disabled",
			.def  = OPTION_DISABLED
		};

If you modify the .def value to "OPTION_ENABLED", and then rebuild and 
reinstall your driver you should be able to have it install without any issues.
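
To show why changing .def is enough, here is a minimal sketch of how a
default could be applied when no per-port value is available. The field
names follow the snippet above, but struct option_sketch, resolve_option()
and the numeric OPTION_* values are assumptions for illustration, not the
contents of ixgbe_param.c.

```c
/* Assumed values; the real definitions live in the driver source. */
#define OPTION_UNSET    (-1)
#define OPTION_DISABLED 0
#define OPTION_ENABLED  1

enum option_type_sketch { enable_option };

/* Hypothetical, trimmed-down version of the option descriptor. */
struct option_sketch {
	enum option_type_sketch type;
	const char *name;
	const char *err;
	int def; /* value used when the user supplied nothing usable */
};

/* The patched descriptor: only .def (and its message) differ. */
const struct option_sketch allow_unsupported_sfp_patched = {
	.type = enable_option,
	.name = "allow_unsupported_sfp",
	.err  = "defaulting to Enabled",
	.def  = OPTION_ENABLED
};

/* Return the user's value if it is a valid on/off, else the default. */
int resolve_option(const struct option_sketch *opt, int user_value)
{
	if (user_value == OPTION_ENABLED || user_value == OPTION_DISABLED)
		return user_value;
	return opt->def; /* unset or invalid: fall back to .def */
}
```

With the default flipped, a port that receives no value (because the
parameter was omitted or the array entries were used up) resolves to
enabled, while an explicit 0 from the user still disables the option.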

- Alex


* Re: [dpdk-dev] Question about unsupported transceivers
@ 2015-10-15 15:43 Alex Forster
  2015-10-15 16:17 ` Alexander Duyck
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Forster @ 2015-10-15 15:43 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: dev

On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:

>On 10/15/2015 07:46 AM, Alex Forster wrote:
>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>wrote:
>>
>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>> parameters are comma separated with one index per port.  So if you have
>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>> you would need four '1's.
>>
>> This seemed very promising. I compiled and installed the out of tree
>>ixgbe
>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>> still error out with the unsupported SFP message when running the tests.
>>
>> Before I start arbitrarily trying to patch out parts of the SFP
>> verification code in ixgbe, are there any other tips I should know?
>
>Can you send me the command you used to load the module, and the exact
>number of ixgbe ports you have in the system?  With that I could then
>verify that the command was entered correctly as it is possible there
>could still be an issue in the way the command was entered.
>
>One other possibility is that when the driver loads each load counts as
>an instance in the module parameter array.  So if for example you unbind
>the driver on one port and then later rebind it you will have consumed
>one of the values in the array.  Do it enough times and you exceed the
>bounds of the array as you entered it and it will simply use the default
>value of 0.
>
>Also the output of "ethtool -i <ethX>" would be useful to verify that
>you have the out-of-tree driver loaded and not the in-kernel one.
>
>- Alex
>

Er, let me try that again.

https://gist.github.com/AlexForster/f5372c5b60153d278089


Alex Forster


Thread overview: 14+ messages
2015-10-13 17:22 [dpdk-dev] Question about unsupported transceivers Alex Forster
2015-10-13 18:57 ` Alex Forster
2015-10-13 20:34   ` Alexander Duyck
2015-10-15 14:46     ` Alex Forster
2015-10-15 15:30       ` Alexander Duyck
2015-10-15 15:33         ` Alex Forster
2015-10-15 15:43 Alex Forster
2015-10-15 16:17 ` Alexander Duyck
2015-10-15 17:13   ` Alex Forster
2015-10-15 18:00     ` Alexander Duyck
2015-10-15 18:29       ` Alex Forster
2015-10-15 19:53     ` Alexander Duyck
2015-10-19  1:06       ` Alex Forster
2015-10-19 15:08         ` Alexander Duyck
