DPDK patches and discussions
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Alex Forster <alex@alexforster.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Question about unsupported transceivers
Date: Thu, 15 Oct 2015 12:53:02 -0700
Message-ID: <5620041E.1060309@gmail.com>
In-Reply-To: <D2454A0C.2931%alex@alexforster.com>

On 10/15/2015 10:13 AM, Alex Forster wrote:
> On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck@gmail.com> wrote:
>
>
>> On 10/15/2015 08:43 AM, Alex Forster wrote:
>>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>> wrote:
>>>
>>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>>>>> parameters are comma separated with one index per port.  So if you have
>>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>>>>> you would need four '1's.
>>>>> This seemed very promising. I compiled and installed the out of tree ixgbe
>>>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
>>>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>>>> still error out with the unsupported SFP message when running the tests.
>>>>>
>>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>>> verification code in ixgbe, are there any other tips I should know?
>>>> Can you send me the command you used to load the module, and the exact
>>>> number of ixgbe ports you have in the system?  With that I could then
>>>> verify that the command was entered correctly as it is possible there
>>>> could still be an issue in the way the command was entered.
>>>>
>>>> One other possibility is that when the driver loads each load counts as
>>>> an instance in the module parameter array.  So if for example you unbind
>>>> the driver on one port and then later rebind it you will have consumed
>>>> one of the values in the array.  Do it enough times and you exceed the
>>>> bounds of the array as you entered it and it will simply use the default
>>>> value of 0.
>>>>
>>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>>> you have the out-of-tree driver loaded and not the in kernel.
>>>>
>>>> - Alex
>>>>
>>> Er, let me try that again.
>>>
>>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>>
>>>
>>> Alex Forster
>>>
>>>
>> It looks like you are probably seeing interfaces be unbound and then
>> rebound.  As such you are likely pushing things outside of the array
>> boundary.  One solution might just be to add more ",1"s if you are only
>> going to be doing this kind of thing at boot up.  The upper limit for
>> the array is 32 entries so as long as you only are setting this up once
>> you could probably get away with that.
>>
>> An alternative would be to modify the definition of the parameter in
>> ixgbe_param.c.  If you look through the file you should find several
>> lines like the ones below:
>> 	struct ixgbe_option opt = {
>> 		.type = enable_option,
>> 		.name = "allow_unsupported_sfp",
>> 		.err  = "defaulting to Disabled",
>> 		.def  = OPTION_DISABLED
>> 	};
>>
>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>> reinstall your driver you should be able to have it load without any
>> issues.
>>
>> - Alex
>>
> Yeah, I've had roughly the same thought process since you mentioned the
> args array. My first idea was "maybe the driver can't fit all of my 1's"
> but I saw it was defined at 32. Then I decided to just patch the whole
> allow_unsupported_sfp option out
> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
> failing.
>
> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>
> /* reset_hw fills in the perm_addr as well */
> hw->phy.reset_if_overtemp = true;
> err = hw->mac.ops.reset_hw(hw);
> hw->phy.reset_if_overtemp = false;
> if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
> 	err = IXGBE_SUCCESS;
> } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
> 	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
> 		  "module type was detected.\n");
> 	e_dev_err("Reload the driver after installing a supported "
> 		  "module.\n");
> 	goto err_sw_init;
> } else if (err) {
> 	e_dev_err("HW Init failed: %d\n", err);
> 	goto err_sw_init;
> }
>
>
> I've attempted a hand-stacktrace and came up with the following...
>
> ixgbe_82599.c@1016
>   * ixgbe_reset_hw_82599() is defined
>   * calls phy->ops.init() which potentially returns
> IXGBE_ERR_SFP_NOT_SUPPORTED
>
> ixgbe_82599.c@102
>   * ixgbe_init_phy_ops_82599() is defined
>   * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
> phy->ops.identify()
>
> ixgbe_82599.c@2085
>   * ixgbe_identify_phy_82599() is defined
>   * calls ixgbe_identify_module_generic()
>
> ixgbe_phy.c@1281
>   * ixgbe_identify_module_generic() is defined
>   * calls ixgbe_identify_qsfp_module_generic()
>
> ixgbe_phy.c@1663
>   * ixgbe_identify_qsfp_module_generic() is defined
>   * We fail somewhere before the ending call to ixgbe_get_device_caps()
> which does take allow_unsupported_sfp into account
>
>   * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
> &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>   * Possibility: active_cable != true
>
> And then I'm over my head. Should I assume from here that the most likely
> explanation is a bad transceiver or bad fiber?
>
> Alex Forster

Are you able to swap the transceivers or fiber between the 4 ports that
work and the 4 that don't?  If you can do that, you should be able to
tell whether the issue stays with the NIC ports or moves with the
external connections.  If the issue follows the transceiver or fiber,
then that component is probably the cause.

The other thing you could try is adding a printk at each spot where the
status is set to IXGBE_ERR_SFP_NOT_SUPPORTED, so that you can figure out
exactly which one is triggering the rejection of the module.
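
As a rough illustration of that printk idea, here is a userspace sketch
in C.  Everything in it is hypothetical: sketch_identify_module(), the
REJECT_MODULE macro, and the placeholder constants are stand-ins rather
than the driver's actual code, and the three checks simply mirror the
possibilities raised in the quoted message.  In the real module the
fprintf() would be a printk(), placed at each spot that sets the status
to SFP_NOT_SUPPORTED.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder values for this sketch; the real ones live in the driver headers. */
#define SKETCH_SUCCESS            0
#define SKETCH_SFP_NOT_SUPPORTED  (-19)
#define SKETCH_ID_QSFP_PLUS       0x0D

/* Remembers which rejection site fired last, so a caller can inspect it. */
static const char *last_reject_reason = NULL;

/*
 * Log the function and line of the rejection, then return the error.
 * Inside the kernel module this would be printk(KERN_ERR ...) rather
 * than fprintf(stderr, ...).
 */
#define REJECT_MODULE(reason)                                   \
    do {                                                        \
        last_reject_reason = (reason);                          \
        fprintf(stderr, "%s:%d: SFP_NOT_SUPPORTED: %s\n",       \
                __func__, __LINE__, (reason));                  \
        return SKETCH_SFP_NOT_SUPPORTED;                        \
    } while (0)

/* Hypothetical stand-in for an identify routine with several exit paths. */
static int sketch_identify_module(int identifier, bool active_cable,
                                  bool vendor_ok, bool allow_unsupported)
{
    if (identifier != SKETCH_ID_QSFP_PLUS)
        REJECT_MODULE("identifier is not QSFP+");
    if (!active_cable)
        REJECT_MODULE("cable is not active");
    if (!vendor_ok && !allow_unsupported)
        REJECT_MODULE("vendor not on supported list");
    return SKETCH_SUCCESS;
}
```

Calling sketch_identify_module(0x03, true, true, false) logs the
function name and line of the first check and returns the error, which
is the information needed to tell which rejection path a port is hitting.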

- Alex
