From: Alexander Duyck
To: Alex Forster
Cc: "dev@dpdk.org"
Date: Thu, 15 Oct 2015 12:53:02 -0700
Subject: Re: [dpdk-dev] Question about unsupported transceivers
Message-ID: <5620041E.1060309@gmail.com>
References: <561FD17E.6070908@gmail.com>
List-Id: patches and discussions about DPDK

On 10/15/2015 10:13 AM, Alex Forster wrote:
> On 10/15/15, 12:17 PM, "Alexander Duyck" wrote:
>
>> On 10/15/2015 08:43 AM, Alex Forster wrote:
>>> On 10/15/15, 11:30 AM, "Alexander Duyck" wrote:
>>>
>>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" wrote:
>>>>>
>>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the
>>>>>> module parameters are comma separated with one index per port.
>>>>>> So if you have two ports you should be passing
>>>>>> "allow_unsupported_sfp=1,1", and for 4 you would need four '1's.
>>>>> This seemed very promising. I compiled and installed the out of
>>>>> tree ixgbe driver and set the option in /etc/modprobe.d/ixgbe.conf.
>>>>> dmesg shows all eight "allow_unsupported_sfp enabled" messages but
>>>>> the last four ports still error out with the unsupported SFP
>>>>> message when running the tests.
>>>>>
>>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>>> verification code in ixgbe, are there any other tips I should know?
>>>> Can you send me the command you used to load the module, and the exact
>>>> number of ixgbe ports you have in the system? With that I could then
>>>> verify that the command was entered correctly as it is possible there
>>>> could still be an issue in the way the command was entered.
>>>>
>>>> One other possibility is that when the driver loads each load counts as
>>>> an instance in the module parameter array. So if for example you unbind
>>>> the driver on one port and then later rebind it you will have consumed
>>>> one of the values in the array. Do it enough times and you exceed the
>>>> bounds of the array as you entered it and it will simply use the
>>>> default value of 0.
>>>>
>>>> Also the output of "ethtool -i " would be useful to verify that
>>>> you have the out-of-tree driver loaded and not the in-kernel one.
>>>>
>>>> - Alex
>>>>
>>> Er, let me try that again.
>>>
>>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>>
>>> Alex Forster
>>>
>> It looks like you are probably seeing interfaces be unbound and then
>> rebound. As such you are likely pushing things outside of the array
>> boundary. One solution might just be to add more ",1"s if you are only
>> going to be doing this kind of thing at boot up. The upper limit for
>> the array is 32 entries so as long as you only are setting this up once
>> you could probably get away with that.
>>
>> An alternative would be to modify the definition of the parameter in
>> ixgbe_param.c. If you look through the file you should find several
>> lines like the ones below:
>>     struct ixgbe_option opt = {
>>         .type = enable_option,
>>         .name = "allow_unsupported_sfp",
>>         .err = "defaulting to Disabled",
>>         .def = OPTION_DISABLED
>>     };
>>
>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>> reinstall your driver you should be able to have it install without any
>> issues.
>>
>> - Alex
>>
> Yeah, I've had roughly the same thought process since you mentioned the
> args array.
> My first idea was "maybe the driver can't fit all of my 1's"
> but I saw it was defined at 32. Then I decided to just patch the whole
> enable_unsupported_sfp option out
> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
> failing.
>
> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>
>     /* reset_hw fills in the perm_addr as well */
>     hw->phy.reset_if_overtemp = true;
>     err = hw->mac.ops.reset_hw(hw);
>     hw->phy.reset_if_overtemp = false;
>     if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
>         err = IXGBE_SUCCESS;
>     } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
>         e_dev_err("failed to load because an unsupported SFP+ or QSFP "
>                   "module type was detected.\n");
>         e_dev_err("Reload the driver after installing a supported "
>                   "module.\n");
>         goto err_sw_init;
>     } else if (err) {
>         e_dev_err("HW Init failed: %d\n", err);
>         goto err_sw_init;
>     }
>
> I've attempted a hand-stacktrace and came up with the following...
>
> ixgbe_82599.c@1016
>   * ixgbe_reset_hw_82599() is defined
>   * calls phy->ops.init() which potentially returns
>     IXGBE_ERR_SFP_NOT_SUPPORTED
>
> ixgbe_82599.c@102
>   * ixgbe_init_phy_ops_82599() is defined
>   * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
>     phy->ops.identify()
>
> ixgbe_82599.c@2085
>   * ixgbe_identify_phy_82599() is defined
>   * calls ixgbe_identify_module_generic()
>
> ixgbe_phy.c@1281
>   * ixgbe_identify_module_generic() is defined
>   * calls ixgbe_identify_qsfp_module_generic()
>
> ixgbe_phy.c@1663
>   * ixgbe_identify_qsfp_module_generic() is defined
>   * We fail somewhere before the ending call to ixgbe_get_device_caps()
>     which does take allow_unsupported_sfp into account
>
>   * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
>     &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>   * Possibility: active_cable != true
>
> And then I'm over my head. Should I assume from here that the most likely
> explanation is a bad transceiver or bad fiber?
>
> Alex Forster

Are you able to swap the transceivers or fiber between the 4 ports that
work and the 4 that don't? If you could do that then you should be able
to tell if the issue is following the NIC ports, or if it is an issue
with the external connections. If the issue follows the transceiver or
fiber, then that component is probably the cause.

The other thing you could try doing is adding a printk to the spots
where the status is set to SFP_NOT_SUPPORTED so that you could figure
out exactly which spot is triggering the rejection of the module.

- Alex
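
To make the printk suggestion concrete, here is a rough userspace sketch of the idea, not actual driver code. The SFP_REJECT macro, the last_reject_* variables, and the identify_module() function are all hypothetical names made up for illustration, and the numeric value given to IXGBE_ERR_SFP_NOT_SUPPORTED is only a stand-in. In the driver you would put a printk() at each place the status is assigned instead of the fprintf() used here.

```c
#include <stdio.h>

/* Stand-in for the driver's error code. The real definition lives in the
 * ixgbe sources; the value here is only illustrative. */
#define IXGBE_ERR_SFP_NOT_SUPPORTED (-19)

/* Hypothetical bookkeeping: remembers the most recent rejection site. */
const char *last_reject_func;
int last_reject_line;

/* Hypothetical helper macro. It logs the function and line where the
 * rejection happens, then sets the status. In the kernel the fprintf()
 * would be a printk() call. */
#define SFP_REJECT(status_var)                                            \
    do {                                                                  \
        last_reject_func = __func__;                                      \
        last_reject_line = __LINE__;                                      \
        fprintf(stderr, "SFP rejected at %s:%d\n", __func__, __LINE__);   \
        (status_var) = IXGBE_ERR_SFP_NOT_SUPPORTED;                       \
    } while (0)

/* Toy identify routine with two rejection sites, loosely mirroring the
 * two possibilities listed in the quoted trace (unexpected identifier,
 * cable not reported as active). It is not the driver's logic. */
int identify_module(int identifier_ok, int active_cable)
{
    int status = 0;

    if (!identifier_ok) {
        SFP_REJECT(status);     /* rejection site 1 */
        return status;
    }
    if (!active_cable) {
        SFP_REJECT(status);     /* rejection site 2 */
        return status;
    }
    return status;
}
```

Since each site logs its own function name and line number, the message in the log tells you which check rejected the module without having to guess from the return code alone.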
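
For reference, the parameter-array behavior discussed earlier in the thread can be modeled with a few lines of C. This is only a sketch of the behavior as I described it, not a copy of the code in ixgbe_param.c. The param_list structure, the probes_seen counter, and the probe_allow_unsupported_sfp() function are hypothetical names for illustration.

```c
#include <stddef.h>

#define MAX_PORTS 32        /* upper limit of the array mentioned in the thread */
#define OPTION_DISABLED 0   /* default used once the supplied values run out */

/* Hypothetical model of the values parsed from a module option such as
 * "allow_unsupported_sfp=1,1". */
struct param_list {
    int values[MAX_PORTS];
    size_t count;           /* how many values the user supplied */
};

/* Example list for a two-port system, i.e. "allow_unsupported_sfp=1,1". */
struct param_list two_ports = { .values = { 1, 1 }, .count = 2 };

/* Number of probes seen so far. It only ever grows, so a rebind after
 * an unbind takes a fresh index instead of reusing the old one. */
size_t probes_seen;

/* Returns the option value for the next probed port, falling back to
 * the default once the index passes the end of the supplied list. */
int probe_allow_unsupported_sfp(const struct param_list *p)
{
    size_t idx = probes_seen++;

    if (idx < p->count)
        return p->values[idx];
    return OPTION_DISABLED;
}
```

With two values supplied, the first two probes get 1 and any later probe, such as a rebind of one of the same two ports, gets the default of 0. That is why padding the list with extra ",1" entries up to the limit can work around the problem.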