From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f45.google.com (mail-pa0-f45.google.com [209.85.220.45])
 by dpdk.org (Postfix) with ESMTP id 2473CB62
 for ; Mon, 19 Oct 2015 17:08:35 +0200 (CEST)
Received: by padhk11 with SMTP id hk11so32491772pad.1
 for ; Mon, 19 Oct 2015 08:08:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20120113;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-type:content-transfer-encoding;
 bh=m19QgvnVljhdMWXe9yDNGq0TdPe3eBuwugnbX9NIpbc=;
 b=fbUvwqC0pSJ8wLxiEGmxnBFdJRz0lbsCYSuzZMjbr3jx864k6dVpM8+dj+8uHK9IXz
 1hLebt0y6VOMaYoshzC+2PNpKdxjKECQxSEUtlDA2G81S44bGhEbdGrx1KUqFBZNOyRa
 9QRxZcWdeU9FKYdU6NK0OEeDj/JftgPIAMoizLbBlHD5X49ybG13VJoQ1jPLMqV/euNO
 tXuVOFpIDHELk4az397DrUvLsmFPsHv24EDSqO1ITWNDXh8xt0DwCy7PcLssdncevkrv
 yregnjdSsyXS5EXa3rQzChkFYqq9i8bcDsRfkTMZw7aIULTu7ZQ4k7THLJc+kUGaK5im
 U88w==
X-Received: by 10.68.69.17 with SMTP id a17mr35334247pbu.10.1445267314404;
 Mon, 19 Oct 2015 08:08:34 -0700 (PDT)
Received: from [192.168.1.188] (static-50-53-21-5.bvtn.or.frontiernet.net.
 [50.53.21.5]) by smtp.googlemail.com with ESMTPSA id es4sm36864391pbc.42.2015.10.19.08.08.33
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 19 Oct 2015 08:08:33 -0700 (PDT)
To: Alex Forster
References: <561FD17E.6070908@gmail.com> <5620041E.1060309@gmail.com>
From: Alexander Duyck
Message-ID: <56250771.7070306@gmail.com>
Date: Mon, 19 Oct 2015 08:08:33 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] Question about unsupported transceivers
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 19 Oct 2015 15:08:35 -0000

On 10/18/2015 06:06 PM, Alex Forster wrote:
> On 10/15/15, 3:53 PM, "Alexander Duyck" wrote:
>
>
>>>> It looks like you are probably seeing interfaces be unbound and then
>>>> rebound. As such you are likely pushing things outside of the array
>>>> boundary. One solution might just be to add more ",1"s if you are only
>>>> going to be doing this kind of thing at boot up. The upper limit for
>>>> the array is 32 entries so as long as you only are setting this up once
>>>> you could probably get away with that.
>>>>
>>>> An alternative would be to modify the definition of the parameter in
>>>> ixgbe_param.c. If you look through the file you should find several
>>>> lines like below:
>>>> struct ixgbe_option opt = {
>>>>     .type = enable_option,
>>>>     .name = "allow_unsupported_sfp",
>>>>     .err = "defaulting to Disabled",
>>>>     .def = OPTION_DISABLED
>>>> };
>>>>
>>>> If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>>>> reinstall your driver you should be able to have it install without any
>>>> issues.
>>>>
>>>> - Alex
>>>>
>>> Yeah, I've had roughly the same thought process since you mentioned the
>>> args array. My first idea was "maybe the driver can't fit all of my 1's"
>>> but I saw it was defined at 32. Then I decided to just patch the whole
>>> allow_unsupported_sfp option out
>>> https://gist.github.com/AlexForster/112fd822704caf804849 but I'm still
>>> failing.
>>>
>>> I've been digging a bit, and I'm failing here in ixgbe_main.c...
>>>
>>> /* reset_hw fills in the perm_addr as well */
>>> hw->phy.reset_if_overtemp = true;
>>> err = hw->mac.ops.reset_hw(hw);
>>> hw->phy.reset_if_overtemp = false;
>>> if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
>>>     err = IXGBE_SUCCESS;
>>> } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
>>>     e_dev_err("failed to load because an unsupported SFP+ or QSFP "
>>>               "module type was detected.\n");
>>>     e_dev_err("Reload the driver after installing a supported "
>>>               "module.\n");
>>>     goto err_sw_init;
>>> } else if (err) {
>>>     e_dev_err("HW Init failed: %d\n", err);
>>>     goto err_sw_init;
>>> }
>>>
>>>
>>> I've attempted a hand-stacktrace and came up with the following...
>>>
>>> ixgbe_82599.c@1016
>>> * ixgbe_reset_hw_82599() is defined
>>> * calls phy->ops.init() which potentially returns
>>>   IXGBE_ERR_SFP_NOT_SUPPORTED
>>>
>>> ixgbe_82599.c@102
>>> * ixgbe_init_phy_ops_82599() is defined
>>> * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
>>>   phy->ops.identify()
>>>
>>> ixgbe_82599.c@2085
>>> * ixgbe_identify_phy_82599() is defined
>>> * calls ixgbe_identify_module_generic()
>>>
>>> ixgbe_phy.c@1281
>>> * ixgbe_identify_module_generic() is defined
>>> * calls ixgbe_identify_qsfp_module_generic()
>>>
>>> ixgbe_phy.c@1663
>>> * ixgbe_identify_qsfp_module_generic() is defined
>>> * We fail somewhere before the ending call to ixgbe_get_device_caps()
>>>   which does take allow_unsupported_sfp into account
>>>
>>> * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
>>>   &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
>>> * Possibility: active_cable != true
>>>
>>> And then I'm over my head. Should I assume from here that the most
>>> likely explanation is a bad transceiver or bad fiber?
>>>
>>> Alex Forster
>>
>> Are you able to swap transceiver or fiber between the 4 ports that work
>> and the 4 that don't? If you could do that then you should be able to
>> tell if the issue is following the NIC ports, or if it is an issue with
>> the external connections. If the issue is following the transceiver
>> or fiber then that is probably what is causing it.
>>
>> The other thing you could try doing is adding a printk to the spots
>> where the status is set to SFP_NOT_SUPPORTED so that you could figure
>> out exactly which spot is triggering the rejection of the module.
>>
>> - Alex
>
> I had remote hands swap fibers on the QSFP side and the issue moved to the
> first card, so I'm going to have the fibers cleaned and tested. This
> appears to be my issue.
>
> I'd like to submit a patch for ixgbe_identify_qsfp_module_generic() to
> print more helpful errors in the two cases mentioned above, so that
> hopefully nobody ever has to deal with this again. Would I be wasting my
> time, or does something like this have a likelihood of being accepted?
>
> Thank you for all of your help! I wouldn't have figured this out nearly as
> quickly without it.
>
> Alex Forster

I suspect there would be some value to such a patch; just make sure to
explain the reason for needing it in the patch description.

My advice would be to base such a patch on what is in Jeff Kirsher's
next queue
(https://git.kernel.org/cgit/linux/kernel/git/jkirsher/next-queue.git/).
The mailing lists for submitting patches are
intel-wired-lan@lists.osuosl.org and netdev@vger.kernel.org.

- Alex
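
For illustration, the two failure cases discussed in this thread could be
reported separately along the following lines. This is only a minimal
userspace sketch, not ixgbe code: every sketch_* / SKETCH_* name is
hypothetical, the struct stands in for values the driver would read from the
module EEPROM, and the identifier value 0x0D is assumed to correspond to
IXGBE_SFF_IDENTIFIER_QSFP_PLUS. A real patch would presumably use the
driver's own logging macros rather than fprintf().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed identifier byte for a QSFP+ module (hypothetical stand-in
 * for the driver's IXGBE_SFF_IDENTIFIER_QSFP_PLUS). */
#define SKETCH_SFF_ID_QSFP_PLUS 0x0D

/* Hypothetical status codes; the real driver uses IXGBE_* values. */
enum sketch_status {
    SKETCH_SUCCESS = 0,
    SKETCH_ERR_SFP_NOT_SUPPORTED = -1
};

/* Why the module was rejected, so a caller can tell the cases apart. */
enum sketch_reason {
    SKETCH_REASON_NONE = 0,
    SKETCH_REASON_BAD_IDENTIFIER,
    SKETCH_REASON_NOT_ACTIVE_CABLE
};

/* Stand-in for the values the driver would read over I2C. */
struct sketch_module {
    uint8_t identifier;  /* identifier byte reported by the module */
    bool active_cable;   /* result of the active-cable checks */
};

/* Models only the two rejection points named in the thread; the real
 * function accepts more module types than this simplification does. */
enum sketch_status sketch_identify_qsfp(const struct sketch_module *mod,
                                        enum sketch_reason *reason)
{
    *reason = SKETCH_REASON_NONE;

    if (mod->identifier != SKETCH_SFF_ID_QSFP_PLUS) {
        *reason = SKETCH_REASON_BAD_IDENTIFIER;
        fprintf(stderr,
                "QSFP identify: identifier byte 0x%02x is not QSFP+ (0x%02x)\n",
                (unsigned)mod->identifier,
                (unsigned)SKETCH_SFF_ID_QSFP_PLUS);
        return SKETCH_ERR_SFP_NOT_SUPPORTED;
    }

    if (!mod->active_cable) {
        *reason = SKETCH_REASON_NOT_ACTIVE_CABLE;
        fprintf(stderr,
                "QSFP identify: module is not a recognized active cable\n");
        return SKETCH_ERR_SFP_NOT_SUPPORTED;
    }

    return SKETCH_SUCCESS;
}
```

With something like this, a caller sees which check rejected the module
instead of only a generic "unsupported module" failure.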