From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Fri, 31 Jan 2020 23:20:22 +0000
Subject: [dpdk-dev] [Bug 388] ixgbe: link state race condition can occur when starting a fiber port

https://bugs.dpdk.org/show_bug.cgi?id=388

Bug ID: 388
Summary: ixgbe: link state race condition can occur when starting a fiber port
Product: DPDK
Version: 19.08
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: mgsmith@netgate.com
Target Milestone: ---

Created attachment 81
  --> https://bugs.dpdk.org/attachment.cgi?id=81&action=edit
patch

Overview:

If the link is down when ports on an SFP+ X552 (device ID 0x15ac) are started,
a race condition can occur that prevents them from working when the link peer
becomes available and the link comes up.

If 2 ports are started individually with some time in between them, the issue
is not observed. The race condition seems to occur only when one port is
started and then the other is started immediately afterwards (e.g. via a script
or a control plane programmatically applying configuration).

Steps to reproduce:

1. Install FD.IO VPP packages (available at
https://packagecloud.io/fdio/release - vpp, vpp-lib, vpp-plugins needed) on a
CentOS 7 system with X552 SFP+ devices attached.
2. If the X552 ports are bound to the kernel ixgbe driver, take them
administratively down with '[sudo] ifdown eth0' so VPP will take over
management.
3. Start VPP with '[sudo] systemctl start vpp'.
4. Create a text file commands.txt containing API commands to start the ports:

    echo 'sw_interface_set_flags sw_if_index 1 admin-up
    sw_interface_set_flags sw_if_index 2 admin-up' > commands.txt

5. Remove the SFP+ cables from the X552 ports so that link will not be
established when they are brought up.
6. Run the commands to start both ports in rapid succession with '[sudo]
vpp_api_test in commands.txt'.
7. Check the link state by running '[sudo] vppctl show hardware-interface'. The
link speed should be displayed as "Unknown" and the link state should be
displayed as "no carrier".
8. Connect an SFP+ cable between the two ports.
9. Check the link state again. One port may now show that its link is up and
report the link speed. The other should still report Unknown/no carrier.

Actual results:

The second port started reports that its link is down and never recovers, even
if the port is stopped and restarted.

Expected results:

The second port reports that its link is up, and it can forward and receive
packets.

Build date and hardware:

Observed in DPDK 19.08 (VPP 20.01). Current DPDK master branch appears to have
the same issue.

Observed on a Xeon-D 1537 SoC with 2 copper i350 ports and 2 SFP+ X552 ports.

Additional information:

Attached gdb and found that when rte_eth_link_get_nowait() is called for the
port which was having the issue, ixgbe_dev_link_update_share() would return
before attempting to check the link state because the
IXGBE_FLAG_NEED_LINK_CONFIG flag was set on the struct ixgbe_interrupt for the
device. Further exploration showed that the following sequence of events
occurred:

1. ixgbe_dev_link_update_share() sets the IXGBE_FLAG_NEED_LINK_CONFIG flag and
schedules ixgbe_dev_setup_link_alarm_handler() to run after 10us.
2. ixgbe_dev_start() is executed and cancels the execution of
ixgbe_dev_setup_link_alarm_handler().
3. Since ixgbe_dev_setup_link_alarm_handler() is where the
IXGBE_FLAG_NEED_LINK_CONFIG flag would normally be cleared and its execution
was cancelled, the flag remains set. All subsequent calls to
ixgbe_dev_link_update_share() return early and never actually check the link
state again.

The attached patch seems to fix the issue.
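
For readers who want to reason about the sequence of events without the
hardware, here is a minimal toy model in plain C. It is not the ixgbe driver
code and not the attached patch: every toy_* name, the flag value, and the
assumption that the flag is set when the link is found down are illustrative
only. Each toy function stands in for the driver function named in its
comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IXGBE_FLAG_NEED_LINK_CONFIG; the value is arbitrary. */
#define TOY_FLAG_NEED_LINK_CONFIG (1u << 3)

struct toy_port {
    uint32_t flags;        /* models the flags in struct ixgbe_interrupt */
    bool alarm_pending;    /* models the scheduled setup-link alarm */
    bool peer_link_up;     /* true once the SFP+ peer is connected */
    bool reported_link_up; /* last link state reported to the caller */
};

/* Models ixgbe_dev_setup_link_alarm_handler(): the place where the flag is cleared. */
static void toy_setup_link_alarm_handler(struct toy_port *p)
{
    assert(p->alarm_pending);
    p->alarm_pending = false;
    p->flags &= ~TOY_FLAG_NEED_LINK_CONFIG;
}

/* Models ixgbe_dev_link_update_share(). Returns true if the link was actually checked. */
static bool toy_link_update(struct toy_port *p)
{
    if (p->flags & TOY_FLAG_NEED_LINK_CONFIG)
        return false; /* early return: link state is never read */

    p->reported_link_up = p->peer_link_up;
    if (!p->reported_link_up) {
        /* Assumption: link down triggers the flag and the deferred handler. */
        p->flags |= TOY_FLAG_NEED_LINK_CONFIG;
        p->alarm_pending = true;
    }
    return true;
}

/* Models the alarm cancellation in ixgbe_dev_start(): the flag is left untouched. */
static void toy_dev_start_cancel_alarm(struct toy_port *p)
{
    p->alarm_pending = false;
}

/*
 * Replays the reported sequence and returns the link state reported after the
 * peer comes up. With start_cancels_alarm == false the alarm handler runs
 * instead of being cancelled.
 */
bool toy_run_sequence(bool start_cancels_alarm)
{
    struct toy_port p = {0};

    toy_link_update(&p);                  /* event 1: flag set, alarm scheduled */
    if (start_cancels_alarm)
        toy_dev_start_cancel_alarm(&p);   /* event 2: alarm cancelled */
    else
        toy_setup_link_alarm_handler(&p); /* alarm fires, flag cleared */

    p.peer_link_up = true;                /* cable is connected */
    toy_link_update(&p);                  /* event 3: may return early */
    return p.reported_link_up;
}
```

In this model, toy_run_sequence(true) returns false because the flag is still
set when the peer comes up, while toy_run_sequence(false) returns true because
the handler had a chance to clear it.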