* [dpdk-dev] [Bug 388] ixgbe: link state race condition can occur when starting a fiber port
@ 2020-01-31 23:20 bugzilla
From: bugzilla @ 2020-01-31 23:20 UTC (permalink / raw)
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=388
Bug ID: 388
Summary: ixgbe: link state race condition can occur when
starting a fiber port
Product: DPDK
Version: 19.08
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: mgsmith@netgate.com
Target Milestone: ---
Created attachment 81
--> https://bugs.dpdk.org/attachment.cgi?id=81&action=edit
patch
Overview:
If the link is down when ports on an SFP+ X552 (device ID 0x15ac) are started,
a race condition can occur that prevents them from working when the link peer
becomes available and the link comes up.
If 2 ports are started individually with some time in between them, the issue
is not observed. The race condition seems to occur only when one port is
started and then the other is started immediately afterwards (e.g. via script
or control plane programmatically applying configuration).
Steps to reproduce:
1. Install FD.IO VPP packages (available at
https://packagecloud.io/fdio/release - vpp, vpp-lib, vpp-plugins needed) on a
CentOS 7 system with X552 SFP+ devices attached.
2. If the X552 ports are bound to the kernel ixgbe driver, take them
administratively down with '[sudo] ifdown eth0' so that VPP can take over
management.
3. Start VPP with '[sudo] systemctl start vpp'.
4. Create a text file commands.txt containing API commands to start the ports:
echo 'sw_interface_set_flags sw_if_index 1 admin-up
sw_interface_set_flags sw_if_index 2 admin-up' > commands.txt
5. Remove the SFP+ cables from the X552 ports so that link will not be
established when they are brought up.
6. Run commands to start both ports in rapid succession with '[sudo]
vpp_api_test in commands.txt'
7. Check the link state by running '[sudo] vppctl show hardware-interface'. The
link speed should be displayed as "Unknown" and the link state should be
displayed as "no carrier".
8. Connect an SFP+ cable between the two ports.
9. Check the link state again. One port may now show that it is up and report
the link speed. The other should still report Unknown/no carrier.
Actual results:
The second port started reports that its link is down and never recovers, even
if the port is stopped and restarted.
Expected results:
The second port reports that its link is up and can forward and receive
packets.
Build date and hardware:
Observed in DPDK 19.08 (VPP 20.01). Current DPDK master branch appears to have
the same issue.
Observed on a Xeon-D 1537 SoC with 2 copper i350 ports and 2 SFP+ X552 ports.
Additional information:
Attached gdb and found that when rte_eth_link_get_nowait() is called for the
port which was having the issue, ixgbe_dev_link_update_share() would return
before attempting to check the link state because the
IXGBE_FLAG_NEED_LINK_CONFIG flag was set on the struct ixgbe_interrupt for the
device. Further exploration showed that the following sequence of events occurred:
1. ixgbe_dev_link_update_share() sets the IXGBE_FLAG_NEED_LINK_CONFIG flag and
schedules ixgbe_dev_setup_link_alarm_handler() to run after 10us.
2. ixgbe_dev_start() is executed and cancels the execution of
ixgbe_dev_setup_link_alarm_handler().
3. Since ixgbe_dev_setup_link_alarm_handler() is where the
IXGBE_FLAG_NEED_LINK_CONFIG flag would normally be cleared and its execution
was cancelled, the flag remains set. All subsequent calls to
ixgbe_dev_link_update_share() return early and never actually check the link
state again.
The attached patch seems to fix the issue.
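To make the sequence above easier to follow, here is a minimal, self-contained
C model of it. This is only a sketch and not the driver code: every sketch_*
name, the struct layout, and the SKETCH_FLAG_NEED_LINK_CONFIG value are
hypothetical stand-ins, and the alarm is modelled as a plain boolean rather
than a real timer. The clear_flag_on_cancel parameter illustrates one possible
mitigation (clearing the flag whenever the pending handler is cancelled); it is
an assumption for illustration and is not necessarily what the attached patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IXGBE_FLAG_NEED_LINK_CONFIG; value is arbitrary. */
#define SKETCH_FLAG_NEED_LINK_CONFIG (1u << 0)

struct sketch_port {
    uint32_t flags;      /* models the flags kept in struct ixgbe_interrupt */
    bool alarm_pending;  /* models a scheduled link-setup alarm */
    bool peer_present;   /* true once a cable / link peer is attached */
    bool link_up;        /* last link state reported to the caller */
};

/* Models the link update path: returns early while the flag is set. */
static int sketch_link_update(struct sketch_port *p)
{
    if (p->flags & SKETCH_FLAG_NEED_LINK_CONFIG)
        return -1;                  /* early return: link is never re-checked */

    p->link_up = p->peer_present;   /* stands in for querying the hardware */
    if (!p->link_up) {
        p->flags |= SKETCH_FLAG_NEED_LINK_CONFIG;
        p->alarm_pending = true;    /* handler scheduled to run shortly */
    }
    return 0;
}

/* Models the alarm handler: the only place the flag is normally cleared. */
static void sketch_alarm_fire(struct sketch_port *p)
{
    if (!p->alarm_pending)
        return;                     /* cancelled alarms never run */
    p->alarm_pending = false;
    p->flags &= ~SKETCH_FLAG_NEED_LINK_CONFIG;
}

/* Models device start: cancels any pending alarm before (re)starting. */
static void sketch_dev_start(struct sketch_port *p, bool clear_flag_on_cancel)
{
    p->alarm_pending = false;
    if (clear_flag_on_cancel)       /* hypothetical mitigation, see lead-in */
        p->flags &= ~SKETCH_FLAG_NEED_LINK_CONFIG;
}

/* Replays steps 1-3; returns true if the link is seen as up afterwards. */
static bool sketch_replay(bool clear_flag_on_cancel)
{
    struct sketch_port p = {0};

    sketch_link_update(&p);                      /* step 1: flag set, alarm armed */
    assert(p.alarm_pending);
    sketch_dev_start(&p, clear_flag_on_cancel);  /* step 2: alarm cancelled */
    sketch_alarm_fire(&p);                       /* no-op, the alarm is gone */
    p.peer_present = true;                       /* cable is connected */
    sketch_link_update(&p);                      /* step 3: later link query */
    return p.link_up;
}
```

In this model, sketch_replay(false) leaves the flag set after the cancel, so the
later link query returns early and the link is never reported as up, which
matches the behaviour described above. sketch_replay(true) clears the flag at
cancel time and the link is then detected.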