DPDK patches and discussions
 help / color / mirror / Atom feed
From: Edwin Brossette <edwin.brossette@6wind.com>
To: dev@dpdk.org
Cc: Olivier Matz <olivier.matz@6wind.com>
Subject: net/ixgbe: all interrupt disabled after enabling lsc on X552/X557-AT card
Date: Thu, 18 Apr 2024 15:39:54 +0200	[thread overview]
Message-ID: <CANDF9xC+4yBVxX8sgXzM0EdYcQFhzonG8FnnTQ1trSDxAkvAaQ@mail.gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 7028 bytes --]

Hello,

I have found a bug in the ixgbe pmd, concerning this model of nic in
particular:
     -> Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T

I was able to detect this bug after enabling the use of Link state change
interrupt in the card via the following flag:
    - port->rte_conf.intr_conf.lsc = 1;

With these interrupts enabled on this particular network card, there is a
chance on every link-up that all interrupts get disabled, as the interrupt
mask is unintentionally set to 0.
This happens because there are two interrupts that are triggered on a link
status change on this model of nic. See related code here:
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_ethdev.c?id=c99e1db87798c2cd2c0ac94c7676e961eedffc87#n4552

Whenever an lsc interrupt is triggered, we enter this
ixgbe_dev_interrupt_get_status() function.
Bit 20 (IXGBE_EICR_LSC ) and 25 (IXGBE_EICR_GPI_SDP0_X550EM_x ) of the
Extendend Interrupt Cause Register (EICR) will be set to '1' on a link
status change.
When the interrupt is handled in the ixgbe_dev_interrupt_delayed_handler()
function, if the IXGBE_FLAG_NEED_LINK_UPDATE flag is raised (happens when
EICR_LSC bit is raised in eicr),  an alarm for a delayed handler will be
programmed.

The following code poses issue:
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_ethdev.c?id=c99e1db87798c2cd2c0ac94c7676e961eedffc87#n4736

intr->mask = intr->mask_original;
intr->mask_original = 0;

Sometimes, it's possible to see a first interrupt with bit IXGBE_EICR_LSC
raised, quickly followed by a second interrupt with both bit IXGBE_EICR_LSC
and bit IXGBE_EICR_GPI_SDP0_X550EM_x being raised.
In this case, the flag IXGBE_FLAG_NEED_LINK_UPDATE will be raised in both
cases and the delayed handler will be programmed to be ran twice.
This will lead to intr->mask_original being set to 0 after the first
delayed_handler is done, and then intr->mask being also set to 0 after the
second time it's run.
This effectively disables all interrupts from working on the device, which
is particularly painful since we no longer poll for link state on our side,
having switched to interrupt-based link state handling.

My suggestion to solve this issue is to avoid creating an alarm for
ixgbe_dev_interrupt_delayed_handler() if there is already one pending. I do
this by checking for bit IXGBE_EIMS_LSC  in intr->mask.
I no longer see this issue after pushing in my local repository. I will
suggest a patch by responding to this mail.

I can show a way to reproduce bellow with test pmd. I have added some
custom logs into the code for my own convenience:

=====================================

    1) start testpmd:

PCI-SLOT      IFNAME  MAC-ADDRESS        KMOD   MAX SPEED        CURRENT
SPEED    DEVICE
0000:03:00.0  ntfp1   0c:c4:7a:75:8f:ec  ixgbe  1x2.5 GT/s PCIe  1x2.5 GT/s
PCIe  Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
0000:03:00.1  ntfp2   0c:c4:7a:75:8f:ed  ixgbe  1x2.5 GT/s PCIe  1x2.5 GT/s
PCIe  Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T

sudo dpdk-hugepages.py --setup 2G;
dpdk-devbind --bind=vfio-pci 0000:03:00.0
dpdk-devbind --bind=vfio-pci 0000:03:00.1

 dpdk-testpmd -a 0000:03:00.0 -a 0000:03:00.1 -- -i --rxq=2 --txq=2
--coremask=0x0c --total-num-mbufs=250000
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ixgbe (8086:15ad) device: 0000:03:00.0 (socket
-1)
EAL: Probe PCI driver: net_ixgbe (8086:15ad) device: 0000:03:00.1 (socket
-1)
Interactive-mode selected
previous number of forwarding cores 1 - changed to number of configured
cores 2
Warning: NUMA should be configured manually by using --port-numa-config and
--ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=250000, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 0C:C4:7A:75:8F:EC
Configuring Port 1 (socket 0)
Port 1: 0C:C4:7A:75:8F:ED
Checking link statuses...
Done


    2) To reproduce, simply repeatedly set link up and down. At one point
there should be something like this:

testpmd> set link-up port 0
ixgbe_dev_interrupt_get_status(): <><><><>><>: eicr register:
100000              <---------------------- IXGBE_EICR_LSC
ixgbe_dev_interrupt_get_status(): <><><><>><>: set
IXGBE_FLAG_NEED_LINK_UPDATE
ixgbe_dev_interrupt_get_status(): <><><><>><>: current flags: flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (1): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (2): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (3): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: for dev: 0x557c0cd844c0
ixgbe_dev_interrupt_action(): <><><><>><>: saved intr: mask_original =
36700160      <------------ original interrupt mask
ixgbe_dev_interrupt_action(): <><><><>><>: current intr: mask =
35651584                  <------------ interrupt mask with lsc disabled
ixgbe_dev_interrupt_get_status(): <><><><>><>: eicr register:
2100000            <---------------------- IXGBE_EICR_LSC and
IXGBE_EICR_GPI_SDP0_X550EM_x
ixgbe_dev_interrupt_get_status(): <><><><>><>: set
IXGBE_FLAG_NEED_LINK_UPDATE
ixgbe_dev_interrupt_get_status(): <><><><>><>: current flags: flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (1): flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (2): flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (3): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: for dev: 0x557c0cd844c0
ixgbe_dev_interrupt_action(): <><><><>><>: saved intr: mask_original =
35651584      <------------ original mask is lost here
ixgbe_dev_interrupt_action(): <><><><>><>: current intr: mask = 35651584

Port 0: link state change event
ixgbe_dev_interrupt_delayed_handler(): <><><><>><>: mask restored: 35651584
testpmd> set link-down port 0ixgbe_dev_interrupt_delayed_handler():
<><><><>><>: mask restored: 0   <--- all interrupts are disabled

    3) At this point, link state can no longer be updated:

testpmd> show port info 0

********************* Infos for port 0  *********************
MAC address: 0C:C4:7A:75:8F:EC
Device name: 0000:03:00.0
Driver name: net_ixgbe
Firmware-version: 0x800001cf
Devargs:
Connect to socket: 0
memory allocation on the socket: 0
Link status: up     <----------------------
Link speed: 10 Gbps
Link duplex: full-duplex
Autoneg status: On
MTU: 1500

testpmd> set link-down port 0
testpmd> show port info

********************* Infos for port 0  *********************
MAC address: 0C:C4:7A:75:8F:EC
Device name: 0000:03:00.0
Driver name: net_ixgbe
Firmware-version: 0x800001cf
Devargs:
Connect to socket: 0
memory allocation on the socket: 0
Link status: up    <------------- should not be up after link down.
Link speed: 10 Gbps
Link duplex: full-duplex
Autoneg status: On
MTU: 1500

[-- Attachment #2: Type: text/html, Size: 9331 bytes --]

             reply	other threads:[~2024-04-18 13:40 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-18 13:39 Edwin Brossette [this message]
2024-04-18 13:53 ` [PATCH] net/ixgbe: don't create a delayed interrupt handler if one already exists edwin.brossette
2024-05-16 11:41   ` Burakov, Anatoly
2024-05-21 14:50     ` Bruce Richardson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANDF9xC+4yBVxX8sgXzM0EdYcQFhzonG8FnnTQ1trSDxAkvAaQ@mail.gmail.com \
    --to=edwin.brossette@6wind.com \
    --cc=dev@dpdk.org \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).