Hello,

I have found a bug in the ixgbe pmd, concerning this model of nic in particular:
     -> Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T

I was able to detect this bug after enabling the use of Link state change interrupt in the card via the following flag:
    - port->rte_conf.intr_conf.lsc = 1;

With these interrupts enabled on this particular network card, there is a chance on every link-up that all interrupts get disabled, as the interrupt mask is unintentionally set to 0.
This happens because there are two interrupts that are triggered on a link status change on this model of nic. See related code here:
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_ethdev.c?id=c99e1db87798c2cd2c0ac94c7676e961eedffc87#n4552

Whenever an lsc interrupt is triggered, we enter this ixgbe_dev_interrupt_get_status() function.
Bit 20 (IXGBE_EICR_LSC ) and 25 (IXGBE_EICR_GPI_SDP0_X550EM_x ) of the Extendend Interrupt Cause Register (EICR) will be set to '1' on a link status change.
When the interrupt is handled in the ixgbe_dev_interrupt_delayed_handler() function, if the IXGBE_FLAG_NEED_LINK_UPDATE flag is raised (happens when EICR_LSC bit is raised in eicr),  an alarm for a delayed handler will be programmed.

The following code poses issue:
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_ethdev.c?id=c99e1db87798c2cd2c0ac94c7676e961eedffc87#n4736
intr->mask = intr->mask_original;
intr->mask_original = 0;
Sometimes, it's possible to see a first interrupt with bit IXGBE_EICR_LSC raised, quickly followed by a second interrupt with both bit IXGBE_EICR_LSC and bit IXGBE_EICR_GPI_SDP0_X550EM_x being raised.
In this case, the flag IXGBE_FLAG_NEED_LINK_UPDATE will be raised in both cases and the delayed handler will be programmed to be ran twice.
This will lead to intr->mask_original being set to 0 after the first delayed_handler is done, and then intr->mask being also set to 0 after the second time it's run.
This effectively disables all interrupts from working on the device, which is particularly painful since we no longer poll for link state on our side, having switched to interrupt-based link state handling.

My suggestion to solve this issue is to avoid creating an alarm for ixgbe_dev_interrupt_delayed_handler() if there is already one pending. I do this by checking for bit IXGBE_EIMS_LSC  in intr->mask.
I no longer see this issue after pushing in my local repository. I will suggest a patch by responding to this mail.

I can show a way to reproduce bellow with test pmd. I have added some custom logs into the code for my own convenience:

=====================================

    1) start testpmd:

PCI-SLOT      IFNAME  MAC-ADDRESS        KMOD   MAX SPEED        CURRENT SPEED    DEVICE
0000:03:00.0  ntfp1   0c:c4:7a:75:8f:ec  ixgbe  1x2.5 GT/s PCIe  1x2.5 GT/s PCIe  Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
0000:03:00.1  ntfp2   0c:c4:7a:75:8f:ed  ixgbe  1x2.5 GT/s PCIe  1x2.5 GT/s PCIe  Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T

sudo dpdk-hugepages.py --setup 2G;
dpdk-devbind --bind=vfio-pci 0000:03:00.0
dpdk-devbind --bind=vfio-pci 0000:03:00.1

 dpdk-testpmd -a 0000:03:00.0 -a 0000:03:00.1 -- -i --rxq=2 --txq=2 --coremask=0x0c --total-num-mbufs=250000
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ixgbe (8086:15ad) device: 0000:03:00.0 (socket -1)
EAL: Probe PCI driver: net_ixgbe (8086:15ad) device: 0000:03:00.1 (socket -1)
Interactive-mode selected
previous number of forwarding cores 1 - changed to number of configured cores 2
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=250000, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 0C:C4:7A:75:8F:EC
Configuring Port 1 (socket 0)
Port 1: 0C:C4:7A:75:8F:ED
Checking link statuses...
Done


    2) To reproduce, simply repeatedly set link up and down. At one point there should be something like this:

testpmd> set link-up port 0
ixgbe_dev_interrupt_get_status(): <><><><>><>: eicr register: 100000              <---------------------- IXGBE_EICR_LSC
ixgbe_dev_interrupt_get_status(): <><><><>><>: set IXGBE_FLAG_NEED_LINK_UPDATE
ixgbe_dev_interrupt_get_status(): <><><><>><>: current flags: flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (1): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (2): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (3): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: for dev: 0x557c0cd844c0
ixgbe_dev_interrupt_action(): <><><><>><>: saved intr: mask_original = 36700160      <------------ original interrupt mask
ixgbe_dev_interrupt_action(): <><><><>><>: current intr: mask = 35651584                  <------------ interrupt mask with lsc disabled
ixgbe_dev_interrupt_get_status(): <><><><>><>: eicr register: 2100000            <---------------------- IXGBE_EICR_LSC and IXGBE_EICR_GPI_SDP0_X550EM_x
ixgbe_dev_interrupt_get_status(): <><><><>><>: set IXGBE_FLAG_NEED_LINK_UPDATE
ixgbe_dev_interrupt_get_status(): <><><><>><>: current flags: flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (1): flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (2): flags = 5
ixgbe_dev_interrupt_action(): <><><><>><>: current flags (3): flags = 1
ixgbe_dev_interrupt_action(): <><><><>><>: for dev: 0x557c0cd844c0
ixgbe_dev_interrupt_action(): <><><><>><>: saved intr: mask_original = 35651584      <------------ original mask is lost here
ixgbe_dev_interrupt_action(): <><><><>><>: current intr: mask = 35651584

Port 0: link state change event
ixgbe_dev_interrupt_delayed_handler(): <><><><>><>: mask restored: 35651584
testpmd> set link-down port 0ixgbe_dev_interrupt_delayed_handler(): <><><><>><>: mask restored: 0   <--- all interrupts are disabled

    3) At this point, link state can no longer be updated:

testpmd> show port info 0

********************* Infos for port 0  *********************
MAC address: 0C:C4:7A:75:8F:EC
Device name: 0000:03:00.0
Driver name: net_ixgbe
Firmware-version: 0x800001cf
Devargs:
Connect to socket: 0
memory allocation on the socket: 0
Link status: up     <----------------------
Link speed: 10 Gbps
Link duplex: full-duplex
Autoneg status: On
MTU: 1500

testpmd> set link-down port 0
testpmd> show port info

********************* Infos for port 0  *********************
MAC address: 0C:C4:7A:75:8F:EC
Device name: 0000:03:00.0
Driver name: net_ixgbe
Firmware-version: 0x800001cf
Devargs:
Connect to socket: 0
memory allocation on the socket: 0
Link status: up    <------------- should not be up after link down.
Link speed: 10 Gbps
Link duplex: full-duplex
Autoneg status: On
MTU: 1500