Summary: We’ve noticed that the PMD for the X540-AT2 does not recover properly after it reports dropped packets (seen as increments of rte_eth_stats.imissed) due to CPU overload. The driver continues to drop packets even after the CPU load subsides. A restart of just the driver (with no other changes) fixes the issue. Other drivers (such as the I350 1GbE PMD) do not exhibit this behavior under identical traffic conditions with everything else being the same, which leads us to suspect a bug in the X540 PMD.


Here are more details:


We have a very simple application using DPDK 21.11 on an x86_64 platform running CentOS 8. The application receives 64-byte packets at 1 Gb/s on port 0 of the X540-AT2 card, does some processing, and transmits them on port 1 of the same card. Everything is fine when the CPU load is moderate. When the processing load saturates the CPU core, the imissed count increments (as expected) because the PMD cannot keep up with the received packets. The real issue is that the driver continues to miss packets and increment imissed even after the CPU load subsides to levels at which it previously reported no dropped packets. Restarting the X540 driver using rte_eth_dev_stop() and rte_eth_dev_start() fixes the issue. Here’s the sequence:


  1. CPU core moderately loaded, X540-AT2 PMD reports no missed packets (all good).
  2. CPU core saturated, PMD reports missed packets (as expected).
  3. CPU core load subsides to roughly the level in item 1 above, but the PMD continues to drop packets and increment imissed (strange).
  4. The dropped packets stop after restarting the port driver by calling rte_eth_dev_stop() and then rte_eth_dev_start(), with no other changes and no restart of the overall process/thread/application.
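
For reference, the workaround in step 4 is nothing more than the following (a minimal sketch of our recovery path; port_id handling and error reporting are simplified, and rte_eth_dev_stop() returns int as of DPDK 20.11):

```c
#include <rte_ethdev.h>

/* Workaround sketch: restart only the affected port.
 * Assumes the port was already configured and started once;
 * queue configuration is preserved across stop/start. */
static int
restart_port(uint16_t port_id)
{
	int ret;

	ret = rte_eth_dev_stop(port_id);   /* halt RX/TX on this port */
	if (ret != 0)
		return ret;

	return rte_eth_dev_start(port_id); /* re-arm RX/TX; imissed stops growing */
}
```

After this call on port 0, the imissed counter stops incrementing with no other change to the application.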


The above behavior is not seen with other drivers, i.e., packet drops stop once the CPU load subsides.
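
For completeness, this is roughly how we observe the counter (a sketch of our monitoring loop; the polling period and the "stuck" criterion are ours, not part of DPDK):

```c
#include <inttypes.h>
#include <stdio.h>
#include <rte_ethdev.h>

/* Sketch of our stats polling: a steadily growing imissed delta
 * while the core is otherwise lightly loaded is what we consider
 * the "stuck" state described above. */
static void
check_missed(uint16_t port_id, uint64_t *prev_imissed)
{
	struct rte_eth_stats stats;

	if (rte_eth_stats_get(port_id, &stats) != 0)
		return;

	if (stats.imissed > *prev_imissed)
		printf("port %u: imissed grew by %" PRIu64 "\n",
		       port_id, stats.imissed - *prev_imissed);
	*prev_imissed = stats.imissed;
}
```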


Has anyone else seen the above issue with the X540-AT2 card?


Thanks,

Vinay Purohit

CloudJuncxion, Inc.