https://bugs.dpdk.org/show_bug.cgi?id=627

Bug ID: 627
Summary: mlx5 / DPDK 20.11: severe RSS misbehavior and performance drop with libibverbs 29
Product: DPDK
Version: 20.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: major
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: martin.weiser@allegro-packets.com
Target Milestone: ---

After updating to DPDK v20.11 we noticed a dramatic performance drop and RSS misbehavior with our ConnectX-5 NICs. Some of the rx queues receive no traffic at all, while others receive only a fraction of what they should. Overall, receive performance dropped by around 90%.

We were able to determine that this only happens when the default mlx5_rx_burst path is used; the issue does not appear when a vectorized rx function is used. We are forced onto the default path because we use multi-segment mbufs to handle jumbo frames, which automatically disables the vectorized rx path in mlx5.

After digging a bit deeper we identified the following commit as the cause:

54c2d46b160f8ad0bff0977812bf871ca5dd8241: net/mlx5: support flow tag and packet header miniCQEs

Reverting this commit restored the original performance.

However, on a test system running testpmd we were initially unable to reproduce the behavior. It took us quite some time to identify the reason for the difference: the test system has libibverbs 33 installed, and that version provides the symbol mlx5dv_dr_action_create_dest_devx_tir in infiniband/mlx5dv.h. Its presence causes the meson build to define HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR, which in turn causes config->dest_tir to be set in the driver. The whole driver appears to behave quite differently when this flag is set. Our regular system has libibverbs 29, which lacks the aforementioned symbol; testpmd compiled on that system shows the same misbehavior, but only when forced to use the non-vectorized mlx5_rx_burst path.
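To check which code path a given system will build, one can test for the symbol the same way the meson probe effectively does: by looking for mlx5dv_dr_action_create_dest_devx_tir in the installed mlx5dv.h. The sketch below is illustrative, not the actual meson check; the header path /usr/include/infiniband/mlx5dv.h is the usual rdma-core install location and may differ on your distribution.

```python
import os

# Usual rdma-core header location (assumption; adjust for your distro).
HDR = "/usr/include/infiniband/mlx5dv.h"
SYMBOL = "mlx5dv_dr_action_create_dest_devx_tir"

def devx_tir_available(header_path: str = HDR) -> bool:
    """Return True if the libibverbs headers declare the DevX TIR symbol.

    When the symbol is present (libibverbs >= 33), the DPDK meson build
    defines HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR and the mlx5 driver sets
    config->dest_tir; when absent (e.g. libibverbs 29), the driver takes
    the other path, where the reported RSS misbehavior appears.
    """
    if not os.path.exists(header_path):
        return False
    with open(header_path, "r", errors="replace") as f:
        return SYMBOL in f.read()

status = "present" if devx_tir_available() else "absent"
print(status)
```

On the affected libibverbs 29 system this prints "absent"; on the libibverbs 33 test system it prints "present".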
We used the following testpmd command line to reproduce the behavior: `./app/dpdk-testpmd -n 4 --legacy-mem -w 81:00.0 -w 81:00.1 -l 1,2,3,4,5 -- --total-num-mbufs=2000000 --nb-cores=4 --rxq=4 --txq=4 --max-pkt-len=15360 --rx-offloads=0x12800 --mbuf-size=2331 --rxd=4096`
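For reference, healthy RSS with the `--rxq=4` setting above should spread flows roughly evenly across the four rx queues. The following is a minimal, generic sketch of how an RSS indirection table (RETA) maps hashes to queues; it uses uniformly random hashes as a stand-in for real Toeplitz output, and the RETA size of 128 is a typical value, not a ConnectX-5 specific one.

```python
import random

NB_RXQ = 4       # matches --rxq=4 in the testpmd invocation above
RETA_SIZE = 128  # typical indirection-table size (assumption, device-specific)

# Default RETA: hash buckets assigned round-robin across the rx queues.
reta = [i % NB_RXQ for i in range(RETA_SIZE)]

def queue_for(rss_hash: int) -> int:
    """Map a 32-bit RSS hash to an rx queue via the indirection table."""
    return reta[rss_hash % RETA_SIZE]

# Simulate many flows with uniform random hashes (stand-in for Toeplitz
# output over varying 5-tuples) and count packets per queue.
rng = random.Random(0)
counts = [0] * NB_RXQ
for _ in range(100_000):
    counts[queue_for(rng.getrandbits(32))] += 1

print(counts)  # healthy RSS: each of the 4 queues gets roughly 25,000
```

In the failure described above, the observed distribution instead looks like some queues at zero and others far below their expected share, which is what pointed us at the rx path rather than at the flow configuration.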