Dear DPDK Community/Maintainers,

I am writing to consult about a technical issue with the Virtio network driver (located in drivers/net/virtio). During our testing of the packed queue feature with desc_event_flags enabled, Testpmd consistently crashes after running for a short period. Below is a detailed description of the scenario, observation, root cause analysis, and a proposed fix. We hope to get clarification on whether this is a usage error or a potential driver issue.

1. Scenario

2. Observation

After Testpmd runs for several minutes, it crashes unexpectedly. Debugging shows that the content of vq_descx (within struct virtqueue) is being modified unexpectedly, likely due to address mismatches in the queue memory layout.

3. Command Used to Reproduce

testpmd -c 0xff -n 4 --huge-dir=/mnt/huge_dpdk --socket-mem=1024 --socket-limit=1024 -w 0000:15:00.1 --file-prefix=test3 -d /usr/lib64/librte_pmd_virtio.so \
        -- --total-num-mbufs=115200 --rxq=4 --txq=4 --forward-mode=txonly --nb-cores=4 --stats-period 1 --burst=512 --rxd=512 --txd=512 --eth-peer=0,10:70:fd:2a:60:39

4. Suspected Root Cause

The Virtio driver uses inconsistent address calculation logic for two critical steps:
  1. When informing the hardware (via the modern_setup_queue interface) of the physical addresses for the driver and device regions of the packed queue.
  2. When calculating the virtual addresses of the driver and device regions for the driver's own use (via vring_init_packed).
This mismatch leads to the hardware and the driver referencing different memory regions for the device queue, causing unintended overwrites of vq_descx.

5. Detailed Analysis

5.1 Address Calculation in modern_setup_queue (Hardware-facing)

The modern_setup_queue function configures the queue addresses and passes them to the hardware. For packed queues, it calculates desc_addr, avail_addr (driver region), and used_addr (device region) as follows:

Key parameters for our test:
Example calculation (assuming desc_addr = 0x0):
  1. avail_addr = 0x0 + (0x1000 * 0x10) = 0x10000 (driver region address passed to hardware)
  2. used_addr = RTE_ALIGN_CEIL(0x10000 + 4 + (0x1000 * 2), 0x1000) = RTE_ALIGN_CEIL(0x12004, 0x1000) = 0x13000 (device region address passed to hardware)
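
The arithmetic above can be reproduced with a small standalone sketch. It does not include any DPDK headers: ALIGN_CEIL is a stand-in for RTE_ALIGN_CEIL, the helper names driver_region_addr and split_style_used_addr are hypothetical, and the "4 bytes plus 2 bytes per entry" term is taken directly from the example figures, so this is only an illustration under those assumptions and not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for RTE_ALIGN_CEIL (assumption); align must be a power of two. */
#define ALIGN_CEIL(v, a) \
    (((uint64_t)(v) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Driver region: starts right after num descriptors of desc_size bytes each. */
static uint64_t driver_region_addr(uint64_t desc_addr, uint64_t num,
                                   uint64_t desc_size)
{
    return desc_addr + num * desc_size;
}

/*
 * Device region as computed in the example: skip a 4-byte header plus
 * one 16-bit entry per descriptor, then round up to the alignment.
 */
static uint64_t split_style_used_addr(uint64_t avail_addr, uint64_t num,
                                      uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ALIGN_CEIL(avail_addr + 4 + num * 2, align);
}
```

With desc_addr = 0x0, 0x1000 entries, 0x10-byte descriptors and 0x1000 alignment, driver_region_addr gives 0x10000 and split_style_used_addr gives 0x13000, matching the two values listed above.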

5.2 Address Calculation in vring_init_packed (Driver-internal)

The vring_init_packed function calculates the driver-internal virtual addresses for the packed queue's driver and device regions:
Example calculation (same base address: desc_addr = 0x0, so p = 0x0):
  1. vr->driver = 0x0 + (0x1000 * 0x10) = 0x10000 (matches avail_addr from modern_setup_queue)
  2. vr->device = RTE_ALIGN_CEIL(0x10000 + sizeof(struct vring_packed_desc_event), 0x1000)
    • Assuming sizeof(struct vring_packed_desc_event) = 4 (standard definition), this becomes RTE_ALIGN_CEIL(0x10004, 0x1000) = 0x11000 (driver-internal device region address)
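
The driver-internal calculation can be sketched the same way. This is a standalone illustration, not the DPDK source: ALIGN_CEIL stands in for RTE_ALIGN_CEIL, struct desc_event is an assumed stand-in for struct vring_packed_desc_event (two 16-bit fields, which is what gives the size of 4 assumed above), and packed_device_addr is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for RTE_ALIGN_CEIL (assumption); align must be a power of two. */
#define ALIGN_CEIL(v, a) \
    (((uint64_t)(v) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Assumed stand-in for struct vring_packed_desc_event: two 16-bit fields. */
struct desc_event {
    uint16_t desc_event_off_wrap;
    uint16_t desc_event_flags;
};

/*
 * Device region for a packed ring: skip the single event structure that
 * makes up the driver region, then round up to the alignment.
 */
static uint64_t packed_device_addr(uint64_t driver_addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ALIGN_CEIL(driver_addr + sizeof(struct desc_event), align);
}
```

For a driver region at 0x10000 and 0x1000 alignment, packed_device_addr returns 0x11000, the value derived in step 2.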

5.3 Critical Mismatch

For the same queue, the device region address passed to the hardware by modern_setup_queue is 0x13000, whereas the device region address the driver uses internally (vr->device from vring_init_packed) is 0x11000. The two differ by 0x2000 bytes, so the hardware and the driver do not agree on where the device region is located.

6. Modification Applied to Fix the Issue

We modified modern_setup_queue to use the same address logic as vring_init_packed for packed queues. After this change, Testpmd runs stably without crashes:
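
The diff itself does not appear in this text. Purely as an illustration of the idea, and not the actual patch, the standalone sketch below selects the device-region rule by ring type. The helper name device_region_addr, its parameters, and the ALIGN_CEIL macro (a stand-in for RTE_ALIGN_CEIL) are all hypothetical; the constants come from the example calculations in section 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for RTE_ALIGN_CEIL (assumption); align must be a power of two. */
#define ALIGN_CEIL(v, a) \
    (((uint64_t)(v) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/*
 * Hypothetical helper, not a DPDK function: returns the device region
 * address that follows the driver region, depending on the ring type.
 */
static uint64_t device_region_addr(uint64_t driver_addr, uint64_t num,
                                   uint64_t align, bool packed)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (packed) {
        /* Packed ring: the driver region is one 4-byte event structure. */
        return ALIGN_CEIL(driver_addr + 4, align);
    }
    /* Split-ring rule from 5.1: 4-byte header plus num 16-bit entries. */
    return ALIGN_CEIL(driver_addr + 4 + num * 2, align);
}
```

With the example parameters (driver region at 0x10000, 0x1000 entries, 0x1000 alignment), the packed branch yields 0x11000, which agrees with vring_init_packed, while the other branch yields the 0x13000 value from section 5.1.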

7. Reference from Virtio-User/Vhost-User

We noticed that the virtio-user and vhost-user drivers already use the same logic as our modified modern_setup_queue for packed queues. For example, in virtio_user_setup_queue_packed:

8. Question for Clarification

Therefore, for packed queues, why do modern_setup_queue and vring_init_packed use different logic to calculate the device region address?

Is this inconsistency due to incorrect usage on my part, or are there special considerations specific to packed queue mode (e.g., hardware compatibility, protocol requirements)?
We would greatly appreciate your help in clarifying this confusion. Thank you!

Best regards,

[A DPDK user and developer].



1104121601@qq.com