* [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst
@ 2025-08-27 0:52 Doraemon
2025-08-28 14:45 ` Stephen Hemminger
2025-08-28 15:57 ` Bruce Richardson
0 siblings, 2 replies; 3+ messages in thread
From: Doraemon @ 2025-08-27 0:52 UTC (permalink / raw)
To: dev
Hello DPDK / net_ice maintainers,
We are seeing a reproducible and concerning issue when using the net_ice PMD with DPDK 22.11.2, and we would appreciate your help diagnosing it.
Summary
- Environment:
- DPDK: 22.11.2
- net_ice PCI device: 8086:159b
- ice kernel driver: 1.12.7
- NIC firmware: FW 7.3.6111681 (NVM 4.30)
- IOVA mode: PA, VFIO enabled
- Multi-process socket: /var/run/dpdk/PGW/mp_socket
- NUMA: 2, detected lcores: 112
- Bonding: pmd_bond with bonded devices created (net_bonding0 on port 4, net_bonding1 on port 5)
- Driver enabled AVX2 OFFLOAD Vector Tx (log shows "ice_set_tx_function(): Using AVX2 OFFLOAD Vector Tx")
- Problem statement:
- Our application calls rte_eth_tx_prepare before calling rte_eth_tx_burst as part of the normal transmission path.
- After the application has been running for some time (not immediate), the kernel/driver emits the following messages repeatedly:
- ice_interrupt_handler(): OICR: MDD event
- ice_interrupt_handler(): Malicious Driver Detection event 3 by TCLAN on TX queue 1025 PF# 1
- We are using a single TX queue (application-level single queue) and are sending only one packet per burst (burst size = 1).
- The sequence is: rte_eth_tx_prepare returns -> rte_eth_tx_burst -> MDD events occur some time later.
- The events affect stability and repeat over time.
Relevant startup logs (excerpt)
EAL: Detected CPU lcores: 112
EAL: Detected NUMA nodes: 2
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:3b:00.1 (socket 0)
ice_load_pkg_type(): Active package is: 1.3.45.0, ICE COMMS Package (double VLAN mode)
ice_dev_init(): FW 7.3.6111681 API 1.7
...
bond_probe(3506) - Initializing pmd_bond for net_bonding0
bond_probe(3592) - Create bonded device net_bonding0 on port 4 in mode 1 on socket 0.
...
ice_set_tx_function(): Using AVX2 OFFLOAD Vector Tx (port 0).
TELEMETRY: No legacy callbacks, legacy socket not created
What we have tried / preliminary observations
- Confirmed application calls rte_eth_tx_prepare prior to rte_eth_tx_burst.
- Confirmed single TX queue configuration and small bursts (size = 1); not high-rate, not a typical high-burst/malicious pattern.
- The MDD log identifies "TX queue 1025"; unclear how that maps to our DPDK queue numbering (we use queue 0 in the app).
- No obvious other DPDK errors at startup; interface initializes normally and vector TX is enabled.
- We suspect the driver's Malicious Driver Detection (MDD) is triggering due to some descriptor/doorbell ordering or offload interaction, possibly related to AVX2 Vector Tx offload.
Questions / requests to the maintainers
1. What specifically triggers "MDD event 3 by TCLAN" in net_ice? Which driver check/threshold corresponds to event type 3?
2. How is the "TX queue 1025" value computed/mapped in the log? (Is it queue id + offset, VF mapping, or an internal vector id?) We need to map that log value to our DPDK queue index.
3. Can the rte_eth_tx_prepare + rte_eth_tx_burst call pattern cause MDD detections under any circumstances? If so, are there recommended usage patterns or ordering constraints to avoid false positives?
4. Are there known firmware/driver/DPDK version combinations with similar MDD behavior? Do you recommend specific NIC firmware, kernel driver, or DPDK versions as a workaround/fix?
5. Any suggested workarounds we can test quickly (e.g., disable vector TX offload, disable specific HW offloads, change interrupt/queue bindings, or adjust doorbell behavior)?
Best regards.
* Re: [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst
2025-08-27 0:52 [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst Doraemon
@ 2025-08-28 14:45 ` Stephen Hemminger
2025-08-28 15:57 ` Bruce Richardson
1 sibling, 0 replies; 3+ messages in thread
From: Stephen Hemminger @ 2025-08-28 14:45 UTC (permalink / raw)
To: Doraemon; +Cc: dev
On Wed, 27 Aug 2025 08:52:26 +0800
"Doraemon" <nobitanobi@qq.com> wrote:
Did you make sure that the source MAC address of the packet matches the MAC address of the VF?
* Re: [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst
2025-08-27 0:52 [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst Doraemon
2025-08-28 14:45 ` Stephen Hemminger
@ 2025-08-28 15:57 ` Bruce Richardson
1 sibling, 0 replies; 3+ messages in thread
From: Bruce Richardson @ 2025-08-28 15:57 UTC (permalink / raw)
To: Doraemon; +Cc: dev
On Wed, Aug 27, 2025 at 08:52:26AM +0800, Doraemon wrote:
While I've not come across this particular issue before, one immediate
suggestion is to try the latest point release of 22.11, updating from
22.11.2 to 22.11.9. Checking the diffs, I see that there were some changes
made to the ice_prep_pkts() function between those two releases. Those
changes may help here.
Regards,
/Bruce