From: Stephen Hemminger <stephen@networkplumber.org>
To: "Doraemon" <nobitanobi@qq.com>
Cc: "dev" <dev@dpdk.org>
Subject: Re: [Help needed] net_ice: MDD event (Malicious Driver Detection) on TX queue when using rte_eth_tx_prepare / rte_eth_tx_burst
Date: Thu, 28 Aug 2025 07:45:41 -0700
Message-ID: <20250828074541.2996da5a@hermes.local>
In-Reply-To: <tencent_F6CFD7AE41985DA5FB835CDE833F44C6AF06@qq.com>
On Wed, 27 Aug 2025 08:52:26 +0800
"Doraemon" <nobitanobi@qq.com> wrote:
> Hello DPDK / net_ice maintainers,
>
>
> We are seeing a reproducible and concerning issue when using the net_ice PMD with DPDK 22.11.2, and we would appreciate your help diagnosing it.
>
>
> Summary
> - Environment:
> - DPDK: 22.11.2
> - net_ice PCI device: 8086:159b
> - ice kernel driver: 1.12.7
> - NIC firmware: FW 7.3.6111681 (NVM 4.30)
> - IOVA mode: PA, VFIO enabled
> - Multi-process socket: /var/run/dpdk/PGW/mp_socket
> - NUMA: 2, detected lcores: 112
> - Bonding: pmd_bond with bonded devices created (net_bonding0 on port 4, net_bonding1 on port 5)
> - Driver enabled AVX2 OFFLOAD Vector Tx (log shows "ice_set_tx_function(): Using AVX2 OFFLOAD Vector Tx")
>
>
> - Problem statement:
> - Our application calls rte_eth_tx_prepare before calling rte_eth_tx_burst as part of the normal transmission path.
> - After the application has been running for some time (not immediate), the kernel/driver emits the following messages repeatedly:
> - ice_interrupt_handler(): OICR: MDD event
> - ice_interrupt_handler(): Malicious Driver Detection event 3 by TCLAN on TX queue 1025 PF# 1
> - We are using a single TX queue (application-level single queue) and are sending only one packet per burst (burst size = 1).
> - The sequence is: rte_eth_tx_prepare (returns) -> rte_eth_tx_burst -> MDD events occur later.
> - The events affect stability and repeat over time.
>
>
> Relevant startup logs (excerpt)
> EAL: Detected CPU lcores: 112
> EAL: Detected NUMA nodes: 2
> EAL: Selected IOVA mode 'PA'
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:3b:00.1 (socket 0)
> ice_load_pkg_type(): Active package is: 1.3.45.0, ICE COMMS Package (double VLAN mode)
> ice_dev_init(): FW 7.3.6111681 API 1.7
> ...
> bond_probe(3506) - Initializing pmd_bond for net_bonding0
> bond_probe(3592) - Create bonded device net_bonding0 on port 4 in mode 1 on socket 0.
> ...
> ice_set_tx_function(): Using AVX2 OFFLOAD Vector Tx (port 0).
> TELEMETRY: No legacy callbacks, legacy socket not created
>
>
> What we have tried / preliminary observations
> - Confirmed application calls rte_eth_tx_prepare prior to rte_eth_tx_burst.
> - Confirmed single TX queue configuration and small bursts (size = 1) — not high-rate, not a typical high-burst/malicious pattern.
> - The MDD log identifies "TX queue 1025"; unclear how that maps to our DPDK queue numbering (we use queue 0 in the app).
> - No obvious other DPDK errors at startup; interface initializes normally and vector TX is enabled.
> - We suspect the driver's Malicious Driver Detection (MDD) is triggering due to some descriptor/doorbell ordering or offload interaction, possibly related to AVX2 Vector Tx offload.
>
>
> Questions / requests to the maintainers
> 1. What specifically triggers "MDD event 3 by TCLAN" in net_ice? Which driver check/threshold corresponds to event type 3?
> 2. How is the "TX queue 1025" value computed/mapped in the log? (Is it queue id + offset, VF mapping, or an internal vector id?) We need to map that log value to our DPDK queue index.
> 3. Can the rte_eth_tx_prepare + rte_eth_tx_burst call pattern cause MDD detections under any circumstances? If so, are there recommended usage patterns or ordering constraints to avoid false positives?
> 4. Are there known firmware/driver/DPDK version combinations with similar MDD behavior? Do you recommend specific NIC firmware, kernel driver, or DPDK versions as a workaround/fix?
> 5. Any suggested workarounds we can test quickly (e.g., disable vector TX offload, disable specific HW offloads, change interrupt/queue bindings, or adjust doorbell behavior)?
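> For reference, a minimal sketch of the call pattern described above (single queue, burst size 1; `port_id`, `queue_id`, and `send_one` are placeholders, and real code has more error handling):

```c
#include <stdio.h>

#include <rte_ethdev.h>
#include <rte_errno.h>
#include <rte_mbuf.h>

/* Send one mbuf on (port_id, queue_id), mirroring the sequence above:
 * rte_eth_tx_prepare() first, then rte_eth_tx_burst(). */
static void
send_one(uint16_t port_id, uint16_t queue_id, struct rte_mbuf *m)
{
	/* tx_prepare validates/fixes up offload metadata for this PMD.
	 * A return of 0 means the packet was rejected; rte_errno says
	 * why (e.g. EINVAL for an unsupported offload request). */
	if (rte_eth_tx_prepare(port_id, queue_id, &m, 1) < 1) {
		printf("tx_prepare rejected packet: %s\n",
		       rte_strerror(rte_errno));
		rte_pktmbuf_free(m);
		return;
	}

	/* Burst of one; on a full ring nothing is sent, so free it. */
	if (rte_eth_tx_burst(port_id, queue_id, &m, 1) < 1)
		rte_pktmbuf_free(m);
}
```

> (If the prepare return value were ignored, a packet with invalid offload metadata could still reach the hardware; that is one scenario we wanted to rule out.)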
>
>
>
>
> Best regards.
Did you make sure that the source address of the packet matches the MAC address of the VF?
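One quick isolation test: rule out the AVX2 vector TX path by capping the SIMD width at startup. The EAL accepts --force-max-simd-bitwidth (since 20.11); 64 forces the scalar path for all PMDs. Sketch using testpmd and the device from your log (adapt cores/args to your setup):

```shell
# If the MDD events stop with the scalar TX path, the AVX2
# OFFLOAD Vector Tx path is implicated.
dpdk-testpmd -l 0-3 -a 0000:3b:00.1 --force-max-simd-bitwidth=64 -- -i
```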
Thread overview: 3+ messages
2025-08-27  0:52 Doraemon
2025-08-28 14:45 ` Stephen Hemminger [this message]
2025-08-28 15:57   ` Bruce Richardson