* [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close
@ 2025-11-06 16:38 Hemant Agrawal
2025-11-06 16:38 ` [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues Hemant Agrawal
2025-11-07 8:32 ` [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close David Marchand
0 siblings, 2 replies; 6+ messages in thread
From: Hemant Agrawal @ 2025-11-06 16:38 UTC (permalink / raw)
To: dev; +Cc: stephen, Hemant Agrawal, sachin.saxena, stable
When rte_eth_dev_close() is called, it performs the following actions:
Calls dev->dev_ops->dev_close(), which in this case is dpaa2_dev_close().
Then calls rte_eth_dev_release_port(), which releases all device data
and sets dev->data to NULL.
Later, when rte_dev_remove() is called, the FSLMC bus invokes
dev->remove() — that is, rte_dpaa2_remove().
However, rte_dpaa2_remove() calls dpaa2_dev_close() again. Since dev->data
was already set to NULL by the previous call, this second invocation
causes a crash.
Fixes: 5964d36a2904 ("net/dpaa2: release port upon close")
Cc: sachin.saxena@nxp.com
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
drivers/net/dpaa2/dpaa2_ethdev.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 7da32ce856..f3db7982a4 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -3347,14 +3347,17 @@ static int
rte_dpaa2_remove(struct rte_dpaa2_device *dpaa2_dev)
{
struct rte_eth_dev *eth_dev;
- int ret;
+ int ret = 0;
eth_dev = dpaa2_dev->eth_dev;
- dpaa2_dev_close(eth_dev);
+ if (eth_dev->data) {
+ ret = dpaa2_dev_close(eth_dev);
+ if (!ret)
+ ret = rte_eth_dev_release_port(eth_dev);
+ }
dpaa2_valid_dev--;
if (!dpaa2_valid_dev)
rte_mempool_free(dpaa2_tx_sg_pool);
- ret = rte_eth_dev_release_port(eth_dev);
return ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 6+ messages in thread
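The close-then-remove sequence described in the commit message above can be modeled in isolation. What follows is only a minimal sketch, not the driver code: every `mock_*` name is a hypothetical stand-in invented for illustration, and the only behavior taken from the thread is that the release step leaves `dev->data` set to NULL and that the remove path then checks it before closing again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ethdev objects; NOT the DPDK definitions. */
struct mock_eth_dev_data { int started; };
struct mock_eth_dev { struct mock_eth_dev_data *data; };

static int close_calls; /* how many times the mock close callback really ran */

/* Mock of the driver close callback: it dereferences dev->data. */
static int mock_dev_close(struct mock_eth_dev *dev)
{
	assert(dev->data != NULL); /* an unguarded second call would fault here */
	dev->data->started = 0;
	close_calls++;
	return 0;
}

/* Mock of the port release step: afterwards dev->data is NULL. */
static int mock_release_port(struct mock_eth_dev *dev)
{
	dev->data = NULL;
	return 0;
}

/* Sequence the commit message attributes to rte_eth_dev_close(). */
static int mock_eth_dev_close(struct mock_eth_dev *dev)
{
	int ret = mock_dev_close(dev);

	if (ret)
		return ret;
	return mock_release_port(dev);
}

/* Guarded remove, following the shape of the patch: close only if data exists. */
static int mock_remove(struct mock_eth_dev *dev)
{
	int ret = 0;

	if (dev->data) {
		ret = mock_dev_close(dev);
		if (!ret)
			ret = mock_release_port(dev);
	}
	return ret;
}
```

Calling `mock_eth_dev_close()` and then `mock_remove()` on the same object runs the close callback once; without the `dev->data` check the second call would hit the assertion, which plays the role of the crash.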
* [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues
2025-11-06 16:38 [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close Hemant Agrawal
@ 2025-11-06 16:38 ` Hemant Agrawal
2025-11-06 19:29 ` Stephen Hemminger
2025-11-07 8:34 ` David Marchand
2025-11-07 8:32 ` [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close David Marchand
1 sibling, 2 replies; 6+ messages in thread
From: Hemant Agrawal @ 2025-11-06 16:38 UTC (permalink / raw)
To: dev; +Cc: stephen, Maxime Leroy, jun.yang, stable
From: Maxime Leroy <maxime@leroys.fr>
When using the prefetch Rx path (dpaa2_dev_prefetch_rx), the driver keeps
track of one outstanding VDQCR command per DPIO portal in the global
rte_global_active_dqs_list[] array. Each queue_storage_info_t also stores
the active result buffer and portal index:
qs->active_dqs
qs->active_dpio_id
Before issuing a new pull command, dpaa2_dev_prefetch_rx() checks for an
active entry and spins on qbman_check_command_complete() until the
corresponding VDQCR completes.
On port close / hotplug remove, dpaa2_free_rx_tx_queues() frees all
per-lcore queue_storage_info_t structures and their dq_storage[] buffers,
but never clears the global rte_global_active_dqs_list[] entries. After a
detach/attach sequence (or "del/add" in grout), the prefetch Rx path
still sees an active entry for the portal and spins forever on a stale dq
buffer that has been freed and will never be completed by hardware. In
gdb, dq->dq.tok stays 0 and dpaa2_dev_prefetch_rx() loops in:
while (!qbman_check_command_complete(get_swp_active_dqs(idx)))
;
Fix this by clearing the active VDQ state before freeing queue storage.
For each Rx queue and lcore, if qs->active_dqs is non-NULL, call
clear_swp_active_dqs(qs->active_dpio_id) and set qs->active_dqs to NULL.
Then dpaa2_queue_storage_free() can safely free q_storage and
dq_storage[].
After this change, a DPNI detach/attach sequence no longer leaves stale
entries in rte_global_active_dqs_list[], and the prefetch Rx loop does
not hang waiting for a completion from a previous device instance.
Reproduction:
- grout:
grcli interface add port dpni.1 devargs fslmc:dpni.1
grcli interface del dpni.1
grcli interface add port dpni.1 devargs fslmc:dpni.1
-> Rx was stuck in qbman_check_command_complete(), now works.
- testpmd:
dpdk-testpmd -n1 -a fslmc:dpni.65535 -- -i --forward-mode=rxonly
testpmd> port attach fslmc:dpni.1
testpmd> port start all
testpmd> start
testpmd> stop
testpmd> port stop all
testpmd> port detach 0
testpmd> port attach fslmc:dpni.1
testpmd> port start all
testpmd> start
-> Rx was hanging, now runs normally
Fixes: 12d98eceb8ac ("bus/fslmc: enhance QBMAN DQ storage logic")
Cc: jun.yang@nxp.com
Cc: stable@dpdk.org
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
---
.mailmap | 1 +
drivers/net/dpaa2/dpaa2_ethdev.c | 19 +++++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/.mailmap b/.mailmap
index 10c37a97a6..1f540f7f51 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1036,6 +1036,7 @@ Mauro Annarumma <mauroannarumma@hotmail.it>
Maxime Coquelin <maxime.coquelin@redhat.com>
Maxime Gouin <maxime.gouin@6wind.com>
Maxime Leroy <maxime.leroy@6wind.com>
+Maxime Leroy <maxime@leroys.fr>
Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Megha Ajmera <megha.ajmera@intel.com>
Meijuan Zhao <meijuanx.zhao@intel.com>
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index f3db7982a4..3c18d58804 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -631,6 +631,24 @@ dpaa2_alloc_rx_tx_queues(struct rte_eth_dev *dev)
return ret;
}
+static void
+dpaa2_clear_queue_active_dps(struct dpaa2_queue *q, int num_lcores)
+{
+ int i;
+
+ for (i = 0; i < num_lcores; i++) {
+ struct queue_storage_info_t *qs = q->q_storage[i];
+
+ if (!qs)
+ continue;
+
+ if (qs->active_dqs) {
+ clear_swp_active_dqs(qs->active_dpio_id);
+ qs->active_dqs = NULL;
+ }
+ }
+}
+
static void
dpaa2_free_rx_tx_queues(struct rte_eth_dev *dev)
{
@@ -645,6 +663,7 @@ dpaa2_free_rx_tx_queues(struct rte_eth_dev *dev)
/* cleaning up queue storage */
for (i = 0; i < priv->nb_rx_queues; i++) {
dpaa2_q = priv->rx_vq[i];
+ dpaa2_clear_queue_active_dps(dpaa2_q, RTE_MAX_LCORE);
dpaa2_queue_storage_free(dpaa2_q,
RTE_MAX_LCORE);
}
--
2.25.1
^ permalink raw reply [flat|nested] 6+ messages in thread
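The stale-entry problem described in patch 2/3 can be illustrated with a small model. This is a sketch under stated assumptions, not the driver's data structures: all `mock_*` names, the portal count and the lcore count are hypothetical. It only shows the ordering the commit message argues for, which is to drop the per-portal global entry before the buffer it points to is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MOCK_NUM_PORTALS 4 /* hypothetical portal count */
#define MOCK_NUM_LCORES  4 /* hypothetical stand-in for RTE_MAX_LCORE */

struct mock_dq { int tok; };

/* One outstanding pull command per portal, like the global list in the patch. */
static struct mock_dq *mock_active_dqs[MOCK_NUM_PORTALS];

struct mock_qs {
	struct mock_dq *active_dqs; /* result buffer of the outstanding pull */
	int active_dpio_id;         /* portal that owns the outstanding pull */
	struct mock_dq *dq_storage; /* buffer that is freed on close */
};

struct mock_queue { struct mock_qs *q_storage[MOCK_NUM_LCORES]; };

/* Simulate the Rx path issuing a pull on a portal for one lcore slot. */
static int mock_issue_pull(struct mock_queue *q, int lcore, int portal)
{
	struct mock_qs *qs;

	assert(lcore >= 0 && lcore < MOCK_NUM_LCORES);
	assert(portal >= 0 && portal < MOCK_NUM_PORTALS);
	qs = calloc(1, sizeof(*qs));
	if (!qs)
		return -1;
	qs->dq_storage = calloc(1, sizeof(*qs->dq_storage));
	if (!qs->dq_storage) {
		free(qs);
		return -1;
	}
	qs->active_dqs = qs->dq_storage;
	qs->active_dpio_id = portal;
	mock_active_dqs[portal] = qs->dq_storage;
	q->q_storage[lcore] = qs;
	return 0;
}

/* Clear the global entry first, then free the storage it pointed to. */
static void mock_clear_and_free(struct mock_queue *q)
{
	for (int i = 0; i < MOCK_NUM_LCORES; i++) {
		struct mock_qs *qs = q->q_storage[i];

		if (!qs)
			continue;
		if (qs->active_dqs) {
			mock_active_dqs[qs->active_dpio_id] = NULL;
			qs->active_dqs = NULL;
		}
		free(qs->dq_storage);
		free(qs);
		q->q_storage[i] = NULL;
	}
}

/* Would a new pull on this portal have to wait for an older command? */
static int mock_portal_busy(int portal)
{
	return mock_active_dqs[portal] != NULL;
}
```

If the clearing step were skipped, `mock_portal_busy()` would keep reporting an outstanding command whose buffer is already freed, which corresponds to the endless wait described in the commit message.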
* Re: [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues
2025-11-06 16:38 ` [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues Hemant Agrawal
@ 2025-11-06 19:29 ` Stephen Hemminger
[not found] ` <CAHHRULVJe45=gNq1in6eHo5yEq-+QguxeDGfVHeC8D3KgMDdqQ@mail.gmail.com>
2025-11-07 8:34 ` David Marchand
1 sibling, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2025-11-06 19:29 UTC (permalink / raw)
To: Hemant Agrawal; +Cc: dev, Maxime Leroy, jun.yang, stable
On Thu, 6 Nov 2025 22:08:06 +0530
Hemant Agrawal <hemant.agrawal@nxp.com> wrote:
> +static void
> +dpaa2_clear_queue_active_dps(struct dpaa2_queue *q, int num_lcores)
> +{
> + int i;
> +
> + for (i = 0; i < num_lcores; i++) {
> + struct queue_storage_info_t *qs = q->q_storage[i];
> +
> + if (!qs)
> + continue;
> +
> + if (qs->active_dqs) {
> + clear_swp_active_dqs(qs->active_dpio_id);
> + qs->active_dqs = NULL;
> + }
> + }
> +}
> +
Why not use RTE_LCORE_FOREACH() here?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close
2025-11-06 16:38 [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close Hemant Agrawal
2025-11-06 16:38 ` [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues Hemant Agrawal
@ 2025-11-07 8:32 ` David Marchand
1 sibling, 0 replies; 6+ messages in thread
From: David Marchand @ 2025-11-07 8:32 UTC (permalink / raw)
To: Hemant Agrawal; +Cc: dev, stephen, sachin.saxena, stable, maxime
Hello,
On Thu, 6 Nov 2025 at 17:38, Hemant Agrawal <hemant.agrawal@nxp.com> wrote:
>
> When rte_eth_dev_close() is called, it performs the following actions:
>
> Calls dev->dev_ops->dev_close(), which in this case is dpaa2_dev_close().
> Then calls rte_eth_dev_release_port(), which releases all device data
> and sets dev->data to NULL.
>
> Later, when rte_dev_remove() is called, the FSLMC bus invokes
> dev->remove() — that is, rte_dpaa2_remove().
> However, rte_dpaa2_remove() calls dpaa2_dev_close() again. Since dev->data
> was already set to NULL by the previous call, this second invocation
> causes a crash.
>
> Fixes: 5964d36a2904 ("net/dpaa2: release port upon close")
> Cc: sachin.saxena@nxp.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
> drivers/net/dpaa2/dpaa2_ethdev.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
> index 7da32ce856..f3db7982a4 100644
> --- a/drivers/net/dpaa2/dpaa2_ethdev.c
> +++ b/drivers/net/dpaa2/dpaa2_ethdev.c
> @@ -3347,14 +3347,17 @@ static int
> rte_dpaa2_remove(struct rte_dpaa2_device *dpaa2_dev)
> {
> struct rte_eth_dev *eth_dev;
> - int ret;
> + int ret = 0;
>
> eth_dev = dpaa2_dev->eth_dev;
Having a back reference of the "class" object in a "device" object
seems wrong to me (and there is a dev_priv->eth_dev too...).
It breaks the separation that was introduced with rte_device years ago.
I did not look in detail, but it seems strange that after closing a
first time, there would still remain a reference of the eth_dev object
in the dpaa2 device object.
At least, it would be worth double checking that the
dpaa2_dev->eth_dev is cleared in dpaa2_dev_close.
> - dpaa2_dev_close(eth_dev);
> + if (eth_dev->data) {
> + ret = dpaa2_dev_close(eth_dev);
> + if (!ret)
> + ret = rte_eth_dev_release_port(eth_dev);
I don't see why you would need to decrement dpaa2_valid_dev again below.
Maybe a missing return here?
> + }
> dpaa2_valid_dev--;
> if (!dpaa2_valid_dev)
> rte_mempool_free(dpaa2_tx_sg_pool);
> - ret = rte_eth_dev_release_port(eth_dev);
>
> return ret;
> }
Taking a step back, the issue this patch wants to fix is a pattern
that is resolved by other drivers by checking if a eth_dev is
allocated for a rte_device.
A simpler (untested) fix seems to be:
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 7da32ce856..6682a72341 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -3349,7 +3349,10 @@ rte_dpaa2_remove(struct rte_dpaa2_device *dpaa2_dev)
struct rte_eth_dev *eth_dev;
int ret;
- eth_dev = dpaa2_dev->eth_dev;
+ eth_dev = rte_eth_dev_allocated(dpaa2_dev->device.name);
+ if (!eth_dev)
+ return 0;
+
dpaa2_dev_close(eth_dev);
dpaa2_valid_dev--;
if (!dpaa2_valid_dev)
--
David Marchand
^ permalink raw reply [flat|nested] 6+ messages in thread
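The pattern suggested in the review above, looking the port up by device name and returning early when nothing is allocated, can be sketched without DPDK. The registry, the `mock_*` functions and the fixed-size name buffer below are assumptions made for illustration only; in particular `mock_port_allocated()` merely imitates the role that rte_eth_dev_allocated() plays in the proposed diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MOCK_MAX_PORTS 4 /* hypothetical registry size */

struct mock_port {
	char name[32];
	int in_use;
};

static struct mock_port mock_ports[MOCK_MAX_PORTS];
static int mock_close_calls; /* how many times a port was really closed */

/* Return the port registered under this name, or NULL if there is none. */
static struct mock_port *mock_port_allocated(const char *name)
{
	for (int i = 0; i < MOCK_MAX_PORTS; i++) {
		if (mock_ports[i].in_use && strcmp(mock_ports[i].name, name) == 0)
			return &mock_ports[i];
	}
	return NULL;
}

/* Register a port under a name; returns NULL when the registry is full. */
static struct mock_port *mock_port_probe(const char *name)
{
	for (int i = 0; i < MOCK_MAX_PORTS; i++) {
		if (!mock_ports[i].in_use) {
			snprintf(mock_ports[i].name, sizeof(mock_ports[i].name),
				 "%s", name);
			mock_ports[i].in_use = 1;
			return &mock_ports[i];
		}
	}
	return NULL;
}

/* Closing releases the registry slot, so later lookups return NULL. */
static void mock_port_close(struct mock_port *p)
{
	assert(p != NULL && p->in_use);
	p->in_use = 0;
	mock_close_calls++;
}

/* Remove path: nothing to do when the port was already closed. */
static int mock_remove_by_name(const char *name)
{
	struct mock_port *p = mock_port_allocated(name);

	if (!p)
		return 0;
	mock_port_close(p);
	return 0;
}
```

Because the lookup goes through the registry rather than a cached pointer, a second remove for the same name finds nothing and returns 0 without touching freed state.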
* Re: [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues
2025-11-06 16:38 ` [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues Hemant Agrawal
2025-11-06 19:29 ` Stephen Hemminger
@ 2025-11-07 8:34 ` David Marchand
1 sibling, 0 replies; 6+ messages in thread
From: David Marchand @ 2025-11-07 8:34 UTC (permalink / raw)
To: Hemant Agrawal; +Cc: dev, stephen, Maxime Leroy, jun.yang, stable
On Thu, 6 Nov 2025 at 17:38, Hemant Agrawal <hemant.agrawal@nxp.com> wrote:
> diff --git a/.mailmap b/.mailmap
> index 10c37a97a6..1f540f7f51 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1036,6 +1036,7 @@ Mauro Annarumma <mauroannarumma@hotmail.it>
> Maxime Coquelin <maxime.coquelin@redhat.com>
> Maxime Gouin <maxime.gouin@6wind.com>
> Maxime Leroy <maxime.leroy@6wind.com>
> +Maxime Leroy <maxime@leroys.fr>
On a single line please:
Maxime Leroy <maxime@leroys.fr> <maxime.leroy@6wind.com>
> Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
> Megha Ajmera <megha.ajmera@intel.com>
> Meijuan Zhao <meijuanx.zhao@intel.com>
--
David Marchand
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues
[not found] ` <CAHHRULVJe45=gNq1in6eHo5yEq-+QguxeDGfVHeC8D3KgMDdqQ@mail.gmail.com>
@ 2025-11-07 10:38 ` Hemant Agrawal
0 siblings, 0 replies; 6+ messages in thread
From: Hemant Agrawal @ 2025-11-07 10:38 UTC (permalink / raw)
To: Maxime Leroy, stephen; +Cc: dev, Jun Yang, stable
On Thu, 6 Nov 2025 at 20:29, Stephen Hemminger <stephen@networkplumber.org> wrote:
On Thu, 6 Nov 2025 22:08:06 +0530
Hemant Agrawal <hemant.agrawal@nxp.com> wrote:
> +static void
> +dpaa2_clear_queue_active_dps(struct dpaa2_queue *q, int num_lcores)
> +{
> + int i;
> +
> + for (i = 0; i < num_lcores; i++) {
> + struct queue_storage_info_t *qs = q->q_storage[i];
> +
> + if (!qs)
> + continue;
> +
> + if (qs->active_dqs) {
> + clear_swp_active_dqs(qs->active_dpio_id);
> + qs->active_dqs = NULL;
> + }
> + }
> +}
> +
Why not use RTE_LCORE_FOREACH() here?
For the loop, I did it the same way as in dpaa2_queue_storage_alloc(), to stay aligned with the rest of the driver instead of using RTE_LCORE_FOREACH().
Thanks Hemant for upstreaming my patch. However, you added my email in .mailmap as a new entry — it should instead be added as a second email under the existing Maxime Leroy entry.
[Hemant]
Will send v2
^ permalink raw reply [flat|nested] 6+ messages in thread
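The loop-style question raised in this sub-thread comes down to which slots get visited. The sketch below is a simplified model and does not use DPDK: `MOCK_LCORE_FOREACH` is a hypothetical imitation of an "enabled lcores only" iterator, and the enabled mask is made up. It assumes, as the reply states for dpaa2_queue_storage_alloc(), that storage was populated for every index in the range.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_LCORE 8 /* hypothetical stand-in for RTE_MAX_LCORE */

/* Hypothetical mask of enabled lcores: only 0, 1 and 4 are enabled. */
static const int mock_enabled[MOCK_MAX_LCORE] = { 1, 1, 0, 0, 1, 0, 0, 0 };

/* Next enabled index after i, or MOCK_MAX_LCORE when there is none. */
static int mock_next_enabled(int i)
{
	for (i++; i < MOCK_MAX_LCORE; i++) {
		if (mock_enabled[i])
			return i;
	}
	return MOCK_MAX_LCORE;
}

/* Simplified imitation of an iterator over enabled lcores only. */
#define MOCK_LCORE_FOREACH(i) \
	for ((i) = mock_next_enabled(-1); (i) < MOCK_MAX_LCORE; \
	     (i) = mock_next_enabled(i))

/* Visit every slot in the range, as the patch does; returns slots cleared. */
static int mock_clear_full_range(int *slots)
{
	int cleared = 0;

	assert(slots != NULL);
	for (int i = 0; i < MOCK_MAX_LCORE; i++) {
		if (slots[i]) {
			slots[i] = 0;
			cleared++;
		}
	}
	return cleared;
}

/* Visit only enabled lcores; populated slots of other indices are skipped. */
static int mock_clear_enabled_only(int *slots)
{
	int cleared = 0;
	int i;

	assert(slots != NULL);
	MOCK_LCORE_FOREACH(i) {
		if (slots[i]) {
			slots[i] = 0;
			cleared++;
		}
	}
	return cleared;
}
```

With all eight slots populated, the full-range loop clears eight and the enabled-only loop clears three, so the free path only matches the allocation path when both use the same iteration scheme.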
end of thread, other threads:[~2025-11-07 10:38 UTC | newest]
2025-11-06 16:38 [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close Hemant Agrawal
2025-11-06 16:38 ` [PATCH 2/3] net/dpaa2: clear active VDQ state when freeing Rx queues Hemant Agrawal
2025-11-06 19:29 ` Stephen Hemminger
[not found] ` <CAHHRULVJe45=gNq1in6eHo5yEq-+QguxeDGfVHeC8D3KgMDdqQ@mail.gmail.com>
2025-11-07 10:38 ` Hemant Agrawal
2025-11-07 8:34 ` David Marchand
2025-11-07 8:32 ` [PATCH 1/3] net/dpaa2: fix duplicate calling of dpaa2 dev close David Marchand