patches for DPDK stable branches
 help / color / mirror / Atom feed
From: Shani Peretz <shperetz@nvidia.com>
To: Maayan Kashani <mkashani@nvidia.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Cc: Maayan Kashani <mkashani@nvidia.com>,
	Dariusz Sosnowski <dsosnowski@nvidia.com>,
	Raslan Darawsheh <rasland@nvidia.com>,
	Matan Azrad <matan@nvidia.com>,
	Slava Ovsiienko <viacheslavo@nvidia.com>,
	Bing Zhao <bingz@nvidia.com>
Subject: RE: [PATCH 22.11 v2] net/mlx5: fix device start error handling
Date: Tue, 30 Dec 2025 10:58:05 +0000	[thread overview]
Message-ID: <MW4PR12MB7484484FCABD30A89A209E4FBFBCA@MW4PR12MB7484.namprd12.prod.outlook.com> (raw)
In-Reply-To: <20251125110927.178331-1-mkashani@nvidia.com>



> -----Original Message-----
> From: Maayan Kashani <mkashani@nvidia.com>
> Sent: Tuesday, 25 November 2025 13:09
> To: stable@dpdk.org
> Cc: Maayan Kashani <mkashani@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; Raslan Darawsheh <rasland@nvidia.com>; Matan
> Azrad <matan@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Bing
> Zhao <bingz@mellanox.com>
> Subject: [PATCH 22.11 v2] net/mlx5: fix device start error handling
> 
> External email: Use caution opening links or attachments
> 
> 
> [ upstream commit 860f6c63dbc1 ]
> 
> When mlx5_dev_start() fails partway through initialization, the error cleanup
> code unconditionally calls cleanup functions for all steps, including those that
> were never successfully initialized. This causes state corruption leading to
> incorrect behavior on subsequent start attempts.
> 
> The issue manifests as:
> 1. First start attempt fails with -ENOMEM (expected) 2. Second start attempt
> returns -EINVAL instead of -ENOMEM 3. With flow isolated mode, second
> attempt incorrectly succeeds,
>    leading to segfault in rte_eth_rx_burst()
> 
> Root cause: The single error label cleanup path calls functions like
> mlx5_traffic_disable() and mlx5_flow_stop_default() even when their
> corresponding initialization functions (mlx5_traffic_enable() and
> mlx5_flow_start_default()) were never called due to earlier failure.
> 
> For example, when mlx5_rxq_start() fails:
> - mlx5_traffic_enable() at line 1403 never executes
> - mlx5_flow_start_default() at line 1420 never executes
> - But cleanup unconditionally calls:
>   * mlx5_traffic_disable() - destroys control flows list
>   * mlx5_flow_stop_default() - corrupts flow metadata state
> 
> This corrupts the device state, causing subsequent start attempts to fail with
> different errors or, in isolated mode, to incorrectly succeed with an improperly
> initialized device.
> 
> Fix by replacing the single error label with cascading error labels (Linux kernel
> style). Each label cleans up only its corresponding step, then falls through to
> clean up earlier steps.
> This ensures only successfully initialized steps are cleaned up, maintaining
> device state consistency across failed start attempts.
> 
> Bugzilla ID: 1419
> Fixes: 8db7e3b69822 ("net/mlx5: change operations for non-cached flows")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Maayan Kashani <mkashani@nvidia.com>
> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
> ---
>  drivers/net/mlx5/mlx5_trigger.c | 60 +++++++++++++++++++++++----------
>  1 file changed, 42 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index f72ed7f820f..16c3164bcf1 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -1133,6 +1133,11 @@ mlx5_hw_representor_port_allowed_start(struct
> rte_eth_dev *dev)
> 
>  #endif
> 
> +#define SAVE_RTE_ERRNO_AND_STOP(ret, dev) do { \
> +       ret = rte_errno;                        \
> +       (dev)->data->dev_started = 0;           \
> +} while (0)
> +
>  /**
>   * DPDK callback to start the device.
>   *
> @@ -1203,19 +1208,23 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(ERR, "port %u Tx packet pacing init failed: %s",
>                         dev->data->port_id, strerror(rte_errno));
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
>                 goto error;
>         }
>         if (mlx5_devx_obj_ops_en(priv->sh) &&
>             priv->obj_ops.lb_dummy_queue_create) {
>                 ret = priv->obj_ops.lb_dummy_queue_create(dev);
> -               if (ret)
> -                       goto error;
> +               if (ret) {
> +                       SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +                       goto txpp_stop;
> +               }
>         }
>         ret = mlx5_txq_start(dev);
>         if (ret) {
>                 DRV_LOG(ERR, "port %u Tx queue allocation failed: %s",
>                         dev->data->port_id, strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto lb_dummy_queue_release;
>         }
>         if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
>                 if (!priv->sh->dev_cap.vf && !priv->sh->dev_cap.sf && @@ -1239,7
> +1248,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
>                         dev->data->port_id, strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto txq_stop;
>         }
>         /*
>          * Such step will be skipped if there is no hairpin TX queue configured @@
> -1249,7 +1259,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(ERR, "port %u hairpin auto binding failed: %s",
>                         dev->data->port_id, strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto rxq_stop;
>         }
>         /* Set started flag here for the following steps like control flow. */
>         dev->data->dev_started = 1;
> @@ -1257,7 +1268,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(ERR, "port %u Rx interrupt vector creation failed",
>                         dev->data->port_id);
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto rxq_stop;
>         }
>         mlx5_os_stats_init(dev);
>         /*
> @@ -1269,7 +1281,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>                 DRV_LOG(ERR,
>                         "port %u failed to attach indirect actions: %s",
>                         dev->data->port_id, rte_strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto rx_intr_vec_disable;
>         }
>  #ifdef HAVE_MLX5_HWS_SUPPORT
>         if (priv->sh->config.dv_flow_en == 2) { @@ -1277,7 +1290,8 @@
> mlx5_dev_start(struct rte_eth_dev *dev)
>                 if (ret) {
>                         DRV_LOG(ERR, "port %u failed to update HWS tables",
>                                 dev->data->port_id);
> -                       goto error;
> +                       SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +                       goto action_handle_detach;
>                 }
>         }
>  #endif
> @@ -1285,7 +1299,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(ERR, "port %u failed to set defaults flows",
>                         dev->data->port_id);
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto action_handle_detach;
>         }
>         /* Set a mask and offset of dynamic metadata flows into Rx queues. */
>         mlx5_flow_rxq_dynf_metadata_set(dev);
> @@ -1302,12 +1317,14 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>         if (ret) {
>                 DRV_LOG(DEBUG, "port %u failed to start default actions: %s",
>                         dev->data->port_id, strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto traffic_disable;
>         }
>         if (mlx5_dev_ctx_shared_mempool_subscribe(dev) != 0) {
>                 DRV_LOG(ERR, "port %u failed to subscribe for mempool life cycle:
> %s",
>                         dev->data->port_id, rte_strerror(rte_errno));
> -               goto error;
> +               SAVE_RTE_ERRNO_AND_STOP(ret, dev);
> +               goto stop_default;
>         }
>         rte_wmb();
>         dev->tx_pkt_burst = mlx5_select_tx_function(dev); @@ -1334,18
> +1351,25 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>                 priv->sh->port[priv->dev_port - 1].devx_ih_port_id =
>                                         (uint32_t)dev->data->port_id;
>         return 0;
> -error:
> -       ret = rte_errno; /* Save rte_errno before cleanup. */
> -       /* Rollback. */
> -       dev->data->dev_started = 0;
> +stop_default:
>         mlx5_flow_stop_default(dev);
> +traffic_disable:
>         mlx5_traffic_disable(dev);
> -       mlx5_txq_stop(dev);
> +action_handle_detach:
> +       mlx5_action_handle_detach(dev);
> +rx_intr_vec_disable:
> +       mlx5_rx_intr_vec_disable(dev);
> +rxq_stop:
>         mlx5_rxq_stop(dev);
> +txq_stop:
> +       mlx5_txq_stop(dev);
> +lb_dummy_queue_release:
>         if (priv->obj_ops.lb_dummy_queue_release)
>                 priv->obj_ops.lb_dummy_queue_release(dev);
> -       mlx5_txpp_stop(dev); /* Stop last. */
> -       rte_errno = ret; /* Restore rte_errno. */
> +txpp_stop:
> +       mlx5_txpp_stop(dev);
> +error:
> +       rte_errno = ret;
>         return -rte_errno;
>  }
> 
> --
> 2.43.0


Hey,
This patch has also been applied to 23.11.
 
Thanks,
Shani

      reply	other threads:[~2025-12-30 10:58 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-25 11:09 Maayan Kashani
2025-12-30 10:58 ` Shani Peretz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MW4PR12MB7484484FCABD30A89A209E4FBFBCA@MW4PR12MB7484.namprd12.prod.outlook.com \
    --to=shperetz@nvidia.com \
    --cc=bingz@nvidia.com \
    --cc=dsosnowski@nvidia.com \
    --cc=matan@nvidia.com \
    --cc=mkashani@nvidia.com \
    --cc=rasland@nvidia.com \
    --cc=stable@dpdk.org \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).