DPDK patches and discussions
* net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
@ 2024-12-20 17:05 Edwin Brossette
  2024-12-22  7:39 ` Asaf Penso
  2025-04-29 12:25 ` [PATCH] net/mlx5: fix the maximal queue size query Viacheslav Ovsiienko
  0 siblings, 2 replies; 7+ messages in thread
From: Edwin Brossette @ 2024-12-20 17:05 UTC (permalink / raw)
  To: dev; +Cc: Laurent Hardy, Olivier Matz, Didier Pallard, Jean-Mickael Guerin


Hello,

I have run into a regression after updating to stable dpdk-24.11 with a
number of my Mellanox ConnectX-4/5/6 NICs. The regression occurs on every NIC
in my lab that has DevX disabled: with the mstconfig utility, I can see that
the UCTX_EN flag is not set.

The main issue is that the ports cannot be started; the journal shows the
following error logs:

Set nb_rxd=1 (asked=512) for port=0
Set nb_txd=1 (asked=512) for port=0
starting port 0
Initializing port 0 [7c:fe:90:65:e6:54]
port 0: ntfp1 (mlx5_pci)
nb_rxq=2 nb_txq=2
rxq0=c9 rxq1=c25
txq0=c9 txq1=c25
port 0: rx_scatter=0 tx_scatter=0 max_rx_frame=1526
mlx5_net: port 0 number of descriptors requested for Tx queue 0 must be
higher than MLX5_TX_COMP_THRESH, using 33 instead of 1
mlx5_net: port 0 increased number of descriptors in Tx queue 0 to the next
power of two (64)
mlx5_net: port 0 number of descriptors requested for Tx queue 1 must be
higher than MLX5_TX_COMP_THRESH, using 33 instead of 1
mlx5_net: port 0 increased number of descriptors in Tx queue 1 to the next
power of two (64)
mlx5_net: Port 0 Rx queue 0 CQ creation failure.
mlx5_net: port 0 Rx queue allocation failed: Cannot allocate memory
rte_eth_dev_start(port 0) failed, error=-12
Failed to start port 0, set link down
Failed to start port 0
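
(For context: the first two log lines, "Set nb_rxd=1" / "Set nb_txd=1", come
from the application clamping its requested ring sizes to the limits reported
by the PMD. A minimal sketch of that pattern with the standard ethdev API
follows; the helper name, port_id and the 512 defaults are illustrative only,
not taken from the report:)

#include <stdio.h>
#include <rte_ethdev.h>

static void
adjust_ring_sizes(uint16_t port_id)
{
	struct rte_eth_dev_info dev_info;
	uint16_t nb_rxd = 512, nb_txd = 512;	/* what the application asked for */

	rte_eth_dev_info_get(port_id, &dev_info);
	/* With the regression, rx_desc_lim.nb_max == tx_desc_lim.nb_max == 1,
	 * so both requests get clamped down to 1 here. */
	rte_eth_dev_adjust_nb_rx_tx_desc(port_id, &nb_rxd, &nb_txd);
	printf("Set nb_rxd=%u (asked=512) for port=%u\n", nb_rxd, port_id);
	printf("Set nb_txd=%u (asked=512) for port=%u\n", nb_txd, port_id);
}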

Looking more closely at the problem, it appears that the number of Rx
and Tx descriptors configured for my queues is 1. This happens because
mlx5_dev_infos_get() returns a limit of 1 for both Rx and Tx, which is
unexpected. I identified the following patch as responsible for the regression:

4c3d7961d9002: net/mlx5: fix reported Rx/Tx descriptor limits
https://git.dpdk.org/dpdk/commit/?id=4c3d7961d9002bb715a8ee76bcf464d633316d4c

After doing some debugging, I noticed that hca_attr.log_max_wq_sz is never
configured. It should be set by mlx5_devx_cmd_query_hca_attr(), which is
called from this bit of code:

https://git.dpdk.org/dpdk/tree/drivers/common/mlx5/mlx5_common.c#n681

	/*
	 * When CTX is created by Verbs, query HCA attribute is unsupported.
	 * When CTX is imported, we cannot know if it is created by DevX or
	 * Verbs. So, we use query HCA attribute function to check it.
	 */
	if (cdev->config.devx || cdev->config.device_fd != MLX5_ARG_UNSET) {
		/* Query HCA attributes. */
		ret = mlx5_devx_cmd_query_hca_attr(cdev->ctx, &cdev->config.hca_attr);
		if (ret) {
			DRV_LOG(ERR, "Unable to read HCA caps in DevX mode.");
			rte_errno = ENOTSUP;
			goto error;
		}
		cdev->config.devx = 1;
	}
	DRV_LOG(DEBUG, "DevX is %ssupported.", cdev->config.devx ? "" : "NOT ");

I deduced that, following the above patch, the correct value for the maximum
number of Rx and Tx descriptors is only set when DevX is enabled (see the if
condition on cdev->config.devx). If DevX is disabled, hca_attr stays
zero-initialized, so log_max_wq_sz is 0 and the reported maximum is
1 << 0 = 1, which makes the ports fail to start. Perhaps we should keep the
previous default value (65535) when config.devx == 0 (DevX off)? This could
be done like this, for example:

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7708a0b80883..8ba3eb4a32de 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -359,10 +359,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
        info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK;
        mlx5_set_default_params(dev, info);
        mlx5_set_txlimit_params(dev, info);
-       info->rx_desc_lim.nb_max =
-               1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
-       info->tx_desc_lim.nb_max =
-               1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
+       if (priv->sh->cdev->config.devx) {
+               info->rx_desc_lim.nb_max =
+                       1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
+               info->tx_desc_lim.nb_max =
+                       1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
+       }
        if (priv->sh->cdev->config.hca_attr.mem_rq_rmp &&
            priv->obj_ops.rxq_obj_new == devx_obj_ops.rxq_obj_new)
                info->dev_capa |= RTE_ETH_DEV_CAPA_RXQ_SHARE;

Thanks in advance for your help.

Regards,
Edwin Brossette.



* RE: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
  2024-12-20 17:05 net/mlx5: wrong Rx/Tx descriptor limits when DevX is off Edwin Brossette
@ 2024-12-22  7:39 ` Asaf Penso
  2024-12-23 13:08   ` Slava Ovsiienko
  2025-04-29 12:25 ` [PATCH] net/mlx5: fix the maximal queue size query Viacheslav Ovsiienko
  1 sibling, 1 reply; 7+ messages in thread
From: Asaf Penso @ 2024-12-22  7:39 UTC (permalink / raw)
  To: igootorov, Slava Ovsiienko
  Cc: Laurent Hardy, Olivier Matz, Didier Pallard, Jean-Mickael Guerin,
	Edwin Brossette, dev


Hello Igor and Slava,
Can you please check out this issue?

Regards,
Asaf Penso


* RE: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
  2024-12-22  7:39 ` Asaf Penso
@ 2024-12-23 13:08   ` Slava Ovsiienko
  2025-02-12 14:34     ` Edwin Brossette
  0 siblings, 1 reply; 7+ messages in thread
From: Slava Ovsiienko @ 2024-12-23 13:08 UTC (permalink / raw)
  To: Asaf Penso, igootorov
  Cc: Laurent Hardy, Olivier Matz, Didier Pallard, Jean-Mickael Guerin,
	Edwin Brossette, dev


Confirmed, it’s a bug; IIUC it was introduced by the reporting function update.
AFAIK, we do not test in a non-DevX environment anymore, so we missed this.
A fix will be provided.

With best regards,
Slava


* Re: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
  2024-12-23 13:08   ` Slava Ovsiienko
@ 2025-02-12 14:34     ` Edwin Brossette
  2025-03-05  9:17       ` Slava Ovsiienko
  0 siblings, 1 reply; 7+ messages in thread
From: Edwin Brossette @ 2025-02-12 14:34 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Asaf Penso, igootorov, Laurent Hardy, Olivier Matz,
	Didier Pallard, Jean-Mickael Guerin, dev


Hello,

Sorry for bothering you again.
May I inquire whether this issue is still being worked on?
If so, when can I expect to see a fix?

Best regards,
Edwin Brossette


* RE: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
  2025-02-12 14:34     ` Edwin Brossette
@ 2025-03-05  9:17       ` Slava Ovsiienko
  2025-03-17 15:02         ` Edwin Brossette
  0 siblings, 1 reply; 7+ messages in thread
From: Slava Ovsiienko @ 2025-03-05  9:17 UTC (permalink / raw)
  To: Edwin Brossette
  Cc: Asaf Penso, igootorov, Laurent Hardy, Olivier Matz,
	Didier Pallard, Jean-Mickael Guerin, dev


Hi, Edwin

Thank you for the patch.
You are quite right, “sh->cdev->config.hca_attr.log_max_wq_sz” is not set if DevX is disengaged.
I found some other places where the uninitialized “log_max_wq_sz” might be used,
so I would rather configure “log_max_wq_sz” for the IBV (Verbs) case as well, instead of just fixing mlx5_dev_infos_get().

There is the property “priv->sh->dev_cap.max_qp_wr”; it reflects the maximal number of descriptors when rdma-core is used.
Would you like to update your patch with this, or would you prefer me to do it?
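
For illustration only, a rough sketch of that direction in mlx5_dev_infos_get(),
falling back to dev_cap.max_qp_wr when DevX is off, might look like the
following (an assumed sketch, not the final fix; range/overflow clamping
omitted for brevity):

	uint32_t max_wqe;

	if (priv->sh->cdev->config.devx)
		/* DevX mode: the queried HCA capability is valid. */
		max_wqe = 1u << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
	else
		/* Verbs/rdma-core mode: max_qp_wr reflects the maximal WQE count. */
		max_wqe = priv->sh->dev_cap.max_qp_wr;
	info->rx_desc_lim.nb_max = max_wqe;
	info->tx_desc_lim.nb_max = max_wqe;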

With best regards,
Slava



* Re: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
  2025-03-05  9:17       ` Slava Ovsiienko
@ 2025-03-17 15:02         ` Edwin Brossette
  0 siblings, 0 replies; 7+ messages in thread
From: Edwin Brossette @ 2025-03-17 15:02 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Asaf Penso, igootorov, Laurent Hardy, Olivier Matz,
	Didier Pallard, Jean-Mickael Guerin, dev


Hello,

Thank you for your answer.
The short patch attached to my first mail was just a rough example to
show what I tested. I believe you know the driver's code better than I
do, so I wouldn't be opposed to seeing you fix this issue.
Thank you in advance.

Regards,
Edwin Brossette.


* [PATCH] net/mlx5: fix the maximal queue size query
  2024-12-20 17:05 net/mlx5: wrong Rx/Tx descriptor limits when DevX is off Edwin Brossette
  2024-12-22  7:39 ` Asaf Penso
@ 2025-04-29 12:25 ` Viacheslav Ovsiienko
  1 sibling, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2025-04-29 12:25 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, suanmingm, stable, Edwin Brossette, Dariusz Sosnowski

The mlx5 PMD manages the device using two modes: the Verbs API
and the DevX API. Each API offers its own method for querying
the maximum work queue size (in descriptors).

The patch being corrected enhanced the rte_eth_dev_info_get() API
support in the mlx5 PMD to return the true maximum number of descriptors.
It also implemented a limit check during queue creation, but this
was applied only to "DevX mode". Consequently, "Verbs mode"
was overlooked, leading to malfunction on legacy NICs that do
not support DevX.

This patch adds support for Verbs mode, and all limit checks are
updated accordingly.

Fixes: 4c3d7961d900 ("net/mlx5: fix reported Rx/Tx descriptor limits")
Cc: stable@dpdk.org

Reported-by: Edwin Brossette <edwin.brossette@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/common/mlx5/mlx5_prm.h  |  1 +
 drivers/net/mlx5/mlx5.h         |  1 +
 drivers/net/mlx5/mlx5_devx.c    |  2 +-
 drivers/net/mlx5/mlx5_ethdev.c  | 39 +++++++++++++++++++++++++++++----
 drivers/net/mlx5/mlx5_rxq.c     |  2 +-
 drivers/net/mlx5/mlx5_trigger.c |  4 ++--
 drivers/net/mlx5/mlx5_txq.c     | 12 +++++-----
 7 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 742c274a85..7accdeab87 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -41,6 +41,7 @@
 /* Hardware index widths. */
 #define MLX5_CQ_INDEX_WIDTH 24
 #define MLX5_WQ_INDEX_WIDTH 16
+#define MLX5_WQ_INDEX_MAX (1u << (MLX5_WQ_INDEX_WIDTH - 1))
 
 /* WQE Segment sizes in bytes. */
 #define MLX5_WSEG_SIZE 16u
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 0194887a8b..ff182996d3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -2303,6 +2303,7 @@ int mlx5_representor_info_get(struct rte_eth_dev *dev,
 		(((repr_id) >> 12) & 3)
 uint16_t mlx5_representor_id_encode(const struct mlx5_switch_info *info,
 				    enum rte_eth_representor_type hpf_type);
+uint16_t mlx5_dev_get_max_wq_size(struct mlx5_dev_ctx_shared *sh);
 int mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t fw_size);
 const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a12891a983..9711746edb 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -1593,7 +1593,7 @@ mlx5_txq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 	wqe_size = RTE_ALIGN(wqe_size, MLX5_WQE_SIZE) / MLX5_WQE_SIZE;
 	/* Create Send Queue object with DevX. */
 	wqe_n = RTE_MIN((1UL << txq_data->elts_n) * wqe_size,
-			(uint32_t)priv->sh->dev_cap.max_qp_wr);
+			(uint32_t)mlx5_dev_get_max_wq_size(priv->sh));
 	log_desc_n = log2above(wqe_n);
 	ret = mlx5_txq_create_devx_sq_resources(dev, idx, log_desc_n);
 	if (ret) {
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7708a0b808..7f12194f30 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -314,6 +314,37 @@ mlx5_set_txlimit_params(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	info->tx_desc_lim.nb_mtu_seg_max = nb_max;
 }
 
+/**
+ * Get maximal work queue size in WQEs
+ *
+ * @param sh
+ *   Pointer to the device shared context.
+ * @return
+ *   Maximal number of WQEs in queue
+ */
+uint16_t
+mlx5_dev_get_max_wq_size(struct mlx5_dev_ctx_shared *sh)
+{
+	uint16_t max_wqe = MLX5_WQ_INDEX_MAX;
+
+	if (sh->cdev->config.devx) {
+		/* use HCA properties for DevX config */
+		MLX5_ASSERT(sh->cdev->config.hca_attr.log_max_wq_sz != 0);
+		MLX5_ASSERT(sh->cdev->config.hca_attr.log_max_wq_sz < MLX5_WQ_INDEX_WIDTH);
+		if (sh->cdev->config.hca_attr.log_max_wq_sz != 0 &&
+		    sh->cdev->config.hca_attr.log_max_wq_sz < MLX5_WQ_INDEX_WIDTH)
+			max_wqe = 1u << sh->cdev->config.hca_attr.log_max_wq_sz;
+	} else {
+		/* use IB device capabilities */
+		MLX5_ASSERT(sh->dev_cap.max_qp_wr > 0);
+		MLX5_ASSERT(sh->dev_cap.max_qp_wr <= MLX5_WQ_INDEX_MAX);
+		if (sh->dev_cap.max_qp_wr > 0 &&
+		    (uint32_t)sh->dev_cap.max_qp_wr <= MLX5_WQ_INDEX_MAX)
+			max_wqe = (uint16_t)sh->dev_cap.max_qp_wr;
+	}
+	return max_wqe;
+}
+
 /**
  * DPDK callback to get information about the device.
  *
@@ -327,6 +358,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	unsigned int max;
+	uint16_t max_wqe;
 
 	/* FIXME: we should ask the device for these values. */
 	info->min_rx_bufsize = 32;
@@ -359,10 +391,9 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK;
 	mlx5_set_default_params(dev, info);
 	mlx5_set_txlimit_params(dev, info);
-	info->rx_desc_lim.nb_max =
-		1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
-	info->tx_desc_lim.nb_max =
-		1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
+	max_wqe = mlx5_dev_get_max_wq_size(priv->sh);
+	info->rx_desc_lim.nb_max = max_wqe;
+	info->tx_desc_lim.nb_max = max_wqe;
 	if (priv->sh->cdev->config.hca_attr.mem_rq_rmp &&
 	    priv->obj_ops.rxq_obj_new == devx_obj_ops.rxq_obj_new)
 		info->dev_capa |= RTE_ETH_DEV_CAPA_RXQ_SHARE;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 5cf7d4971b..a8aaab13c8 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -655,7 +655,7 @@ mlx5_rx_queue_pre_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t *desc,
 	struct mlx5_rxq_priv *rxq;
 	bool empty;
 
-	if (*desc > 1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz) {
+	if (*desc > mlx5_dev_get_max_wq_size(priv->sh)) {
 		DRV_LOG(ERR,
 			"port %u number of descriptors requested for Rx queue"
 			" %u is more than supported",
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 4ee44e9165..8145ad4233 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -217,8 +217,8 @@ mlx5_rxq_start(struct rte_eth_dev *dev)
 		/* Should not release Rx queues but return immediately. */
 		return -rte_errno;
 	}
-	DRV_LOG(DEBUG, "Port %u dev_cap.max_qp_wr is %d.",
-		dev->data->port_id, priv->sh->dev_cap.max_qp_wr);
+	DRV_LOG(DEBUG, "Port %u max work queue size is %d.",
+		dev->data->port_id, mlx5_dev_get_max_wq_size(priv->sh));
 	DRV_LOG(DEBUG, "Port %u dev_cap.max_sge is %d.",
 		dev->data->port_id, priv->sh->dev_cap.max_sge);
 	for (i = 0; i != priv->rxqs_n; ++i) {
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 3e93517323..b14a1a4379 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -333,7 +333,7 @@ mlx5_tx_queue_pre_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t *desc)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 
-	if (*desc > 1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz) {
+	if (*desc > mlx5_dev_get_max_wq_size(priv->sh)) {
 		DRV_LOG(ERR,
 			"port %u number of descriptors requested for Tx queue"
 			" %u is more than supported",
@@ -727,7 +727,7 @@ txq_calc_inline_max(struct mlx5_txq_ctrl *txq_ctrl)
 	struct mlx5_priv *priv = txq_ctrl->priv;
 	unsigned int wqe_size;
 
-	wqe_size = priv->sh->dev_cap.max_qp_wr / desc;
+	wqe_size = mlx5_dev_get_max_wq_size(priv->sh) / desc;
 	if (!wqe_size)
 		return 0;
 	/*
@@ -1082,6 +1082,7 @@ mlx5_txq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_txq_ctrl *tmpl;
+	uint16_t max_wqe;
 
 	tmpl = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, sizeof(*tmpl) +
 			   desc * sizeof(struct rte_mbuf *), 0, socket);
@@ -1107,13 +1108,12 @@ mlx5_txq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	txq_set_params(tmpl);
 	if (txq_adjust_params(tmpl))
 		goto error;
-	if (txq_calc_wqebb_cnt(tmpl) >
-	    priv->sh->dev_cap.max_qp_wr) {
+	max_wqe = mlx5_dev_get_max_wq_size(priv->sh);
+	if (txq_calc_wqebb_cnt(tmpl) > max_wqe) {
 		DRV_LOG(ERR,
 			"port %u Tx WQEBB count (%d) exceeds the limit (%d),"
 			" try smaller queue size",
-			dev->data->port_id, txq_calc_wqebb_cnt(tmpl),
-			priv->sh->dev_cap.max_qp_wr);
+			dev->data->port_id, txq_calc_wqebb_cnt(tmpl), max_wqe);
 		rte_errno = ENOMEM;
 		goto error;
 	}
-- 
2.34.1
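
Not part of the patch: as a quick sanity check after applying it, an
application can re-read the device info on a DevX-off (Verbs-only) port and
confirm the reported limits are no longer 1 (a sketch only; port_id is a
placeholder):

	struct rte_eth_dev_info dev_info;

	if (rte_eth_dev_info_get(port_id, &dev_info) == 0)
		printf("rx_desc_lim.nb_max=%u tx_desc_lim.nb_max=%u\n",
		       dev_info.rx_desc_lim.nb_max, dev_info.tx_desc_lim.nb_max);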


