Hello,

Sorry for bothering you again.
May I inquire if this issue is still being worked on?
If so, when can I expect to see a fix?

Best regards,
Edwin Brossette

On Mon, Dec 23, 2024 at 2:09 PM Slava Ovsiienko wrote:

> Confirm, it’s a bug, IIUC was introduced by reporting function update.
> AFAIK, we do not test with non-DevX environment anymore, so missed this.
>
> Fix should be provided.
>
> With best regards,
> Slava
>
> *From:* Asaf Penso
> *Sent:* Sunday, December 22, 2024 9:39 AM
> *To:* igootorov@gmail.com; Slava Ovsiienko
> *Cc:* Laurent Hardy; Olivier Matz <olivier.matz@6wind.com>;
> Didier Pallard; Jean-Mickael Guerin;
> Edwin Brossette <edwin.brossette@6wind.com>; dev@dpdk.org
> *Subject:* RE: net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
>
> Hello Igor and Slava,
>
> Can you please check out this issue?
>
> Regards,
> Asaf Penso
>
> *From:* Edwin Brossette
> *Sent:* Friday, 20 December 2024 19:06
> *To:* dev@dpdk.org
> *Cc:* Laurent Hardy; Olivier Matz <olivier.matz@6wind.com>;
> Didier Pallard; Jean-Mickael Guerin
> *Subject:* net/mlx5: wrong Rx/Tx descriptor limits when DevX is off
>
> Hello,
>
> I have run into a regression following an update to stable dpdk-24.11 with
> a number of my Mellanox cx4/5/6 nics. This regression occurs with all nics
> in my lab which have DevX disabled: using mstconfig utility, I can see the
> flag UCTX_EN is not set.
>
> Mainly, the issue is that the ports cannot be started, with the following
> error logs in the journal:
>
> Set nb_rxd=1 (asked=512) for port=0
> Set nb_txd=1 (asked=512) for port=0
> starting port 0
> Initializing port 0 [7c:fe:90:65:e6:54]
> port 0: ntfp1 (mlx5_pci)
> nb_rxq=2 nb_txq=2
> rxq0=c9 rxq1=c25
> txq0=c9 txq1=c25
> port 0: rx_scatter=0 tx_scatter=0 max_rx_frame=1526
> mlx5_net: port 0 number of descriptors requested for Tx queue 0 must be
> higher than MLX5_TX_COMP_THRESH, using 33 instead of 1
> mlx5_net: port 0 increased number of descriptors in Tx queue 0 to the next
> power of two (64)
> mlx5_net: port 0 number of descriptors requested for Tx queue 1 must be
> higher than MLX5_TX_COMP_THRESH, using 33 instead of 1
> mlx5_net: port 0 increased number of descriptors in Tx queue 1 to the next
> power of two (64)
> mlx5_net: Port 0 Rx queue 0 CQ creation failure.
> mlx5_net: port 0 Rx queue allocation failed: Cannot allocate memory
> rte_eth_dev_start(port 0) failed, error=-12
> Failed to start port 0, set link down
> Failed to start port 0
>
> Looking more precisely into the problem, it appears that the number of Rx
> and Tx descriptors configured for my queues is 1. This happens because
> mlx5_dev_infos_get() returns a limit of 1 for both Rx and Tx, which is
> unexpected. I identified this patch to be responsible for the regression:
>
> 4c3d7961d9002: net/mlx5: fix reported Rx/Tx descriptor limits
>
> https://git.dpdk.org/dpdk/commit/?id=4c3d7961d9002bb715a8ee76bcf464d633316d4c
>
> After doing some debugging, I noticed that hca_attr.log_max_wq_sz is never
> configured. This should be done in mlx5_devx_cmd_query_hca_attr() which is
> called in this bit of code:
>
> https://git.dpdk.org/dpdk/tree/drivers/common/mlx5/mlx5_common.c#n681
>
> /*
>  * When CTX is created by Verbs, query HCA attribute is unsupported.
>  * When CTX is imported, we cannot know if it is created by DevX or
>  * Verbs. So, we use query HCA attribute function to check it.
>  */
> if (cdev->config.devx || cdev->config.device_fd != MLX5_ARG_UNSET) {
>         /* Query HCA attributes. */
>         ret = mlx5_devx_cmd_query_hca_attr(cdev->ctx, &cdev->config.hca_attr);
>         if (ret) {
>                 DRV_LOG(ERR, "Unable to read HCA caps in DevX mode.");
>                 rte_errno = ENOTSUP;
>                 goto error;
>         }
>         cdev->config.devx = 1;
> }
> DRV_LOG(DEBUG, "DevX is %ssupported.", cdev->config.devx ? "" : "NOT ");
>
> I deduced that following the above patch, the correct value for maximum Rx
> and Tx descriptors will only be set if DevX is enabled (see the if
> condition on cdev->config.devx). If it is disabled, then maximum Rx and Tx
> descriptors will be 1, which will make the ports fail to start. Perhaps we
> should keep the previous default value (65535) if config.devx == 0 (DevX
> off)? This could be done like this, for example:
>
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 7708a0b80883..8ba3eb4a32de 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -359,10 +359,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
>         info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK;
>         mlx5_set_default_params(dev, info);
>         mlx5_set_txlimit_params(dev, info);
> -       info->rx_desc_lim.nb_max =
> -               1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
> -       info->tx_desc_lim.nb_max =
> -               1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
> +       if (priv->sh->cdev->config.devx) {
> +               info->rx_desc_lim.nb_max =
> +                       1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
> +               info->tx_desc_lim.nb_max =
> +                       1 << priv->sh->cdev->config.hca_attr.log_max_wq_sz;
> +       }
>         if (priv->sh->cdev->config.hca_attr.mem_rq_rmp &&
>             priv->obj_ops.rxq_obj_new == devx_obj_ops.rxq_obj_new)
>                 info->dev_capa |= RTE_ETH_DEV_CAPA_RXQ_SHARE;
>
> Thanks in advance for your help.
>
> Regards,
> Edwin Brossette.
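
For anyone following the thread, the arithmetic behind the reported limit of 1 can be shown in isolation. The sketch below is plain C and uses no DPDK headers. The names max_desc_limit_unguarded(), max_desc_limit() and FALLBACK_MAX_DESC are hypothetical and exist only for this illustration; the value 65535 is the previous default quoted in the report. It assumes that log_max_wq_sz stays at zero when the HCA attributes are never queried, which is what the debugging in the report suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the previous default limit mentioned in the report. */
#define FALLBACK_MAX_DESC 65535u

/*
 * Hypothetical model of the limit computed after the patch:
 * the limit is always derived from log_max_wq_sz.
 * With an unqueried (zero) attribute this yields 1 << 0 == 1.
 */
static uint32_t
max_desc_limit_unguarded(uint8_t log_max_wq_sz)
{
	assert(log_max_wq_sz < 32); /* keep the shift well defined */
	return UINT32_C(1) << log_max_wq_sz;
}

/*
 * Hypothetical model of the guarded variant proposed in the report:
 * only trust log_max_wq_sz when DevX is enabled, otherwise fall back
 * to the previous default.
 */
static uint32_t
max_desc_limit(int devx_enabled, uint8_t log_max_wq_sz)
{
	if (!devx_enabled)
		return FALLBACK_MAX_DESC;
	return max_desc_limit_unguarded(log_max_wq_sz);
}
```

With DevX off and the attribute left at zero, the unguarded helper returns 1, which matches the "Set nb_rxd=1 (asked=512)" lines in the log, while the guarded helper returns 65535. This only models the arithmetic; the actual fix in the driver may take a different form.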