patches for DPDK stable branches
 help / color / mirror / Atom feed
From: Dariusz Sosnowski <dsosnowski@nvidia.com>
To: Sivaprasad Tummala <sivaprasad.tummala@amd.com>,
	Alexander Kozyrev <akozyrev@nvidia.com>,
	Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Cc: <jerinj@marvell.com>, <kirankumark@marvell.com>,
	<ndabilpuram@marvell.com>, <yanzhirun_163@163.com>,
	<david.marchand@redhat.com>, <ktraynor@redhat.com>,
	<thomas@monjalon.net>, <konstantin.ananyev@huawei.com>,
	<konstantin.v.ananyev@yandex.ru>, <bruce.richardson@intel.com>,
	<maxime.coquelin@redhat.com>, <anatoly.burakov@intel.com>,
	<aconole@redhat.com>, <dev@dpdk.org>, <stable@dpdk.org>
Subject: Re: [PATCH] net/mlx5: fix spurious CPU wakeups caused by invalid CQE
Date: Fri, 7 Nov 2025 18:59:36 +0100	[thread overview]
Message-ID: <20251107175936.4yr2quzngd5gnp2l@ds-vm-debian.local> (raw)
In-Reply-To: <20251015133957.4094235-1-sivaprasad.tummala@amd.com>

Hi,

Thank you for the contribution. Please see comments below.

On Wed, Oct 15, 2025 at 01:39:57PM +0000, Sivaprasad Tummala wrote:
> Previously, the PMD used a common monitor callback to determine
> CQE ownership for power-aware polling. However, when a CQE contained
> an invalid opcode(MLX5_CQE_INVALID), ownership bit was not reliable.
> As a result, the monitor condition could falsely indicate CQE
> availability and cause the CPU to wake up unnecessarily during
> low traffic periods.
> 
> This resulted in spurious wakeups in monitor-wait mode and reduced
> the expected power savings, as cores exited the sleep state even
> when no valid CQEs were available.
> 
> This patch introduces a dedicated callback that skips invalid CQEs
> and optimizes power efficiency by preventing false wakeups caused
> by hardware-owned or invalid entries.
> 
> Fixes: a8f0df6bf98d ("net/mlx5: support power monitoring")
> Cc: akozyrev@nvidia.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
> ---
>  drivers/net/mlx5/mlx5_rx.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
> index 420a03068d..2765b4b730 100644
> --- a/drivers/net/mlx5/mlx5_rx.c
> +++ b/drivers/net/mlx5/mlx5_rx.c
> @@ -295,6 +295,20 @@ mlx5_monitor_callback(const uint64_t value,
>  	return (value & m) == v ? -1 : 0;
>  }
>  
> +static int
> +mlx5_monitor_cqe_own_callback(const uint64_t value,
> +		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
> +{
> +	const uint64_t m = opaque[CLB_MSK_IDX];
> +	const uint64_t v = opaque[CLB_VAL_IDX];
> +	const uint64_t match = ((value & m) == v);

Could you please rename "match" variable to "sw_owned"?
This name would better relay the meaning of the checked condition that
CQE owner bit value signifies that CQE is SW owned.

> +	const uint64_t opcode = MLX5_CQE_OPCODE(value);
> +	const uint64_t valid_op = (opcode ^ MLX5_CQE_INVALID);

IMO the usage of bit operations here (although logic is correct) is a bit confusing.
Could you rewrite it in terms of logical operations so it's easier to
follow? For example like this:

	const uint64_t valid_op = opcode != MLX5_CQE_INVALID

	return (sw_owned && valid_op) ? -1 : 0;

This also would properly describe in code the required condition:
CQE can be parsed by SW if and only if owner bit is "SW owned" and CQE
opcode is valid.

> +
> +	/* ownership bit is not valid for invalid opcode; CQE is HW owned */
> +	return -(match & valid_op);
> +}
> +
>  int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
>  {
>  	struct mlx5_rxq_data *rxq = rx_queue;
> @@ -312,12 +326,13 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
>  		pmc->addr = &cqe->validity_iteration_count;
>  		pmc->opaque[CLB_VAL_IDX] = vic;
>  		pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_VIC_INIT;
> +		pmc->fn = mlx5_monitor_callback;

Alex, Slava: Just to double check - in case of enhanced CQE compression
layout, should both CQE opcode and vic be checked?
Right now only vic is checked in power monitor callback for that case.
In Rx datapath both are checked to determine CQE ownership:
https://github.com/DPDK/dpdk/blob/main/drivers/common/mlx5/mlx5_common.h#L277

>  	} else {
>  		pmc->addr = &cqe->op_own;
>  		pmc->opaque[CLB_VAL_IDX] = !!idx;
>  		pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
> +		pmc->fn = mlx5_monitor_cqe_own_callback;
>  	}
> -	pmc->fn = mlx5_monitor_callback;
>  	pmc->size = sizeof(uint8_t);
>  	return 0;
>  }
> -- 
> 2.43.0
> 

Best regards,
Dariusz Sosnowski

      parent reply	other threads:[~2025-11-07 17:59 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-15 13:39 Sivaprasad Tummala
2025-10-28 14:53 ` Thomas Monjalon
2025-11-07 17:59 ` Dariusz Sosnowski [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251107175936.4yr2quzngd5gnp2l@ds-vm-debian.local \
    --to=dsosnowski@nvidia.com \
    --cc=aconole@redhat.com \
    --cc=akozyrev@nvidia.com \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=jerinj@marvell.com \
    --cc=kirankumark@marvell.com \
    --cc=konstantin.ananyev@huawei.com \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=ktraynor@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=ndabilpuram@marvell.com \
    --cc=sivaprasad.tummala@amd.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    --cc=yanzhirun_163@163.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).