From: Bing Zhao <bingz@nvidia.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: Slava Ovsiienko <viacheslavo@nvidia.com>,
Matan Azrad <matan@nvidia.com>, dev <dev@dpdk.org>,
"NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>,
Dariusz Sosnowski <dsosnowski@nvidia.com>,
Suanming Mou <suanmingm@nvidia.com>,
Raslan Darawsheh <rasland@nvidia.com>
Subject: RE: [PATCH v2 2/3] net/mlx5: add new devarg for Tx queue consecutive memory
Date: Thu, 26 Jun 2025 13:18:18 +0000 [thread overview]
Message-ID: <PH7PR12MB6905F9250E91DEE1DE0BF4E8D07AA@PH7PR12MB6905.namprd12.prod.outlook.com> (raw)
In-Reply-To: <CAOaVG15O-jY-YJoFpytUVQATJQnDe+saDHD1DQZJD58xFNqhfg@mail.gmail.com>
Hi Stephen,
Thanks for your review and comments. I will add a detailed description of the new devarg to our mlx5.rst file.
Indeed, after further review and an internal discussion with our datapath experts, we would like to change the devarg a bit so that it is more than a 0 / 1 chicken bit.
Memory access footprints and ordering may impact performance, and in our perf tests we found that the alignment of the queue start address matters. The minimum starting-address alignment is the system page size, but it can be larger.
So the new devarg will take the log2 value of the alignment applied to all queues' starting addresses. Different CPU architectures / generations with different LLC designs can then try different alignments to get the best performance, and this stays configurable without rebuilding the application from source. WDYT?
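To make the proposal concrete, here is a minimal, hypothetical C sketch (not the actual mlx5 code) of how a devarg holding log2(alignment) could be parsed and applied to a queue start address. The function names, the clamp bounds and the example value are illustrative assumptions only:

/*
 * Hypothetical sketch: parse a devarg holding log2(alignment) and round
 * a queue start address up to that alignment.  Not the mlx5 code; all
 * names and bounds are assumptions for illustration.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define LOG_ALGN_MIN 12u /* assume 4 KiB system pages as the floor */
#define LOG_ALGN_MAX 21u /* assume 2 MiB as an arbitrary ceiling */

static uint32_t
parse_log_algn(const char *val)
{
	uint32_t log_algn = (uint32_t)strtoul(val, NULL, 0);

	/* Clamp out-of-range values instead of failing the probe. */
	if (log_algn < LOG_ALGN_MIN)
		log_algn = LOG_ALGN_MIN;
	else if (log_algn > LOG_ALGN_MAX)
		log_algn = LOG_ALGN_MAX;
	return log_algn;
}

static uintptr_t
align_queue_addr(uintptr_t base, uint32_t log_algn)
{
	uintptr_t algn = (uintptr_t)1 << log_algn;

	/* Round the queue start address up to the requested alignment. */
	return (base + algn - 1) & ~(algn - 1);
}

int
main(void)
{
	uint32_t log_algn = parse_log_algn("16"); /* 64 KiB alignment */
	uintptr_t addr = align_queue_addr((uintptr_t)0x7f001234u, log_algn);

	printf("aligned queue start: 0x%" PRIxPTR "\n", addr);
	return 0;
}

The point is that the same binary could run with, say, log2 = 12 (page size) on one platform and 16 or higher on another, purely via the devarg.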
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Tuesday, June 24, 2025 8:02 PM
To: Bing Zhao <bingz@nvidia.com>
Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>; dev <dev@dpdk.org>; NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Dariusz Sosnowski <dsosnowski@nvidia.com>; Suanming Mou <suanmingm@nvidia.com>; Raslan Darawsheh <rasland@nvidia.com>
Subject: Re: [PATCH v2 2/3] net/mlx5: add new devarg for Tx queue consecutive memory
Why is this needed? Need some documentation. DPDK needs less, not more, nerd knobs.
On Mon, Jun 23, 2025, 14:35 Bing Zhao <bingz@nvidia.com<mailto:bingz@nvidia.com>> wrote:
This commit introduces a new device argument to control memory
allocation for Tx queues.
By default, 'txq_consec_mem' is 1, so all Tx queues use a
consecutive memory area and a single MR.
Signed-off-by: Bing Zhao <bingz@nvidia.com<mailto:bingz@nvidia.com>>
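For illustration, assuming the devarg keeps the 0 / 1 form posted here, it would be passed through the standard mlx5 devargs string; the PCI address below is a placeholder:

    dpdk-testpmd -a 0000:08:00.2,txq_consec_mem=0 -- -i

Setting the value to 0 disables the feature, presumably keeping the previous per-queue allocation behavior.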
---
drivers/net/mlx5/mlx5.c | 14 ++++++++++++++
drivers/net/mlx5/mlx5.h | 1 +
2 files changed, 15 insertions(+)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index b4bd43aae2..f5beebd2fd 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -185,6 +185,9 @@
/* Device parameter to control representor matching in ingress/egress flows with HWS. */
#define MLX5_REPR_MATCHING_EN "repr_matching_en"
+/* Use a consecutive memory area and a single MR for all Tx queues. */
+#define MLX5_TXQ_CONSEC_MEM "txq_consec_mem"
+
/* Shared memory between primary and secondary processes. */
struct mlx5_shared_data *mlx5_shared_data;
@@ -1447,6 +1450,8 @@ mlx5_dev_args_check_handler(const char *key, const char *val, void *opaque)
config->cnt_svc.cycle_time = tmp;
} else if (strcmp(MLX5_REPR_MATCHING_EN, key) == 0) {
config->repr_matching = !!tmp;
+ } else if (strcmp(MLX5_TXQ_CONSEC_MEM, key) == 0) {
+ config->txq_consec_mem = !!tmp;
}
return 0;
}
@@ -1486,6 +1491,7 @@ mlx5_shared_dev_ctx_args_config(struct mlx5_dev_ctx_shared *sh,
MLX5_HWS_CNT_SERVICE_CORE,
MLX5_HWS_CNT_CYCLE_TIME,
MLX5_REPR_MATCHING_EN,
+ MLX5_TXQ_CONSEC_MEM,
NULL,
};
int ret = 0;
@@ -1501,6 +1507,7 @@ mlx5_shared_dev_ctx_args_config(struct mlx5_dev_ctx_shared *sh,
config->cnt_svc.cycle_time = MLX5_CNT_SVC_CYCLE_TIME_DEFAULT;
config->cnt_svc.service_core = rte_get_main_lcore();
config->repr_matching = 1;
+ config->txq_consec_mem = 1;
if (mkvlist != NULL) {
/* Process parameters. */
ret = mlx5_kvargs_process(mkvlist, params,
@@ -1584,6 +1591,7 @@ mlx5_shared_dev_ctx_args_config(struct mlx5_dev_ctx_shared *sh,
config->allow_duplicate_pattern);
DRV_LOG(DEBUG, "\"fdb_def_rule_en\" is %u.", config->fdb_def_rule);
DRV_LOG(DEBUG, "\"repr_matching_en\" is %u.", config->repr_matching);
+ DRV_LOG(DEBUG, "\"txq_consec_mem\" is %u.", config->txq_consec_mem);
return 0;
}
@@ -3150,6 +3158,12 @@ mlx5_probe_again_args_validate(struct mlx5_common_device *cdev,
sh->ibdev_name);
goto error;
}
+ if (sh->config.txq_consec_mem ^ config->txq_consec_mem) {
+ DRV_LOG(ERR, "\"txq_consec_mem\" "
+ "configuration mismatch for shared %s context.",
+ sh->ibdev_name);
+ goto error;
+ }
mlx5_free(config);
return 0;
error:
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 5695d0f54a..4e0287cbc0 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -393,6 +393,7 @@ struct mlx5_sh_config {
/* Allow/Prevent the duplicate rules pattern. */
uint32_t fdb_def_rule:1; /* Create FDB default jump rule */
uint32_t repr_matching:1; /* Enable implicit vport matching in HWS FDB. */
+ uint32_t txq_consec_mem:1; /* Use consecutive memory and a single MR for all Tx queues. */
};
/* Structure for VF VLAN workaround. */
--
2.34.1
Thread overview: 8+ messages
[not found] <20250623173524.128125-1-bingz@nvidia.com>
2025-06-23 18:34 ` [PATCH v2 0/3] Use consecutive Tx queues' memory Bing Zhao
2025-06-23 18:34 ` [PATCH v2 1/3] net/mlx5: fix the WQE size calculation for Tx queue Bing Zhao
2025-06-23 18:34 ` [PATCH v2 2/3] net/mlx5: add new devarg for Tx queue consecutive memory Bing Zhao
2025-06-24 12:01 ` Stephen Hemminger
2025-06-26 13:18 ` Bing Zhao [this message]
2025-06-26 14:29 ` Stephen Hemminger
2025-06-26 15:21 ` Thomas Monjalon
2025-06-23 18:34 ` [PATCH v2 3/3] net/mlx5: use consecutive memory for all Tx queues Bing Zhao