From: Gregory Etelson <getelson@nvidia.com>
To: <dev@dpdk.org>
Cc: <getelson@nvidia.com>, <matan@nvidia.com>, <rasland@nvidia.com>,
<stable@dpdk.org>, Viacheslav Ovsiienko <viacheslavo@nvidia.com>,
"Anatoly Burakov" <anatoly.burakov@intel.com>,
Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Subject: [PATCH] common/mlx5: fix shared mempool subscription
Date: Thu, 3 Nov 2022 12:44:27 +0200
Message-ID: <20221103104427.1677-1-getelson@nvidia.com>

The MLX5 PMD counted each mempool subscribe invocation and expected
the mempool subscription to be deleted only after that counter
dropped to 0. However, the current PMD design unsubscribes mempool
callbacks only once.
As a result, the PMD destroyed mlx5_common_device but kept the
shared Rx subscription callback. EAL later tried to invoke that
callback and crashed.

The patch removes the mempool subscription counter.
The PMD registers the mempool subscription only once. An attempt
to register an existing subscription fails with EEXIST, which the
PMD treats as success.
Also, the PMD now removes the subscription as soon as mempool
unsubscribe is called.

Fixes: 8ad97e4b3215 ("common/mlx5: fix multi-process mempool registration")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
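Reader's note, not part of the commit: the failure mode described in the
commit message can be reproduced with a small standalone model. This is
only a sketch under stated assumptions: the callback list rejects a
duplicate (callback, argument) pair with EEXIST, as the commit message
says, and every `toy_*` name below is hypothetical, not a DPDK or mlx5 API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_CB 8

typedef void (*toy_event_cb)(void *arg);

/* Hypothetical stand-in for the EAL mempool event callback list. */
struct toy_cb_entry {
	toy_event_cb cb;
	void *arg;
	int used;
};

static struct toy_cb_entry toy_cb_list[TOY_MAX_CB];
static int toy_errno; /* Plays the role of rte_errno in this model. */

/* Return 0 on success, -1 with toy_errno set on failure. */
static int
toy_cb_register(toy_event_cb cb, void *arg)
{
	int i, free_slot = -1;

	assert(cb != NULL);
	for (i = 0; i < TOY_MAX_CB; i++) {
		if (toy_cb_list[i].used && toy_cb_list[i].cb == cb &&
		    toy_cb_list[i].arg == arg) {
			toy_errno = EEXIST; /* Duplicate (cb, arg) pair. */
			return -1;
		}
		if (!toy_cb_list[i].used && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0) {
		toy_errno = ENOMEM;
		return -1;
	}
	toy_cb_list[free_slot] = (struct toy_cb_entry){ cb, arg, 1 };
	return 0;
}

static void
toy_cb_unregister(toy_event_cb cb, void *arg)
{
	int i;

	for (i = 0; i < TOY_MAX_CB; i++)
		if (toy_cb_list[i].used && toy_cb_list[i].cb == cb &&
		    toy_cb_list[i].arg == arg)
			toy_cb_list[i].used = 0;
}

/* Number of callbacks still registered. */
static int
toy_cb_count(void)
{
	int i, n = 0;

	for (i = 0; i < TOY_MAX_CB; i++)
		n += toy_cb_list[i].used;
	return n;
}

struct toy_dev { unsigned int cb_reg_n; };

static void toy_event(void *arg) { (void)arg; }

/* Old scheme: every subscribe call bumps a counter. */
static int
toy_subscribe_counted(struct toy_dev *dev)
{
	int ret = toy_cb_register(toy_event, dev);

	if (ret != 0 && toy_errno != EEXIST)
		return ret;
	dev->cb_reg_n++;
	return 0;
}

/* Old scheme: the callback is dropped only when the counter hits 0. */
static void
toy_unsubscribe_counted(struct toy_dev *dev)
{
	if (--dev->cb_reg_n > 0)
		return; /* Callback stays registered. */
	toy_cb_unregister(toy_event, dev);
}

/* New scheme: register once, treat EEXIST as success. */
static int
toy_subscribe_once(struct toy_dev *dev)
{
	int ret = toy_cb_register(toy_event, dev);

	if (ret != 0 && toy_errno == EEXIST)
		ret = 0;
	return ret;
}

/* New scheme: the single unsubscribe call always removes the callback. */
static void
toy_unsubscribe_once(struct toy_dev *dev)
{
	toy_cb_unregister(toy_event, dev);
}
```

In this model, two calls to toy_subscribe_counted() followed by one
toy_unsubscribe_counted() leave toy_cb_count() at 1, which corresponds to
the stale callback that outlived the device. The same sequence with the
`_once` variants leaves the list empty.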
drivers/common/mlx5/mlx5_common.c | 22 +++++++++++-----------
drivers/common/mlx5/mlx5_common_mr.c | 1 -
drivers/common/mlx5/mlx5_common_mr.h | 1 -
3 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index bf22c0694d..0ad14a48c7 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -577,6 +577,11 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
}
}
+/**
+ * Primary and secondary processes share the `cdev` pointer.
+ * Callback addresses are local to each process.
+ * Therefore, each process can register private callbacks.
+ */
int
mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
{
@@ -588,14 +593,13 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
/* Callback for this device may be already registered. */
ret = rte_mempool_event_callback_register(mlx5_dev_mempool_event_cb,
cdev);
- if (ret != 0 && rte_errno != EEXIST)
- goto exit;
- __atomic_add_fetch(&cdev->mr_scache.mempool_cb_reg_n, 1,
- __ATOMIC_ACQUIRE);
/* Register mempools only once for this device. */
- if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ if (ret == 0 && rte_eal_process_type() == RTE_PROC_PRIMARY) {
rte_mempool_walk(mlx5_dev_mempool_register_cb, cdev);
- ret = 0;
+ goto exit;
+ }
+ if (ret != 0 && rte_errno == EEXIST)
+ ret = 0;
exit:
rte_rwlock_write_unlock(&cdev->mr_scache.mprwlock);
return ret;
@@ -604,15 +608,11 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
static void
mlx5_dev_mempool_unsubscribe(struct mlx5_common_device *cdev)
{
- uint32_t mempool_cb_reg_n;
int ret;
+ MLX5_ASSERT(cdev->dev != NULL);
if (!cdev->config.mr_mempool_reg_en)
return;
- mempool_cb_reg_n = __atomic_sub_fetch(&cdev->mr_scache.mempool_cb_reg_n,
- 1, __ATOMIC_RELEASE);
- if (mempool_cb_reg_n > 0)
- return;
/* Stop watching for mempool events and unregister all mempools. */
ret = rte_mempool_event_callback_unregister(mlx5_dev_mempool_event_cb,
cdev);
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 1d54102b54..0e1d2434ab 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1138,7 +1138,6 @@ mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)
&share_cache->dereg_mr_cb);
rte_rwlock_init(&share_cache->rwlock);
rte_rwlock_init(&share_cache->mprwlock);
- share_cache->mempool_cb_reg_n = 0;
/* Initialize B-tree and allocate memory for global MR cache table. */
return mlx5_mr_btree_init(&share_cache->cache,
MLX5_MR_BTREE_CACHE_N * 2, socket);
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index f774ccbf33..13eb350980 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -81,7 +81,6 @@ struct mlx5_mr_share_cache {
uint32_t dev_gen; /* Generation number to flush local caches. */
rte_rwlock_t rwlock; /* MR cache Lock. */
rte_rwlock_t mprwlock; /* Mempool Registration Lock. */
- uint32_t mempool_cb_reg_n; /* Mempool event callback registrants. */
struct mlx5_mr_btree cache; /* Global MR cache table. */
struct mlx5_mr_list mr_list; /* Registered MR list. */
struct mlx5_mr_list mr_free_list; /* Freed MR list. */
--
2.34.1