From: Changchun Zhang <changchun.zhang@oracle.com>
To: "users@dpdk.org" <users@dpdk.org>, "dev@dpdk.org" <dev@dpdk.org>
Subject: RE: cryto_aesni_mb device data contaminated and causing crash when supporting vdev_scan/vdev_action
Date: Thu, 3 Feb 2022 04:41:03 +0000
Message-ID: <CO1PR10MB4756268C96E9C658BDA01BFF84289@CO1PR10MB4756.namprd10.prod.outlook.com>
In-Reply-To: <CO1PR10MB4756894C418F39B161899CB284279@CO1PR10MB4756.namprd10.prod.outlook.com>
The issue can be resolved by allocating the mb_mgr only in the primary process, in the probe/create function of the aesni_mb PMD.
Disabling the scan/probe is just a workaround. I guess that is why the fix in https://review.spdk.io/gerrit/c/spdk/dpdk/+/1056 was cancelled: DPDK 21.11 redesigned this PMD. The mb_mgr has been removed from the device data and is now set up at each crypto session creation (I wonder whether that degrades performance).
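To make the primary-only allocation concrete, here is a minimal standalone sketch. It is not the actual aesni_mb PMD code: the types and the pmd_create() helper are hypothetical stand-ins, and in a real PMD the check would presumably be rte_eal_process_type() == RTE_PROC_PRIMARY.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the DPDK or intel-ipsec-mb definitions. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

struct mb_mgr { int initialized; };

/* Models device private data that is shared by all processes. */
struct dev_private {
	struct mb_mgr *mb_mgr;
};

/* Probe/create path: only the primary allocates and publishes the manager. */
static int
pmd_create(struct dev_private *priv, enum proc_type type)
{
	assert(priv != NULL);

	if (type != PROC_PRIMARY)
		return 0; /* secondary: attach only, leave shared state untouched */

	priv->mb_mgr = calloc(1, sizeof(*priv->mb_mgr));
	if (priv->mb_mgr == NULL)
		return -1;
	priv->mb_mgr->initialized = 1;
	return 0;
}
```

Calling pmd_create() first as primary and then as secondary on the same dev_private leaves the mb_mgr pointer set by the primary unchanged, which is the property the fix relies on.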
Thanks,
Changchun (Alex)
From: Changchun Zhang [mailto:changchun.zhang@oracle.com]
Sent: Wednesday, February 2, 2022 11:08 AM
To: users@dpdk.org; dev@dpdk.org
Subject: [External] : cryto_aesni_mb device data contaminated and causing crash when supporting vdev_scan/vdev_action
Hi,
Has anyone noticed that the crypto_aesni_mb virtual crypto device crashes on a memory error caused by the scan and probe in the secondary process? Can anyone shed some light on it?
What I encountered is:
In the primary process, the crypto_aesni_mb device is probed and created successfully, and the mb_mgr is set in the device private data. But during packet processing, the application crashes when accessing the mb_mgr. Debugging shows that the mb_mgr address has been changed to an invalid (non-NULL) address. Further digging shows that this memory corruption occurs after vdev_action replies to the scan request.
In the code below, the crash goes away if I either disable sending the message on VDEV_SCAN_REQ or skip processing VDEV_SCAN_ONE. It seems that insert_vdev() in the secondary process triggers another probe and breaks the existing device data?
I also noticed a similar issue that was fixed by this patch, https://review.spdk.io/gerrit/c/spdk/dpdk/+/1056, but the patch was cancelled. It reported a similar memory issue found during scanning and probing in the secondary process.
static int
vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
{
	struct rte_vdev_device *dev;
	struct rte_mp_msg mp_resp;
	struct vdev_param *ou = (struct vdev_param *)&mp_resp.param;
	const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
	const char *devname;
	int num;
	int ret;

	strlcpy(mp_resp.name, VDEV_MP_KEY, sizeof(mp_resp.name));
	mp_resp.len_param = sizeof(*ou);
	mp_resp.num_fds = 0;

	switch (in->type) {
	case VDEV_SCAN_REQ:
		/* local debug line; devname is not assigned yet here, so it is not printed */
		VDEV_LOG(INFO, "changczh skip vdev");
		ou->type = VDEV_SCAN_ONE;
		ou->num = 1;
		num = 0;

		/* send one VDEV_SCAN_ONE message per named vdev */
		rte_spinlock_recursive_lock(&vdev_device_list_lock);
		TAILQ_FOREACH(dev, &vdev_device_list, next) {
			devname = rte_vdev_device_name(dev);
			if (strlen(devname) == 0) {
				VDEV_LOG(INFO, "vdev with no name is not sent");
				continue;
			}
			VDEV_LOG(INFO, "send vdev, %s", devname);
			strlcpy(ou->name, devname, RTE_DEV_NAME_MAX_LEN);
			if (rte_mp_sendmsg(&mp_resp) < 0)
				VDEV_LOG(ERR, "send vdev, %s, failed, %s",
					 devname, strerror(rte_errno));
			num++;
		}
		rte_spinlock_recursive_unlock(&vdev_device_list_lock);

		/* then reply to the requester with the number of vdevs sent */
		ou->type = VDEV_SCAN_REP;
		ou->num = num;
		if (rte_mp_reply(&mp_resp, peer) < 0)
			VDEV_LOG(ERR, "Failed to reply a scan request");
		break;
	case VDEV_SCAN_ONE:
		VDEV_LOG(INFO, "receive vdev, %s", in->name);
		ret = insert_vdev(in->name, NULL, NULL, false);
		if (ret == -EEXIST)
			VDEV_LOG(DEBUG, "device already exist, %s", in->name);
		else if (ret < 0)
			VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
		break;
	default:
		VDEV_LOG(ERR, "vdev cannot recognize this message");
	}

	return 0;
}
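The suspicion above (a second probe overwriting the existing device data) can be illustrated with a small standalone model. This is only a sketch of the suspected failure mode, not a confirmed root cause and not DPDK code: shared_dev_data, unguarded_probe() and the pid argument are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the DPDK or intel-ipsec-mb definitions. */
struct mb_mgr { int owner_pid; };

/* Models device data visible to both primary and secondary. */
struct shared_dev_data { struct mb_mgr *mb_mgr; };

/*
 * Probe without a process-type check: every caller allocates its own
 * manager and stores the pointer in the shared device data, replacing
 * whatever an earlier probe stored there.
 */
static struct mb_mgr *
unguarded_probe(struct shared_dev_data *shared, int pid)
{
	struct mb_mgr *mgr = malloc(sizeof(*mgr));

	assert(mgr != NULL);
	mgr->owner_pid = pid;
	shared->mb_mgr = mgr; /* overwrites the pointer set by a previous probe */
	return mgr;
}
```

If the probe runs once for the primary and once more for the secondary, the shared mb_mgr ends up holding the second caller's pointer. In a real multi-process setup that address would only be meaningful inside the secondary, which would match a non-NULL but invalid pointer seen from the primary.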
Thanks,
Alex