* [dpdk-dev] [PATCH] kni: fix kernel deadlock due to async changes
@ 2021-10-11 6:35 Sahithi Singam
2021-11-23 9:40 ` Ferruh Yigit
0 siblings, 1 reply; 2+ messages in thread
From: Sahithi Singam @ 2021-10-11 6:35 UTC (permalink / raw)
To: ferruh.yigit; +Cc: dev, eladv6
From: Sahithi Singam <sahithi.singam@oracle.com>
The async user request changes cause a kernel deadlock when used with Linux kernel versions >= 5.12.
Starting with Linux kernel 5.12, dev_set_mac_address_user() takes a new global semaphore, dev_addr_sem, which is acquired and released together with rtnl_lock whenever a MAC address set request is received from userspace.
When a MAC address set request is received on a KNI interface, the KNI code releases rtnl_lock before sending the request to userspace, but it does not release the dev_addr_sem semaphore. After receiving the response, it tries to take rtnl_lock again. This behavior was added as part of the async user request changes to fix a kernel deadlock with bifurcated devices.
This results in a deadlock: KNI releases rtnl_lock while still holding the semaphore, so a MAC address set request on another device can acquire rtnl_lock and then wait for dev_addr_sem, which is held by the current device, while the current device in turn waits for rtnl_lock.
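The interleaving described above can be replayed with a small userspace model. This is only a sketch: `toy_lock`, `PATH_A`, `PATH_B` and `kni_lock_order_would_deadlock()` are hypothetical names invented for illustration, and the two toy locks merely stand in for rtnl_lock and dev_addr_sem. No kernel API is used.

```c
#include <stdbool.h>

/* Hypothetical owners: A is the MAC set on the KNI device,
 * B is a MAC set on some other device. */
enum owner { NONE = 0, PATH_A, PATH_B };

/* Toy stand-in for a kernel lock; records only who holds it. */
struct toy_lock { enum owner held_by; };

/* Take the lock if it is free; report failure instead of blocking. */
static bool toy_try_acquire(struct toy_lock *l, enum owner who)
{
	if (l->held_by != NONE)
		return false;
	l->held_by = who;
	return true;
}

static void toy_release(struct toy_lock *l)
{
	l->held_by = NONE;
}

/* Replays the sequence from the commit message and returns true
 * when both paths end up waiting on each other. */
bool kni_lock_order_would_deadlock(bool drop_rtnl_while_waiting)
{
	struct toy_lock rtnl = { NONE };
	struct toy_lock dev_addr_sem = { NONE };

	/* Path A: takes rtnl, then dev_addr_sem. */
	toy_try_acquire(&rtnl, PATH_A);
	toy_try_acquire(&dev_addr_sem, PATH_A);

	/* If A keeps rtnl, B cannot get in between: no circular wait. */
	if (!drop_rtnl_while_waiting)
		return false;

	/* Path A: drops rtnl only, still holding dev_addr_sem. */
	toy_release(&rtnl);

	/* Path B: takes rtnl, then needs dev_addr_sem (held by A). */
	bool b_has_rtnl = toy_try_acquire(&rtnl, PATH_B);
	bool b_blocked = b_has_rtnl &&
			 !toy_try_acquire(&dev_addr_sem, PATH_B);

	/* Path A: response arrived, needs rtnl again (held by B). */
	bool a_blocked = !toy_try_acquire(&rtnl, PATH_A);

	return a_blocked && b_blocked;
}
```

Calling kni_lock_order_would_deadlock(true) reports the circular wait; passing false, which models KNI keeping rtnl_lock for the whole request, does not.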
As a solution, enable the async user request changes only when a module parameter is set. This limits the kernel deadlock issue to users running KNI over bifurcated devices with kernel versions >= 5.12.
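The effect of the parameter on kni_net_process_request() can be summarised as two predicates. This is a sketch of the nested checks in the diff, not code from the patch: both helper names are hypothetical, and async_support and req_async are passed as plain ints to keep the example standalone.

```c
#include <stdbool.h>

/* Hypothetical helper: true when the request path drops rtnl_lock
 * (and holds a device reference) while waiting for userspace.
 * Mirrors: if (async_support != 0) { if (req->async == 0) ... } */
bool kni_drops_rtnl_while_waiting(int async_support, int req_async)
{
	return async_support != 0 && req_async == 0;
}

/* Hypothetical helper: true when the request path reports success
 * immediately without waiting for a userspace response.
 * Mirrors: if (async_support != 0) { if (req->async != 0) ... } */
bool kni_returns_without_waiting(int async_support, int req_async)
{
	return async_support != 0 && req_async != 0;
}
```

With async_support left at its default of 0, both predicates are false, so rtnl_lock stays held for the whole request and every request waits for its response.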
Bugzilla ID: 816
Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
Cc: eladv6@gmail.com
Signed-off-by: Sahithi Singam <sahithi.singam@oracle.com>
---
kernel/linux/kni/kni_dev.h | 3 +++
kernel/linux/kni/kni_misc.c | 11 +++++++++++
kernel/linux/kni/kni_net.c | 46 ++++++++++++++++++++++++++-------------------
3 files changed, 41 insertions(+), 19 deletions(-)
diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index c15da311..ea9d23a 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -34,6 +34,9 @@
/* Default carrier state for created KNI network interfaces */
extern uint32_t kni_dflt_carrier;
+/* Asynchronous userspace request support */
+extern int async_support;
+
/**
* A structure describing the private information for a kni device.
*/
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 2b464c4..685067d 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -41,6 +41,9 @@
static char *carrier;
uint32_t kni_dflt_carrier;
+/* Asynchronous userspace request support */
+int async_support;
+
#define KNI_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
static int kni_net_id;
@@ -659,3 +662,11 @@ struct kni_net {
"\t\ton Interfaces will be created with carrier state set to on.\n"
"\t\t"
);
+
+module_param(async_support, int, 0);
+MODULE_PARM_DESC(async_support,
+"Support KNI async user request (default=0):\n"
+"\t\t0 Async user request not supported.\n"
+"\t\tother Async user request supported.\n"
+"\t\t"
+);
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 611719b..664f4b7 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -111,14 +111,16 @@
 	uint32_t num;
 	int ret_val;
-	ASSERT_RTNL();
+	if (async_support != 0) {
+		ASSERT_RTNL();
-	/* If we need to wait and RTNL mutex is held
-	 * drop the mutex and hold reference to keep device
-	 */
-	if (req->async == 0) {
-		dev_hold(dev);
-		rtnl_unlock();
+		/* If we need to wait and RTNL mutex is held
+		 * drop the mutex and hold reference to keep device
+		 */
+		if (req->async == 0) {
+			dev_hold(dev);
+			rtnl_unlock();
+		}
 	}
 	mutex_lock(&kni->sync_lock);
@@ -132,12 +134,15 @@
 		goto fail;
 	}
-	/* No result available since request is handled
-	 * asynchronously. set response to success.
-	 */
-	if (req->async != 0) {
-		req->result = 0;
-		goto async;
+	if (async_support != 0) {
+		/* No result available since request is handled
+		 * asynchronously. set response to success.
+		 */
+		if (req->async != 0) {
+			req->result = 0;
+			ret = 0;
+			goto fail;
+		}
 	}
 	ret_val = wait_event_interruptible_timeout(kni->wq,
@@ -155,14 +160,15 @@
 	}
 	memcpy(req, kni->sync_kva, sizeof(struct rte_kni_request));
-async:
 	ret = 0;
 fail:
 	mutex_unlock(&kni->sync_lock);
-	if (req->async == 0) {
-		rtnl_lock();
-		dev_put(dev);
+	if (async_support != 0) {
+		if (req->async == 0) {
+			rtnl_lock();
+			dev_put(dev);
+		}
 	}
 	return ret;
 }
@@ -207,8 +213,10 @@
 	/* Setting if_up to 0 means down */
 	req.if_up = 0;
-	/* request async because of the deadlock problem */
-	req.async = 1;
+	if (async_support != 0) {
+		/* request async because of the deadlock problem */
+		req.async = 1;
+	}
 	ret = kni_net_process_request(dev, &req);
--
1.8.3.1
* Re: [PATCH] kni: fix kernel deadlock due to async changes
2021-10-11 6:35 [dpdk-dev] [PATCH] kni: fix kernel deadlock due to async changes Sahithi Singam
@ 2021-11-23 9:40 ` Ferruh Yigit
0 siblings, 0 replies; 2+ messages in thread
From: Ferruh Yigit @ 2021-11-23 9:40 UTC (permalink / raw)
To: Sahithi Singam; +Cc: dev, eladv6, Igor Ryzhov, Thomas Monjalon
On 10/11/2021 7:35 AM, Sahithi Singam wrote:
> From: Sahithi Singam <sahithi.singam@oracle.com>
>
> Async user request changes resulted in a kernel deadlock when used with linux kernel version>= 5.12.
>
> Starting from linux kernel version 5.12, a new global semaphore dev_addr_sem was introduced in dev_set_mac_address_user() function that should be acquired and released along with rtnl_lock when a mac address set request was received from userspace.
>
> When a mac address set request is received on KNI interface, before sending request to userspace, kni code is releasing rtnl_lock without releasing dev_addr_sem semaphore. After receiving a response it is again trying to hold rtnl_lock. These changes were added as part of async user request changes to fix a kernel deadlock with bifurcated devices.
>
> This code is resulting in deadlock as kni is just releasing rtnl_lock without releasing semaphore while mac address set request on some other device could have acquired rtnl_lock and could be waiting for dev_addr_sem held by the current device.
>
> As a solution, support async user request changes based on a module parameter. This will limit kernel deadlock issue to users using KNI over bifurcated devices with kernel versions >= 5.12.
>
> Bugzilla ID: 816
>
> Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
>
> Cc: eladv6@gmail.com
>
> Signed-off-by: Sahithi Singam <sahithi.singam@oracle.com>
>
Hi Sahithi,
Since the patch was sent in HTML format, it was not detected by patchwork
and we missed it.
Can you please check whether this patch is different from the one I sent
for the same purpose:
https://patches.dpdk.org/project/dpdk/patch/20211008235830.127167-1-ferruh.yigit@intel.com/
If they are the same, would you be OK to continue with the one above?