DPDK patches and discussions
* [dpdk-dev] [PATCH] kni: fix kernel deadlock due to async changes
@ 2021-10-11  6:35 Sahithi Singam
  2021-11-23  9:40 ` Ferruh Yigit
  0 siblings, 1 reply; 2+ messages in thread
From: Sahithi Singam @ 2021-10-11  6:35 UTC
  To: ferruh.yigit; +Cc: dev, eladv6

From: Sahithi Singam <sahithi.singam@oracle.com>

The async user request changes result in a kernel deadlock when used with Linux kernel versions >= 5.12.

Starting with Linux kernel version 5.12, a new global semaphore, dev_addr_sem, was introduced in the dev_set_mac_address_user() function. It must be acquired and released along with rtnl_lock when a MAC address set request is received from userspace.

When a MAC address set request is received on a KNI interface, the KNI code releases rtnl_lock before sending the request to userspace, without releasing the dev_addr_sem semaphore. After receiving a response, it tries to take rtnl_lock again. This behavior was added as part of the async user request changes to fix a kernel deadlock with bifurcated devices.

This results in a deadlock: KNI releases only rtnl_lock, not the semaphore, so a MAC address set request on some other device can acquire rtnl_lock and then wait for dev_addr_sem, which is still held on behalf of the current device, while KNI in turn waits for rtnl_lock.
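
The interleaving described above can be replayed in a small userspace model. This is an illustration only and not part of the patch: toy_lock, toy_trylock(), toy_unlock() and replay_mac_set_interleaving() are hypothetical names, the "locks" merely record an owner, and the acquisition order (rtnl_lock first, then dev_addr_sem) is an assumption about the kernel's MAC address set path.

```c
#include <stdbool.h>

/* Toy model only: each "lock" just records its owner (NOBODY = free). */
enum { NOBODY = 0, KNI_PATH = 1, OTHER_DEV = 2 };

struct toy_lock {
	int owner;
};

/* Returns false where the real kernel primitive would block. */
static bool toy_trylock(struct toy_lock *l, int who)
{
	if (l->owner != NOBODY)
		return false;
	l->owner = who;
	return true;
}

static void toy_unlock(struct toy_lock *l, int who)
{
	if (l->owner == who)
		l->owner = NOBODY;
}

/*
 * Replays the interleaving from the commit message.
 * Returns true when both paths end up blocked on each other.
 */
static bool replay_mac_set_interleaving(bool kni_drops_rtnl)
{
	struct toy_lock rtnl = { NOBODY };
	struct toy_lock dev_addr_sem = { NOBODY };
	bool other_has_rtnl, other_blocked_on_sem, kni_blocked_on_rtnl;

	/* KNI path: MAC set request holds rtnl and dev_addr_sem (assumed order). */
	toy_trylock(&rtnl, KNI_PATH);
	toy_trylock(&dev_addr_sem, KNI_PATH);

	/* Async-request behavior: drop rtnl only, keep dev_addr_sem. */
	if (kni_drops_rtnl)
		toy_unlock(&rtnl, KNI_PATH);

	/* Other device: MAC set request takes rtnl, then wants dev_addr_sem. */
	other_has_rtnl = toy_trylock(&rtnl, OTHER_DEV);
	other_blocked_on_sem = other_has_rtnl &&
			       !toy_trylock(&dev_addr_sem, OTHER_DEV);

	/* KNI path: response arrived, re-take rtnl if it was dropped. */
	kni_blocked_on_rtnl = kni_drops_rtnl && !toy_trylock(&rtnl, KNI_PATH);

	return other_blocked_on_sem && kni_blocked_on_rtnl;
}
```

In this model, replay_mac_set_interleaving(true) reports the mutual wait, while replay_mac_set_interleaving(false) does not, because the other device never gets past rtnl while the KNI path still holds it.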



As a solution, make the async user request behavior conditional on a module parameter. This limits the kernel deadlock to users running KNI over bifurcated devices with kernel versions >= 5.12.
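
The gating can be summarised as two predicates. This is a minimal sketch, assuming the behavior shown in the kni_net.c hunks of this patch; kni_drops_rtnl() and kni_skips_wait() are hypothetical helper names that do not exist in the KNI sources, and async_support is passed as an argument here rather than read from the module parameter.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers modelling kni_net_process_request() after this
 * patch. async_support stands in for the module parameter (default 0).
 */

/*
 * rtnl_lock is dropped around the wait only for synchronous requests,
 * and only when the module was loaded with async_support != 0.
 */
static bool kni_drops_rtnl(int async_support, int req_async)
{
	return async_support != 0 && req_async == 0;
}

/*
 * The request returns immediately with result 0, without waiting for
 * userspace, only for async requests with async_support != 0.
 */
static bool kni_skips_wait(int async_support, int req_async)
{
	return async_support != 0 && req_async != 0;
}
```

With the default async_support=0 neither predicate holds, so rtnl_lock stays held for the whole request and the dev_addr_sem ordering problem does not arise; loading the module with a non-zero async_support restores the earlier behavior for bifurcated devices.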



Bugzilla ID: 816
Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
Cc: eladv6@gmail.com

Signed-off-by: Sahithi Singam <sahithi.singam@oracle.com>

---

 kernel/linux/kni/kni_dev.h  |  3 +++
 kernel/linux/kni/kni_misc.c | 11 +++++++++++
 kernel/linux/kni/kni_net.c  | 46 ++++++++++++++++++++++++++-------------------
 3 files changed, 41 insertions(+), 19 deletions(-)



diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index c15da311..ea9d23a 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -34,6 +34,9 @@
 /* Default carrier state for created KNI network interfaces */
 extern uint32_t kni_dflt_carrier;
 
+/* Asynchronous userspace request support */
+extern int async_support;
+
 /**
  * A structure describing the private information for a kni device.
  */
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 2b464c4..685067d 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -41,6 +41,9 @@
 static char *carrier;
 uint32_t kni_dflt_carrier;
 
+/* Asynchronous userspace request support */
+int async_support;
+
 #define KNI_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
 
 static int kni_net_id;
@@ -659,3 +662,11 @@ struct kni_net {
 "\t\ton    Interfaces will be created with carrier state set to on.\n"
 "\t\t"
 );
+
+module_param(async_support, int, 0);
+MODULE_PARM_DESC(async_support,
+"Support KNI async user request (default=0):\n"
+"\t\t0       Async user request not supported.\n"
+"\t\tother   Async user request supported.\n"
+"\t\t"
+);
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 611719b..664f4b7 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -111,14 +111,16 @@
 	uint32_t num;
 	int ret_val;
 
-	ASSERT_RTNL();
+	if (async_support != 0) {
+		ASSERT_RTNL();
 
-	/* If we need to wait and RTNL mutex is held
-	 * drop the mutex and hold reference to keep device
-	 */
-	if (req->async == 0) {
-		dev_hold(dev);
-		rtnl_unlock();
+		/* If we need to wait and RTNL mutex is held
+		 * drop the mutex and hold reference to keep device
+		 */
+		if (req->async == 0) {
+			dev_hold(dev);
+			rtnl_unlock();
+		}
 	}
 
 	mutex_lock(&kni->sync_lock);
@@ -132,12 +134,15 @@
 		goto fail;
 	}
 
-	/* No result available since request is handled
-	 * asynchronously. set response to success.
-	 */
-	if (req->async != 0) {
-		req->result = 0;
-		goto async;
+	if (async_support != 0) {
+		/* No result available since request is handled
+		 * asynchronously. set response to success.
+		 */
+		if (req->async != 0) {
+			req->result = 0;
+			ret = 0;
+			goto fail;
+		}
 	}
 
 	ret_val = wait_event_interruptible_timeout(kni->wq,
@@ -155,14 +160,15 @@
 	}
 
 	memcpy(req, kni->sync_kva, sizeof(struct rte_kni_request));
-async:
 	ret = 0;
 
 fail:
 	mutex_unlock(&kni->sync_lock);
-	if (req->async == 0) {
-		rtnl_lock();
-		dev_put(dev);
+	if (async_support != 0) {
+		if (req->async == 0) {
+			rtnl_lock();
+			dev_put(dev);
+		}
 	}
 	return ret;
 }
@@ -207,8 +213,10 @@
 	/* Setting if_up to 0 means down */
 	req.if_up = 0;
 
-	/* request async because of the deadlock problem */
-	req.async = 1;
+	if (async_support != 0) {
+		/* request async because of the deadlock problem */
+		req.async = 1;
+	}
 
 	ret = kni_net_process_request(dev, &req);
 
--
1.8.3.1





* Re: [PATCH] kni: fix kernel deadlock due to async changes
  2021-10-11  6:35 [dpdk-dev] [PATCH] kni: fix kernel deadlock due to async changes Sahithi Singam
@ 2021-11-23  9:40 ` Ferruh Yigit
  0 siblings, 0 replies; 2+ messages in thread
From: Ferruh Yigit @ 2021-11-23  9:40 UTC
  To: Sahithi Singam; +Cc: dev, eladv6, Igor Ryzhov, Thomas Monjalon

On 10/11/2021 7:35 AM, Sahithi Singam wrote:
> From: Sahithi Singam <sahithi.singam@oracle.com>
>
> The async user request changes result in a kernel deadlock when used with Linux kernel versions >= 5.12.
>
> Starting with Linux kernel version 5.12, a new global semaphore, dev_addr_sem, was introduced in the dev_set_mac_address_user() function. It must be acquired and released along with rtnl_lock when a MAC address set request is received from userspace.
>
> When a MAC address set request is received on a KNI interface, the KNI code releases rtnl_lock before sending the request to userspace, without releasing the dev_addr_sem semaphore. After receiving a response, it tries to take rtnl_lock again. This behavior was added as part of the async user request changes to fix a kernel deadlock with bifurcated devices.
>
> This results in a deadlock: KNI releases only rtnl_lock, not the semaphore, so a MAC address set request on some other device can acquire rtnl_lock and then wait for dev_addr_sem, which is still held on behalf of the current device, while KNI in turn waits for rtnl_lock.
>
> As a solution, make the async user request behavior conditional on a module parameter. This limits the kernel deadlock to users running KNI over bifurcated devices with kernel versions >= 5.12.
>
> Bugzilla ID: 816
> Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")
> Cc: eladv6@gmail.com
>
> Signed-off-by: Sahithi Singam <sahithi.singam@oracle.com>
> 

Hi Sahithi,

Since the patch is in HTML format, it was not detected by patchwork
and we missed it.

Can you please check whether this patch is different from the one I sent
for the same purpose:
https://patches.dpdk.org/project/dpdk/patch/20211008235830.127167-1-ferruh.yigit@intel.com/

If they are the same, would you be OK to continue with the one above?
