DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@intel.com>
To: dev@dpdk.org
Cc: stable@dpdk.org, Elad Nachman <eladv6@gmail.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Igor Ryzhov <iryzhov@nfware.com>, Dan Gora <dg@adax.com>
Subject: [dpdk-dev] [PATCH v5 3/3] kni: fix kernel deadlock when using mlx devices
Date: Mon, 29 Mar 2021 15:36:55 +0100
Message-ID: <20210329143655.521750-3-ferruh.yigit@intel.com> (raw)
In-Reply-To: <20210329143655.521750-1-ferruh.yigit@intel.com>

KNI runs userspace callback with rtnl lock held, this is not working
fine with some devices that needs to interact with kernel interface in
the callback, like Mellanox devices.

The solution is releasing the rtnl lock before calling the userspace
callback. But it requires two consideration:

1. The rtnl lock needs to released before 'kni->sync_lock', otherwise it
   causes deadlock with multiple KNI devices, please check below the A.
   for the details of the deadlock condition.

2. When rtnl lock is released for interface down event, it cause a
   regression and deadlock, so can't release the rtnl lock for interface
   down event, please check below B. for the details.

As a solution, interface down event is handled asynchronously and for
all other events rtnl lock is released before processing the callback.

A. KNI sync lock is being locked while rtnl is held.
If two threads are calling kni_net_process_request() ,
then the first one will take the sync lock, release rtnl lock then sleep.
The second thread will try to lock sync lock while holding rtnl.
The first thread will wake, and try to lock rtnl, resulting in a
deadlock.  The remedy is to release rtnl before locking the KNI sync
lock.
Since in between nothing is accessing Linux network-wise, no rtnl
locking is needed.

B. There is a race condition in __dev_close_many() processing the
close_list while the application terminates.
It looks like if two KNI interfaces are terminating,
and one releases the rtnl lock, the other takes it,
updating the close_list in an unstable state,
causing the close_list to become a circular linked list,
hence list_for_each_entry() will endlessly loop inside
__dev_close_many() .

To summarize:
request != interface down : unlock rtnl, send request to user-space,
wait for response, send the response error code to caller in user-space.

request == interface down: send request to user-space, return immediately
with error code of 0 (success) to user-space.

Fixes: 3fc5ca2f6352 ("kni: initial import")
Cc: stable@dpdk.org

Signed-off-by: Elad Nachman <eladv6@gmail.com>
---
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Igor Ryzhov <iryzhov@nfware.com>
Cc: Dan Gora <dg@adax.com>

 #	kernel/linux/kni/kni_net.c.rej
---
 kernel/linux/kni/kni_net.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 6cf99da0dc92..f259327954b2 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -113,6 +113,14 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
 
 	ASSERT_RTNL();
 
+	/* If we need to wait and RTNL mutex is held
+	 * drop the mutex and hold reference to keep device
+	 */
+	if (req->async == 0) {
+		dev_hold(dev);
+		rtnl_unlock();
+	}
+
 	mutex_lock(&kni->sync_lock);
 
 	/* Construct data */
@@ -152,6 +160,10 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
 
 fail:
 	mutex_unlock(&kni->sync_lock);
+	if (req->async == 0) {
+		rtnl_lock();
+		dev_put(dev);
+	}
 	return ret;
 }
 
@@ -194,6 +206,10 @@ kni_net_release(struct net_device *dev)
 
 	/* Setting if_up to 0 means down */
 	req.if_up = 0;
+
+	/* request async because of the deadlock problem */
+	req.async = 1;
+
 	ret = kni_net_process_request(dev, &req);
 
 	return (ret == 0) ? req.result : ret;
-- 
2.30.2


  parent reply	other threads:[~2021-03-29 14:37 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-26 14:46 [dpdk-dev] [PATCH] kni: fix rtnl deadlocks and race conditions Elad Nachman
2021-02-19 18:41 ` Ferruh Yigit
2021-02-21  8:03   ` Elad Nachman
2021-02-22 15:58     ` Ferruh Yigit
2021-02-23 12:05 ` [dpdk-dev] [PATCH V2] kni: fix rtnl deadlocks and race conditions v2 Elad Nachman
2021-02-23 12:53   ` Ferruh Yigit
2021-02-23 13:44 ` [dpdk-dev] [PATCH 1/2] kni: fix rtnl deadlocks and race conditions v3 Elad Nachman
2021-02-23 13:45 ` [dpdk-dev] [PATCH 2/2] " Elad Nachman
2021-02-24 12:49   ` Igor Ryzhov
2021-02-24 13:33     ` Elad Nachman
2021-02-24 14:04       ` Igor Ryzhov
2021-02-24 14:06         ` Elad Nachman
2021-02-24 14:41           ` Igor Ryzhov
2021-02-24 14:56             ` Elad Nachman
2021-02-24 15:18               ` Igor Ryzhov
     [not found]                 ` <CACXF7qkhkzFc-=v=iiBzh2V7rLjk1U34VUfPbNrnYJND_0TKHQ@mail.gmail.com>
2021-02-24 16:31                   ` Igor Ryzhov
2021-02-24 15:54     ` Stephen Hemminger
2021-02-25 14:32 ` [dpdk-dev] [PATCH 1/2] kni: fix kernel deadlock when using mlx devices Elad Nachman
2021-02-25 14:32   ` [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4 Elad Nachman
2021-02-25 21:01     ` Igor Ryzhov
2021-02-26 15:48       ` Stephen Hemminger
2021-02-26 17:43         ` Elad Nachman
2021-03-01  8:10           ` Igor Ryzhov
2021-03-01 16:38             ` Stephen Hemminger
2021-03-15 16:58               ` Ferruh Yigit
2021-03-01 20:27             ` Dan Gora
2021-03-01 21:26               ` Dan Gora
2021-03-02 16:44                 ` Elad Nachman
2021-03-15 17:17     ` Ferruh Yigit
2021-03-16 18:35       ` Elad Nachman
2021-03-16 18:42         ` Ferruh Yigit
2021-03-15 17:17   ` [dpdk-dev] [PATCH 1/2] kni: fix kernel deadlock when using mlx devices Ferruh Yigit
2021-03-29 14:36 ` [dpdk-dev] [PATCH v5 1/3] kni: refactor user request processing Ferruh Yigit
2021-03-29 14:36   ` [dpdk-dev] [PATCH v5 2/3] kni: support async user request Ferruh Yigit
2021-03-29 14:36   ` Ferruh Yigit [this message]
2021-04-09 14:56     ` [dpdk-dev] [dpdk-stable] [PATCH v5 3/3] kni: fix kernel deadlock when using mlx devices Ferruh Yigit
2021-04-12 14:35       ` Elad Nachman
2021-04-20 23:07         ` Thomas Monjalon
2021-04-23  8:41           ` Igor Ryzhov
2021-04-23  8:59             ` Ferruh Yigit
2021-04-23 12:43               ` Igor Ryzhov
2021-04-23 12:58                 ` Igor Ryzhov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210329143655.521750-3-ferruh.yigit@intel.com \
    --to=ferruh.yigit@intel.com \
    --cc=dev@dpdk.org \
    --cc=dg@adax.com \
    --cc=eladv6@gmail.com \
    --cc=iryzhov@nfware.com \
    --cc=stable@dpdk.org \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git