From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0353BA054F; Mon, 15 Mar 2021 18:17:42 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E1BC624276C; Mon, 15 Mar 2021 18:17:41 +0100 (CET) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by mails.dpdk.org (Postfix) with ESMTP id CD231242752 for ; Mon, 15 Mar 2021 18:17:39 +0100 (CET) IronPort-SDR: 31oeNANY6iwa9aiYVX5MdJDeIhnBh70srk6TGjK9W3A+KG7e51UrGbpyqAvnBZM2i47+cCGJhV /bnYOT2HaChw== X-IronPort-AV: E=McAfee;i="6000,8403,9924"; a="209034337" X-IronPort-AV: E=Sophos;i="5.81,251,1610438400"; d="scan'208";a="209034337" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2021 10:17:32 -0700 IronPort-SDR: fWNyC/AjtLko6CwEPSfFY0VprMbG3VKtWucS3HeH45F5f2XUeQakGFg1QSbdFKaCRqmWsd58g9 KONJa1Prja9g== X-IronPort-AV: E=Sophos;i="5.81,251,1610438400"; d="scan'208";a="449432783" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.252.23.61]) ([10.252.23.61]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2021 10:17:31 -0700 To: Elad Nachman Cc: iryzhov@nfware.com, stephen@networkplumber.org, dev@dpdk.org, Dan Gora References: <20201126144613.4986-1-eladv6@gmail.com> <20210225143239.14220-1-eladv6@gmail.com> <20210225143239.14220-2-eladv6@gmail.com> From: Ferruh Yigit X-User: ferruhy Message-ID: Date: Mon, 15 Mar 2021 17:17:29 +0000 MIME-Version: 1.0 In-Reply-To: <20210225143239.14220-2-eladv6@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 2/25/2021 2:32 PM, Elad Nachman wrote: > This part of the series includes my fixes for the issues reported > by Ferruh and Igor (and Igor comments for v3 of the patch) > on top of part 1 of the patch series: > > A. KNI sync lock is being locked while rtnl is held. > If two threads are calling kni_net_process_request() , > then the first one will take the sync lock, release rtnl lock then sleep. > The second thread will try to lock sync lock while holding rtnl. > The first thread will wake, and try to lock rtnl, resulting in a deadlock. > The remedy is to release rtnl before locking the KNI sync lock. > Since in between nothing is accessing Linux network-wise, > no rtnl locking is needed. > > B. There is a race condition in __dev_close_many() processing the > close_list while the application terminates. > It looks like if two vEth devices are terminating, > and one releases the rtnl lock, the other takes it, > updating the close_list in an unstable state, > causing the close_list to become a circular linked list, > hence list_for_each_entry() will endlessly loop inside > __dev_close_many() . > Since the description for the original patch indicate the > original motivation was bringing the device up, > I have changed kni_net_process_request() to hold the rtnl mutex > in case of bringing the device down since this is the path called > from __dev_close_many() , causing the corruption of the close_list. > In order to prevent deadlock in Mellanox device in this case, the > code has been modified not to wait for user-space while holding > the rtnl lock. > Instead, after the request has been sent, all locks are relinquished > and the function exits immediately with return code of zero (success). > > To summarize: > request != interface down : unlock rtnl, send request to user-space, > wait for response, send the response error code to caller in user-space. > > request == interface down: send request to user-space, return immediately > with error code of 0 (success) to user-space. > > Signed-off-by: Elad Nachman > > > --- > v4: > * for if down case, send asynchronously with rtnl locked and without > wait, returning immediately to avoid both kernel race conditions > and deadlock in user-space > v3: > * Include original patch and new patch as a series of patch, added a > comment to the new patch > v2: > * rebuild the patch as increment from patch 64106 > * fix comment and blank lines > --- > kernel/linux/kni/kni_net.c | 41 +++++++++++++++++++++++++++------ > lib/librte_kni/rte_kni.c | 7 ++++-- > lib/librte_kni/rte_kni_common.h | 1 + > 3 files changed, 40 insertions(+), 9 deletions(-) > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > index f0b6e9a8d..ba991802b 100644 > --- a/kernel/linux/kni/kni_net.c > +++ b/kernel/linux/kni/kni_net.c > @@ -110,12 +110,34 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req) > void *resp_va; > uint32_t num; > int ret_val; > + int req_is_dev_stop = 0; > + > + /* For configuring the interface to down, > + * rtnl must be held all the way to prevent race condition > + * inside __dev_close_many() between two netdev instances of KNI > + */ > + if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > + req->if_up == 0) > + req_is_dev_stop = 1; Having this request type checks in the 'kni_net_process_request()' function looks like hack. Since adding a new field into the "struct rte_kni_request", that can be a more generic 'asnyc' field, and the requested function, like 'kni_net_release()' can set it to support async requests. And can you please separate the function to add a more generic async request support on its patch, which should do: - add new 'asnyc' field to "struct rte_kni_request" - in 'kni_net_process_request()', if 'req->async' set, do not wait for response - in library, 'lib/librte_kni/rte_kni.c', in 'rte_kni_handle_request()' function, if the request is async don't put the response (These are already done in this patch) Overall it can be three patch set: 1) Function parameter change 2) Add generic async request support (with documentation update) 3) rtnl unlock and make 'kni_net_release()' request async (actual fix) (We can discuss more if to make 'kni_net_release()' async with a kernel parameter or not) What do you think, does it make sense?