From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 258B1A0546; Fri, 19 Feb 2021 19:42:02 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D483016096C; Fri, 19 Feb 2021 19:42:01 +0100 (CET) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by mails.dpdk.org (Postfix) with ESMTP id 7B4F216089B for ; Fri, 19 Feb 2021 19:42:00 +0100 (CET) IronPort-SDR: SeFa9qHmsgIJK4ckLyoXE9tZGgIv/v/0cNMPL9mehzFRz5Zer+yKxkSmLe1uCn9PTB84pwldbI celc3tkOrfIg== X-IronPort-AV: E=McAfee;i="6000,8403,9900"; a="181376427" X-IronPort-AV: E=Sophos;i="5.81,189,1610438400"; d="scan'208";a="181376427" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2021 10:41:59 -0800 IronPort-SDR: rZ1OwyzYSO3qpy+Ft577HeiRSG84mM/jWGxwbeR/HnSALU2XYskCJ2a65HSYgv1DTp3O9cH/g4 ULghEzrBGp0g== X-IronPort-AV: E=Sophos;i="5.81,189,1610438400"; d="scan'208";a="387098419" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.213.251.55]) ([10.213.251.55]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2021 10:41:58 -0800 To: Elad Nachman Cc: dev@dpdk.org, Igor Ryzhov , Stephen Hemminger References: <20201126144613.4986-1-eladv6@gmail.com> From: Ferruh Yigit X-User: ferruhy Message-ID: <9f7284a8-4158-c246-c329-bfec27f543a5@intel.com> Date: Fri, 19 Feb 2021 18:41:54 +0000 MIME-Version: 1.0 In-Reply-To: <20201126144613.4986-1-eladv6@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH] kni: fix rtnl deadlocks and race conditions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 11/26/2020 2:46 PM, Elad Nachman wrote: > This patch leverages on Stephen Hemminger's 64106 patch from Dec 2019, > and fixes the issues reported by Ferruh and Igor: > > A. KNI sync lock is being locked while rtnl is held. > If two threads are calling kni_net_process_request() , > then the first one wil take the sync lock, release rtnl lock then sleep. > The second thread will try to lock sync lock while holding rtnl. > The first thread will wake, and try to lock rtnl, resulting in a deadlock. > The remedy is to release rtnl before locking the KNI sync lock. > Since in between nothing is accessing Linux network-wise, > no rtnl locking is needed. Hi Elad, Thanks for explanation, that clarifies the issue. Also I confirm I don't see the hang, at least as much as I test. > > B. There is a race condition in __dev_close_many() processing the > close_list while the application terminates. > It looks like if two vEth devices are terminating, > and one releases the rtnl lock, the other takes it, > updating the close_list in an unstable state, > causing the close_list to become a circular linked list, > hence list_for_each_entry() will endlessly loop inside > __dev_close_many() . > Since the description for the original patch indicate the > original motivation was bringing the device up, > I have changed kni_net_process_request() to hold the rtnl mutex > in case of bringing the device down since this is the path called > from __dev_close_many() , causing the corruption of the close_list. > I can't reproduce this case, I see the protection in the code, but better to get confirmation from Igor. Overall the issue seems calling a function pointed by 'rte_kni_ops' which requires to acquire the rtnl lock. So I wonder if this can't be handled in the ops function, by processing the request asynchronously, like recording the request, return from 'rte_kni_ops', and process the request afterwards? I assume the application we mention is not kni sample application. > > > Signed-off-by: Elad Nachman > --- > kernel/linux/kni/kni_net.c | 47 +++++++++++++++++++++++++------------- > 1 file changed, 31 insertions(+), 16 deletions(-) > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > index 4b752083d..cf5b0845d 100644 > --- a/kernel/linux/kni/kni_net.c > +++ b/kernel/linux/kni/kni_net.c > @@ -17,6 +17,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -102,18 +103,26 @@ get_data_kva(struct kni_dev *kni, void *pkt_kva) > * It can be called to process the request. > */ > static int > -kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > +kni_net_process_request(struct net_device *dev, struct rte_kni_request *req) > { > + struct kni_dev *kni = netdev_priv(dev); > int ret = -1; > void *resp_va; > uint32_t num; > int ret_val; > + int req_is_dev_stop = 0; > > - if (!kni || !req) { > - pr_err("No kni instance or request\n"); > - return -EINVAL; > - } > + if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > + req->if_up == 0) > + req_is_dev_stop = 1; > > + ASSERT_RTNL(); > + > + if (!req_is_dev_stop) { > + dev_hold(dev); > + rtnl_unlock(); > + } > + > mutex_lock(&kni->sync_lock); > > /* Construct data */ > @@ -125,8 +134,13 @@ kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > goto fail; > } > > + /* Since we need to wait and RTNL mutex is held > + * drop the mutex and hold refernce to keep device > + */ > + Comment seems left here, need to go up. s/refernce/reference > ret_val = wait_event_interruptible_timeout(kni->wq, > kni_fifo_count(kni->resp_q), 3 * HZ); > + > if (signal_pending(current) || ret_val <= 0) { > ret = -ETIME; > goto fail; > @@ -144,6 +158,13 @@ kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > > fail: > mutex_unlock(&kni->sync_lock); > + > + extra empty line > + if (!req_is_dev_stop) { > + rtnl_lock(); > + dev_put(dev); > + } > + > return ret; > } > > @@ -155,7 +176,6 @@ kni_net_open(struct net_device *dev) > { > int ret; > struct rte_kni_request req; > - struct kni_dev *kni = netdev_priv(dev); > > netif_start_queue(dev); > if (kni_dflt_carrier == 1) > @@ -168,7 +188,7 @@ kni_net_open(struct net_device *dev) > > /* Setting if_up to non-zero means up */ > req.if_up = 1; > - ret = kni_net_process_request(kni, &req); > + ret = kni_net_process_request(dev, &req); > Althoug it is not soo confusing, these lines and following ones are noise for this patch, they are just for 'kni_net_process_request' paramter change. What do you think do the 'kni_net_process_request' parameter change in first patch, and fix the issue in second, this way second patch can contain only the actual changes required for fix.