From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BA819A054F; Tue, 16 Mar 2021 19:42:12 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7E5D240A4B; Tue, 16 Mar 2021 19:42:12 +0100 (CET) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by mails.dpdk.org (Postfix) with ESMTP id ABBFC406A2 for ; Tue, 16 Mar 2021 19:42:10 +0100 (CET) IronPort-SDR: giMUUm3ZutP0N3hxXHdVIN6EU1sY88EErrncw8Ww1I6VCRHI8eCNAxo3J8F8tP2iNJTapZcjjf n76tragj3TSQ== X-IronPort-AV: E=McAfee;i="6000,8403,9925"; a="176449451" X-IronPort-AV: E=Sophos;i="5.81,254,1610438400"; d="scan'208";a="176449451" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2021 11:42:09 -0700 IronPort-SDR: tybDPOfmuRIyKMqfNnRQMQyegDWyiXCaLq8RAmuaQSHvkmrhnyqsJZsgtWwq0FJMdPE/f8SA5R OE5DSHAiFQqQ== X-IronPort-AV: E=Sophos;i="5.81,254,1610438400"; d="scan'208";a="449815050" Received: from apongaro-mobl.ger.corp.intel.com (HELO [10.252.24.201]) ([10.252.24.201]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2021 11:42:06 -0700 To: Elad Nachman Cc: Igor Ryzhov , Stephen Hemminger , dev@dpdk.org, Dan Gora References: <20201126144613.4986-1-eladv6@gmail.com> <20210225143239.14220-1-eladv6@gmail.com> <20210225143239.14220-2-eladv6@gmail.com> From: Ferruh Yigit X-User: ferruhy Message-ID: <8eaaa648-9e34-4bc8-2f80-374a8b81d25b@intel.com> Date: Tue, 16 Mar 2021 18:42:01 +0000 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 3/16/2021 6:35 PM, Elad Nachman wrote: > Hi, > > Owing to my current development schedule and obligations, I see no opportunity > to make this set of changes in the near future. > I can do on top of your work if you don't mind? > Sorry, > > Elad. > > בתאריך יום ב׳, 15 במרץ 2021, 19:17, מאת Ferruh Yigit ‏ >: > > On 2/25/2021 2:32 PM, Elad Nachman wrote: > > This part of the series includes my fixes for the issues reported > > by Ferruh and Igor (and Igor comments for v3 of the patch) > > on top of part 1 of the patch series: > > > > A. KNI sync lock is being locked while rtnl is held. > > If two threads are calling kni_net_process_request() , > > then the first one will take the sync lock, release rtnl lock then sleep. > > The second thread will try to lock sync lock while holding rtnl. > > The first thread will wake, and try to lock rtnl, resulting in a deadlock. > > The remedy is to release rtnl before locking the KNI sync lock. > > Since in between nothing is accessing Linux network-wise, > > no rtnl locking is needed. > > > > B. There is a race condition in __dev_close_many() processing the > > close_list while the application terminates. > > It looks like if two vEth devices are terminating, > > and one releases the rtnl lock, the other takes it, > > updating the close_list in an unstable state, > > causing the close_list to become a circular linked list, > > hence list_for_each_entry() will endlessly loop inside > > __dev_close_many() . > > Since the description for the original patch indicate the > > original motivation was bringing the device up, > > I have changed kni_net_process_request() to hold the rtnl mutex > > in case of bringing the device down since this is the path called > > from __dev_close_many() , causing the corruption of the close_list. > > In order to prevent deadlock in Mellanox device in this case, the > > code has been modified not to wait for user-space while holding > > the rtnl lock. > > Instead, after the request has been sent, all locks are relinquished > > and the function exits immediately with return code of zero (success). > > > > To summarize: > > request != interface down : unlock rtnl, send request to user-space, > > wait for response, send the response error code to caller in user-space. > > > > request == interface down: send request to user-space, return immediately > > with error code of 0 (success) to user-space. > > > > Signed-off-by: Elad Nachman > > > > > > > --- > > v4: > > * for if down case, send asynchronously with rtnl locked and without > >    wait, returning immediately to avoid both kernel race conditions > >    and deadlock in user-space > > v3: > > * Include original patch and new patch as a series of patch, added a > >    comment to the new patch > > v2: > > * rebuild the patch as increment from patch 64106 > > * fix comment and blank lines > > --- > >   kernel/linux/kni/kni_net.c      | 41 +++++++++++++++++++++++++++------ > >   lib/librte_kni/rte_kni.c        |  7 ++++-- > >   lib/librte_kni/rte_kni_common.h |  1 + > >   3 files changed, 40 insertions(+), 9 deletions(-) > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index f0b6e9a8d..ba991802b 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -110,12 +110,34 @@ kni_net_process_request(struct net_device *dev, > struct rte_kni_request *req) > >       void *resp_va; > >       uint32_t num; > >       int ret_val; > > +     int req_is_dev_stop = 0; > > + > > +     /* For configuring the interface to down, > > +      * rtnl must be held all the way to prevent race condition > > +      * inside __dev_close_many() between two netdev instances of KNI > > +      */ > > +     if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > > +                     req->if_up == 0) > > +             req_is_dev_stop = 1; > > Having this request type checks in the 'kni_net_process_request()' function > looks like hack. > Since adding a new field into the "struct rte_kni_request", that can be a more > generic 'asnyc' field, and the requested function, like 'kni_net_release()' can > set it to support async requests. > > And can you please separate the function to add a more generic async request > support on its patch, which should do: > - add new 'asnyc' field to "struct rte_kni_request" > - in 'kni_net_process_request()', if 'req->async' set, do not wait for response > - in library, 'lib/librte_kni/rte_kni.c', in 'rte_kni_handle_request()' > function, if the request is async don't put the response > (These are already done in this patch) > > Overall it can be three patch set: > 1) Function parameter change > 2) Add generic async request support (with documentation update) > 3) rtnl unlock and make 'kni_net_release()' request async (actual fix) > (We can discuss more if to make 'kni_net_release()' async with a kernel > parameter or not) > > What do you think, does it make sense? >