From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id CC588A0547; Wed, 24 Feb 2021 15:04:55 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A04931608D2; Wed, 24 Feb 2021 15:04:55 +0100 (CET) Received: from mail-ej1-f45.google.com (mail-ej1-f45.google.com [209.85.218.45]) by mails.dpdk.org (Postfix) with ESMTP id D048F4069B for ; Wed, 24 Feb 2021 15:04:54 +0100 (CET) Received: by mail-ej1-f45.google.com with SMTP id mm21so2974728ejb.12 for ; Wed, 24 Feb 2021 06:04:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nfware-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=/+NLB28OmmDlU1jwxTDM8m+1bhKxQ6gS0JZ8FkMO0lk=; b=Oftjj0/2LK0LFkfCXb9laG2Ja4ng2zH50jbdkxj/JWWZkf+TVIyafSeWO/Ar09R17y gR76BQuiRzHzfJfj5KZih2Mp+6GMFn43Wedy3J22dOa1NTcOt/sBpSGdDUn6CRWdPmw7 +pDjdGgd7khetwaQn67Hp8ygVCsc8BOkI+8ScPE9rM83aayPRDigJJ5TTQixkFmcqRuX bbKrlDqAoyySiZ0SoHN6QLUDvR5jHTFuvtN4jxzPLkoVk/YGxilK5x3E9MdXFKBL4DDc Y7POxm4XpLlFbkiB7nmwW3Ugipjiaj6ufVRvvFPt2OauFmsoW7MsGXSh0sfPcD7sd9L1 wUYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=/+NLB28OmmDlU1jwxTDM8m+1bhKxQ6gS0JZ8FkMO0lk=; b=Q+vS9aaveJeHYJs3UGaHdmCAie62odgzUlt2nIx4BgGZXsUkGoYhO2cvtnUXY+0q2f DJLdUkdTmItB2xF4oucig83kE00vpnMCpqjZmyLRakt3IubRWMmYkZJi0p289VA1p+Gu VqWnHs80NgejBPvWZTOK8CYQREppgK6Y/cFzy4N+8zLv4HcsIRnPFnBASJ2RzD0DDG3e crfvlS3bBuJSFM6CGN2C/2CeQYBgKxqsr/v83F0iIfHGOVeIBcog6W0jgE+J0vTsp1xa bQHj2SuZCaYfDp5KrP9wimS4Gjwa2yZiqcGELuv0XfIDwTvGrhApftQo3dyRJfZtF6lC zpPw== X-Gm-Message-State: AOAM532toIA4snBU59Leu+eYc51rwp/ClVaQdFdRoLZ8vanZfOSyPNP/ RIEetDguKTcdTxM/fhYYXvFOm9GBE2xY72POMEWWZQ== X-Google-Smtp-Source: ABdhPJxT5Wxt5KhpxlEeBRG0rDrBNH9NTRFwOipteO+ypNgH80lkbygxg/xaiE9kdF0YPuj9GKDz6KvGrICNjdzyBwY= X-Received: by 2002:a17:906:5fc8:: with SMTP id k8mr13676281ejv.104.1614175494465; Wed, 24 Feb 2021 06:04:54 -0800 (PST) MIME-Version: 1.0 References: <20201126144613.4986-1-eladv6@gmail.com> <20210223134504.699-1-eladv6@gmail.com> In-Reply-To: From: Igor Ryzhov Date: Wed, 24 Feb 2021 17:04:42 +0300 Message-ID: To: Elad Nachman Cc: Ferruh Yigit , Stephen Hemminger , dev Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.29 Subject: Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Elad, I understand your point. But the fact that this fix works for you doesn't mean that it will work for all DPDK users. For example, I provided two simple commands: "ip link set up" and "ip link set down". Your fix works for only one of them. For me, this is not a proper fix. It may work for you because you don't disable interfaces, but it will fail for users who do. On Wed, Feb 24, 2021 at 4:33 PM Elad Nachman wrote: > Currently KNI has a lot of issues with deadlocks locking the code, > after this commit, they are gone, and the code runs properly without > crashing. > That was tested with over 100 restarts of the application, which > previously required a hard reset of the board. > > I think this benefit overweights the complication of the code. > > The function is called with rtnl locked because this is how the Linux > kernel is designed to work - it is not designed to work with deferral > to user-space mid-function. > > To fix all such requests you need to reach an agreement with Linux > netdev, which is unlikely. > > Calling user-space can be done asynchronously, as Ferruh asked, but > then you will always have to return success, even on failure, as Linux > kernel does not have a mechanism to asynchronously report on failure > for such system calls. > > IMHO - weighting the non-reporting of failure versus how the code > looks (as it functions perfectly OK), I decided to go with > functionality. > > FYI, > > Elad. > > On Wed, Feb 24, 2021 at 2:50 PM Igor Ryzhov wrote: > > > > This looks more like a hack than an actual fix to me. > > > > After this commit: > > "ip link set up" is sent to the userspace with unlocked rtnl_lock > > "ip link set down" is sent to the userspace with locked rtnl_lock > > > > How is this really fixing anything? IMHO it only complicates the code. > > If talking with userspace under rtnl_lock is a problem, then we should > fix all such requests, not only part of them. > > If it is not a problem, then I don't see any point in merging this. > > > > On Tue, Feb 23, 2021 at 4:45 PM Elad Nachman wrote: > >> > >> This part of the series includes my fixes for the issues reported > >> by Ferruh and Igor on top of part 1 of the patch series: > >> > >> A. KNI sync lock is being locked while rtnl is held. > >> If two threads are calling kni_net_process_request() , > >> then the first one will take the sync lock, release rtnl lock then > sleep. > >> The second thread will try to lock sync lock while holding rtnl. > >> The first thread will wake, and try to lock rtnl, resulting in a > deadlock. > >> The remedy is to release rtnl before locking the KNI sync lock. > >> Since in between nothing is accessing Linux network-wise, > >> no rtnl locking is needed. > >> > >> B. There is a race condition in __dev_close_many() processing the > >> close_list while the application terminates. > >> It looks like if two vEth devices are terminating, > >> and one releases the rtnl lock, the other takes it, > >> updating the close_list in an unstable state, > >> causing the close_list to become a circular linked list, > >> hence list_for_each_entry() will endlessly loop inside > >> __dev_close_many() . > >> Since the description for the original patch indicate the > >> original motivation was bringing the device up, > >> I have changed kni_net_process_request() to hold the rtnl mutex > >> in case of bringing the device down since this is the path called > >> from __dev_close_many() , causing the corruption of the close_list. > >> > >> Signed-off-by: Elad Nachman > >> --- > >> v3: > >> * Include original patch and new patch as a series of patch, added a > >> comment to the new patch > >> v2: > >> * rebuild the patch as increment from patch 64106 > >> * fix comment and blank lines > >> --- > >> kernel/linux/kni/kni_net.c | 29 +++++++++++++++++++++-------- > >> 1 file changed, 21 insertions(+), 8 deletions(-) > >> > >> diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > >> index f0b6e9a8d..017e44812 100644 > >> --- a/kernel/linux/kni/kni_net.c > >> +++ b/kernel/linux/kni/kni_net.c > >> @@ -110,9 +110,26 @@ kni_net_process_request(struct net_device *dev, > struct rte_kni_request *req) > >> void *resp_va; > >> uint32_t num; > >> int ret_val; > >> + int req_is_dev_stop = 0; > >> + > >> + /* For configuring the interface to down, > >> + * rtnl must be held all the way to prevent race condition > >> + * inside __dev_close_many() between two netdev instances of KNI > >> + */ > >> + if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > >> + req->if_up == 0) > >> + req_is_dev_stop = 1; > >> > >> ASSERT_RTNL(); > >> > >> + /* Since we need to wait and RTNL mutex is held > >> + * drop the mutex and hold reference to keep device > >> + */ > >> + if (!req_is_dev_stop) { > >> + dev_hold(dev); > >> + rtnl_unlock(); > >> + } > >> + > >> mutex_lock(&kni->sync_lock); > >> > >> /* Construct data */ > >> @@ -124,16 +141,8 @@ kni_net_process_request(struct net_device *dev, > struct rte_kni_request *req) > >> goto fail; > >> } > >> > >> - /* Since we need to wait and RTNL mutex is held > >> - * drop the mutex and hold refernce to keep device > >> - */ > >> - dev_hold(dev); > >> - rtnl_unlock(); > >> - > >> ret_val = wait_event_interruptible_timeout(kni->wq, > >> kni_fifo_count(kni->resp_q), 3 * HZ); > >> - rtnl_lock(); > >> - dev_put(dev); > >> > >> if (signal_pending(current) || ret_val <= 0) { > >> ret = -ETIME; > >> @@ -152,6 +161,10 @@ kni_net_process_request(struct net_device *dev, > struct rte_kni_request *req) > >> > >> fail: > >> mutex_unlock(&kni->sync_lock); > >> + if (!req_is_dev_stop) { > >> + rtnl_lock(); > >> + dev_put(dev); > >> + } > >> return ret; > >> } > >> > >> -- > >> 2.17.1 > >> >