From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2A8A5A054F; Mon, 1 Mar 2021 21:28:04 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8ADDA4069E; Mon, 1 Mar 2021 21:28:03 +0100 (CET) Received: from mail-io1-f44.google.com (mail-io1-f44.google.com [209.85.166.44]) by mails.dpdk.org (Postfix) with ESMTP id 224C540041 for ; Mon, 1 Mar 2021 21:28:02 +0100 (CET) Received: by mail-io1-f44.google.com with SMTP id k2so13780124ioh.5 for ; Mon, 01 Mar 2021 12:28:02 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=FOD1QlPubnyGaYXjESW7UteWnSy7jgoDjwbbbNJ+0qM=; b=oJi2nrWVSmtq7VP45WSF0z/6x/wdny0iH//q8RUyc1TPP1PAHny++HkYNu6mGk7+x5 U+vuGDFo2yaOYcFYz/7jrkzVJfDGnT964LPC3fSMGIaJhTySl3Ec4Zp3Pb6YUtSIyCDs SPpDbAIMUd6sThGyX52bwcd5zAjrm1eI4NJcbkzsb0kIDgKMtHquI/Z8DaZ/ppawTF5o IZfZHMzhuciP3lkCWL0Pf9Xkp7VRgZt7oJxESmgo7e8gt5HafR0jtyZ09iBUCmmF9eGp bbClfCItwkJZ3ec3825/mkqjjQc+wB+sdUnwMqWGWYfW1aFL+Eao2bIlqgIOdhG+SzY6 WPLQ== X-Gm-Message-State: AOAM531KAMTJ9Uvrab+EIVe/hG08+iq9gDLt+Aew6SZU1VtaoePL3j1Y I18alvZkR045go3XsX1dhw+/jeyKtzDRHPH3GGI= X-Google-Smtp-Source: ABdhPJwmqzqL5QzSx1OcXgdu1+2xxH9ph4TdwEN02GQ2NcWKz0JIMO6tGKsvWXRWtNgrfr0yaMWnpil+BuhihX6b/WE= X-Received: by 2002:a6b:d21a:: with SMTP id q26mr15087699iob.128.1614630481376; Mon, 01 Mar 2021 12:28:01 -0800 (PST) MIME-Version: 1.0 References: <20201126144613.4986-1-eladv6@gmail.com> <20210225143239.14220-1-eladv6@gmail.com> <20210225143239.14220-2-eladv6@gmail.com> <20210226074817.7e9e0a71@hermes.local> In-Reply-To: From: Dan Gora Date: Mon, 1 Mar 2021 17:27:25 -0300 Message-ID: To: Igor Ryzhov Cc: Elad Nachman , Stephen Hemminger , Ferruh Yigit , dev Content-Type: text/plain; charset="UTF-8" Subject: Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi All, Sorry to butt in on this, but I fixed this same issue about 3 years ago in my application, but I was never able to get the changes integrated and eventually just gave up trying. The rule with KNI is: 1) The app should have a separate control thread per rte_kni which just spins calling rte_kni_handle_request(). This ensures that other threads calling rte_kni_XXX functions will always get a response. 2) In order to deal with lockups and timeouts when closing the device, I sent patches which separated the closing process into two steps: rte_kni_release() which would unregister the underlying netdev, then rte_kni_free() which would free the KNI portions of the KNI device. When rte_kni_release() is called the kernel netdev is unregistered and a response is sent back to the application, the control thread calling rte_kni_handle_request() is still running, so the application will still get a response back from the kernel and not lock up, the application then kills the control thread so that rte_kni_handle_request() is not called again, then the application calls rte_kni_free() which frees all of the FIFOs and closes the device. If anyone is interested the patches are probably still floating around patchwork. If not you can check them out here: https://github.com/danielgora/dpdk.git thanks- dan On Mon, Mar 1, 2021 at 5:10 AM Igor Ryzhov wrote: > > Stephen, > > No, I don't have a better proposal, but I think it is not correct to change > the behavior of KNI (making link down without a real response). > Even though we know that communicating with userspace under rtnl_lock is a > bad idea, it works as it is for many years already. > > Elad, > > I agree with you that KNI should be removed from the main tree if it is not > possible to fix this __dev_close_many issue. > There were discussions about this multiple times already, but no one is > working on this AFAIK. > Last time the discussion was a month ago: > https://www.mail-archive.com/dev@dpdk.org/msg196033.html > > Igor > > On Fri, Feb 26, 2021 at 8:43 PM Elad Nachman wrote: > > > The way the kernel handles its locks and lists for the dev close many > > path, there is no way you can go around this with rtnl unlocked : > > " > > > > There is a race condition in __dev_close_many() processing the > > close_list while the application terminates. > > It looks like if two vEth devices are terminating, > > and one releases the rtnl lock, the other takes it, > > updating the close_list in an unstable state, > > causing the close_list to become a circular linked list, > > hence list_for_each_entry() will endlessly loop inside > > __dev_close_many() . > > > > " > > And I don't expect David Miller will bend the kernel networking for DPDK > > or KNI. > > > > But - Stephen - if you can personally convince David to accept a > > kernel patch which will separate the close_list locking mechanism to a > > separate (RCU?) lock, then I can introduce first a patch to the kernel > > which will add a lock for the close_list, this way rtnl can be > > unlocked for the if down case. > > > > After that kernel patch, your original patch + relocation of the sync > > mutex locking will do the job . > >lockups > > Otherwise, rtnl has to be kept locked all of the way for the if down > > case in order to prevent corruption causing a circular linked list out > > of the close_list, causing a hang in the kernel. > >lockups > > Currently, the rtnl lock is the only thing keeping the close_list from > > corruption. > > > > If you doubt rtnl cannot be unlocked for dev close path, you can > > consult David for his opinion, as I think it is critical to understand > > what the kernel can or cannot do, or expects to be done before we can > > unlock its locks as we wish inside rte_kni.ko . > > > > Otherwise, if we are still in disagreement on how to patch this set of > > problems, I think the responsible way around it is to completely > > remove kni from the main dpdk tree and move it to dpdk-kmods > > repository. > > > > I know BSD style open-source does not carry legal responsibility from > > the developers, but I think when a bunch of developers know a piece of > > code is highly buggy, they should not leave it for countless new users > > to bounce their head desperately against, if they cannot agree on a > > correct way to solve the bunch of problems, of which I think we all > > agree exist (we just do not agree on the proper solution or patch)... > > > > That's my two cents, > > > > Elad. > > > > On Fri, Feb 26, 2021 at 5:49 PM Stephen Hemminger > > wrote: > > > > > > On Fri, 26 Feb 2021 00:01:01 +0300 > > > Igor Ryzhov wrote: > > > > > > > Hi Elad, > > > > > > > > Thanks for the patch, but this is still NACK from me. > > > > > > > > The only real advantage of KNI over other exceptional-path techniques > > > > like virtio-user is the ability to configure DPDK-managed interfaces > > > > directly > > > > from the kernel using well-known utils like iproute2. A very important > > part > > > > of this is getting responses from the DPDK app and knowing the actual > > > > result of command execution. > > > > If you're making async requests to the application and you don't know > > > > the result, then what's the point of using KNI at all? > > > > > > > > Igor > > > > > > Do you have a better proposal that keeps the request result but does not > > > call userspace with lock held. > > > > > > PS: I also have strong dislike of KNI, as designed it would have been > > rejected > > > by Linux kernel developers. A better solution would be userspace > > version of > > > something like devlink devices. But doing control operations by proxy is > > > a locking nightmare. > >