From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B57DDA0C52 for ; Tue, 23 Nov 2021 17:22:24 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A6321410E6; Tue, 23 Nov 2021 17:22:24 +0100 (CET) Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) by mails.dpdk.org (Postfix) with ESMTP id 86C1E40040 for ; Tue, 23 Nov 2021 17:22:22 +0100 (CET) Received: by mail-pl1-f179.google.com with SMTP id b11so17482227pld.12 for ; Tue, 23 Nov 2021 08:22:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oLljhRiZZic5g6heErgd3ZQve1gIKEEDoNbsOKnmS0o=; b=q8MRKbqDa0coO5gEURt8oHnv7M3zSclznltHXOAfV1ng/jcflsA/ZAI0dxHngw8WCp kCchRicO++T7KAvf15QsKYt/cwzEpkcprEqi9N0BN2gox3Pym8/9rTOjwqd2uiNmOI1b r/QZ40YFt+gUXwY/k/e3bFt9/XYBEeMvnSzkjm41UVyB3cAyOziw2KZEX/Im/T0R1CcB DThLIhmdsotPophYy7VuJCrdjdhXcqLeY9kVdr9J2XOX90W/BcAMEZ8h0GQlJsFVvT9r l3Tr5ilACnEfBjsl2XANe437vfw2+JM7Xa+YbiUKsXdoeuyHELgj2Hd7rBTio0pJ0F0q UF/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oLljhRiZZic5g6heErgd3ZQve1gIKEEDoNbsOKnmS0o=; b=SaqwLKUWV1mItLy6KHQJksaURK8GEGyRwX8/A+S1KXarL8vzH4jwCydn+nm7iOhs8j wpeW1UaAit7cSXTISn1mwhYNHUGHNNcNqDXDT7I58WtafMzwW6pcBo/JMAbqxgbx7Vg3 zpmrq6vHf4a9YoWs7R1SvGogvatF3s6Fgj0+03jJ28tXxAFAm3pMSbbX9VQVqUS0QdVl LFyG8deAO/04GMuVy4Wgj4INlIWbmWPfB45IyuOs2BEqZh1mmHZb/OFAmI7y+33zmUYM m2Ht2KQRP1AmDAlGFh7TeUXGojlLaAO2yQQkHHH3nXtymggV0NwGIE6ZMIWB/4StduQZ 05kQ== X-Gm-Message-State: AOAM530e2v/T3FfIr0NqzAhgvgLOobuk9crxUBb8Sc3mVUG9BSVCkgbU JGzhkZol6Sti0ZrO077RircnkQ== X-Google-Smtp-Source: ABdhPJxZAe1bQOyOFEIV28xQysDMePn1TY8mB1Lv2JxCxwCBScSltOf4kZAjbW3mAUJStkRYzPKh9A== X-Received: by 2002:a17:902:aa86:b0:145:90c:f4aa with SMTP id d6-20020a170902aa8600b00145090cf4aamr8394318plr.79.1637684541534; Tue, 23 Nov 2021 08:22:21 -0800 (PST) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id mh1sm1802453pjb.6.2021.11.23.08.22.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Nov 2021 08:22:21 -0800 (PST) Date: Tue, 23 Nov 2021 08:22:18 -0800 From: Stephen Hemminger To: Ferruh Yigit Cc: Thomas Monjalon , David Marchand , Elad Nachman , , , "Igor Ryzhov" , Eric Christian Subject: Re: [PATCH] kni: restrict bifurcated device support Message-ID: <20211123082218.11b894b0@hermes.local> In-Reply-To: References: <20211008235830.127167-1-ferruh.yigit@intel.com> <20211008190335.1fdc8f4a@hermes.local> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org On Tue, 23 Nov 2021 09:54:01 +0000 Ferruh Yigit wrote: > On 10/9/2021 3:03 AM, Stephen Hemminger wrote: > > On Sat, 9 Oct 2021 00:58:30 +0100 > > Ferruh Yigit wrote: > > > >> To enable bifurcated device support, rtnl_lock is released before calling > >> userspace callbacks and asynchronous requests are enabled. > >> > >> But these changes caused more issues, like bug #809, #816. To reduce the > >> scope of the problems, the bifurcated device support related changes are > >> only enabled when it is requested explicitly with new 'enable_bifurcated' > >> module parameter. > >> And bifurcated device support is disabled by default. > >> > >> So the bifurcated device related problems are isolated and they can be > >> fixed without impacting all use cases. > >> > >> Bugzilla ID: 816 > >> Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device") > >> Cc: stable@dpdk.org > >> > >> Signed-off-by: Ferruh Yigit > > > > Calling userspace with semaphore held is still risky and buggy. > > There is no guarantee that the userspace DPDK application will be well behaved. > > And if it is not, the spinning holding RTNL would break any other network management > > functions in the kernel. > > > > Hi Stephen, > > Because of what you described above, we tried the option that releases the RTNL > lock before userspace callback, > That caused a deadlock in the ifdown, and we add a workaround for it. > > But now we are facing with two more issues because of the rtnl lock release, > - Bugzilla ID: 816, MAC set cause kernel deadlock > - Some requests are overwritten (because of the workaround we put for ifdown) > > This patch just converts the default behavior to what it was previously. > Releasing rtnl lock still supported with the module param, but I think it > is not good idea to make it default while it is know that it is buggy. > > @Thomas, @David, > Can it be possible to get this patch for -rc4? Since it has potential > to cause a deadlock in kernel as it is. > > I can send a new version with documentation update. Is it possible for userspace to choose the correct behavior instead of module option. Users will do it wrong. > > > These are the kind of problems that make me think it there should be a > > big "DO NOT USE THIS" onto KNI. Maybe make it print a big nasty message > > (see kernel VFIO without IOMMU description) or mark kernel as tainted?? > > > > See: https://fedoraproject.org/wiki/KernelStagingPolicy > > > > Something like: > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index 611719b5ee27..d47fc6133cbe 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -838,6 +838,14 @@ kni_net_init(struct net_device *dev) > > dev->header_ops = &kni_net_header_ops; > > dev->ethtool_ops = &kni_net_ethtool_ops; > > dev->watchdog_timeo = WD_TIMEOUT; > > + > > + /* > > + * KNI is unsafe since it requires calling userspace to do > > + * control operations. And the overall quality according to > > + * kernel standards is the same as devices in staging. > > + */ > > + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK); > > + netdev_warn(dev, "Adding kernel taint for KNI because it is not safe\n"); > > I am for discussing above separate from this patch, since this patch > restores the behavior that exist since KNI module exists. Any user of KNI will already get tainted because KNI is out of tree driver.