From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 13113A0C4C; Tue, 23 Nov 2021 17:22:24 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CFA5D40040; Tue, 23 Nov 2021 17:22:23 +0100 (CET) Received: from mail-pj1-f42.google.com (mail-pj1-f42.google.com [209.85.216.42]) by mails.dpdk.org (Postfix) with ESMTP id 732164003C for ; Tue, 23 Nov 2021 17:22:22 +0100 (CET) Received: by mail-pj1-f42.google.com with SMTP id fv9-20020a17090b0e8900b001a6a5ab1392so3364704pjb.1 for ; Tue, 23 Nov 2021 08:22:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oLljhRiZZic5g6heErgd3ZQve1gIKEEDoNbsOKnmS0o=; b=q8MRKbqDa0coO5gEURt8oHnv7M3zSclznltHXOAfV1ng/jcflsA/ZAI0dxHngw8WCp kCchRicO++T7KAvf15QsKYt/cwzEpkcprEqi9N0BN2gox3Pym8/9rTOjwqd2uiNmOI1b r/QZ40YFt+gUXwY/k/e3bFt9/XYBEeMvnSzkjm41UVyB3cAyOziw2KZEX/Im/T0R1CcB DThLIhmdsotPophYy7VuJCrdjdhXcqLeY9kVdr9J2XOX90W/BcAMEZ8h0GQlJsFVvT9r l3Tr5ilACnEfBjsl2XANe437vfw2+JM7Xa+YbiUKsXdoeuyHELgj2Hd7rBTio0pJ0F0q UF/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oLljhRiZZic5g6heErgd3ZQve1gIKEEDoNbsOKnmS0o=; b=NPwKOcI18J69TlU98/6XnCfOibnLOIFU/Z+YVdUOXGoupNb9Ph7wcsaI+ZbO5xumKM QkzlN9eH2qQgdFznRCqJZp2jch6/IsIVnjoH+05hHTG9pQDIDkJC6fUq+6Bzdos2XlOW 586cc8US+ivJCshxrki9v6OBBzkwu5pjYJ1xmIYdLswHqc9gSwnqmrDniudN/2zItn/y cMCZpEVLTBMg4kw3RWAsWWXOflXa1GwU5fohpX7oEYaIcFDqvV9ENHL5e9J2aqTfu5Gi FLujWgDJHQmSEr66vCMZvVMFLEZcwOjvDNIOup+fSxg480WoByVRHVLJq7PBtOfajzJH CoJw== X-Gm-Message-State: AOAM533Geu6LaXhN2FCbRvhZLfqMWF4Por3sEvQx08o0lWbxKqtiuAc+ sU4RDKht/53zrcprOybXI62t6Q== X-Google-Smtp-Source: ABdhPJxZAe1bQOyOFEIV28xQysDMePn1TY8mB1Lv2JxCxwCBScSltOf4kZAjbW3mAUJStkRYzPKh9A== X-Received: by 2002:a17:902:aa86:b0:145:90c:f4aa with SMTP id d6-20020a170902aa8600b00145090cf4aamr8394318plr.79.1637684541534; Tue, 23 Nov 2021 08:22:21 -0800 (PST) Received: from hermes.local (204-195-33-123.wavecable.com. [204.195.33.123]) by smtp.gmail.com with ESMTPSA id mh1sm1802453pjb.6.2021.11.23.08.22.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Nov 2021 08:22:21 -0800 (PST) Date: Tue, 23 Nov 2021 08:22:18 -0800 From: Stephen Hemminger To: Ferruh Yigit Cc: Thomas Monjalon , David Marchand , Elad Nachman , , , "Igor Ryzhov" , Eric Christian Subject: Re: [PATCH] kni: restrict bifurcated device support Message-ID: <20211123082218.11b894b0@hermes.local> In-Reply-To: References: <20211008235830.127167-1-ferruh.yigit@intel.com> <20211008190335.1fdc8f4a@hermes.local> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Tue, 23 Nov 2021 09:54:01 +0000 Ferruh Yigit wrote: > On 10/9/2021 3:03 AM, Stephen Hemminger wrote: > > On Sat, 9 Oct 2021 00:58:30 +0100 > > Ferruh Yigit wrote: > > > >> To enable bifurcated device support, rtnl_lock is released before calling > >> userspace callbacks and asynchronous requests are enabled. > >> > >> But these changes caused more issues, like bug #809, #816. To reduce the > >> scope of the problems, the bifurcated device support related changes are > >> only enabled when it is requested explicitly with new 'enable_bifurcated' > >> module parameter. > >> And bifurcated device support is disabled by default. > >> > >> So the bifurcated device related problems are isolated and they can be > >> fixed without impacting all use cases. > >> > >> Bugzilla ID: 816 > >> Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device") > >> Cc: stable@dpdk.org > >> > >> Signed-off-by: Ferruh Yigit > > > > Calling userspace with semaphore held is still risky and buggy. > > There is no guarantee that the userspace DPDK application will be well behaved. > > And if it is not, the spinning holding RTNL would break any other network management > > functions in the kernel. > > > > Hi Stephen, > > Because of what you described above, we tried the option that releases the RTNL > lock before userspace callback, > That caused a deadlock in the ifdown, and we add a workaround for it. > > But now we are facing with two more issues because of the rtnl lock release, > - Bugzilla ID: 816, MAC set cause kernel deadlock > - Some requests are overwritten (because of the workaround we put for ifdown) > > This patch just converts the default behavior to what it was previously. > Releasing rtnl lock still supported with the module param, but I think it > is not good idea to make it default while it is know that it is buggy. > > @Thomas, @David, > Can it be possible to get this patch for -rc4? Since it has potential > to cause a deadlock in kernel as it is. > > I can send a new version with documentation update. Is it possible for userspace to choose the correct behavior instead of module option. Users will do it wrong. > > > These are the kind of problems that make me think it there should be a > > big "DO NOT USE THIS" onto KNI. Maybe make it print a big nasty message > > (see kernel VFIO without IOMMU description) or mark kernel as tainted?? > > > > See: https://fedoraproject.org/wiki/KernelStagingPolicy > > > > Something like: > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index 611719b5ee27..d47fc6133cbe 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -838,6 +838,14 @@ kni_net_init(struct net_device *dev) > > dev->header_ops = &kni_net_header_ops; > > dev->ethtool_ops = &kni_net_ethtool_ops; > > dev->watchdog_timeo = WD_TIMEOUT; > > + > > + /* > > + * KNI is unsafe since it requires calling userspace to do > > + * control operations. And the overall quality according to > > + * kernel standards is the same as devices in staging. > > + */ > > + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK); > > + netdev_warn(dev, "Adding kernel taint for KNI because it is not safe\n"); > > I am for discussing above separate from this patch, since this patch > restores the behavior that exist since KNI module exists. Any user of KNI will already get tainted because KNI is out of tree driver.