From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stable-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id D1C99A0540
	for <public@inbox.dpdk.org>; Mon, 27 Jul 2020 19:33:18 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 57B9A1C027;
	Mon, 27 Jul 2020 19:33:18 +0200 (CEST)
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id DFA9F1023;
 Mon, 27 Jul 2020 19:33:14 +0200 (CEST)
IronPort-SDR: 0asNhBOS0RRmB8DbQqpdmslwiQMTd/jn8k+Wb8od7Jk4SRStAuOK/YyNIKTgl6xD38GYcYUMKz
 GDADtJvDNJRA==
X-IronPort-AV: E=McAfee;i="6000,8403,9695"; a="169195744"
X-IronPort-AV: E=Sophos;i="5.75,402,1589266800"; d="scan'208";a="169195744"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 27 Jul 2020 10:33:13 -0700
IronPort-SDR: IrtTWVcmDMXyZ5CLrxdSnaE+DfrsURfvprhdmV5eToDfGM5+Mk7gcnMkQkk9rI2S/316gUi7H7
 FCV3FrEpPyjA==
X-IronPort-AV: E=Sophos;i="5.75,402,1589266800"; d="scan'208";a="434015869"
Received: from fyigit-mobl.ger.corp.intel.com (HELO [10.213.196.62])
 ([10.213.196.62])
 by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 27 Jul 2020 10:33:12 -0700
To: Stephen Hemminger <stephen@networkplumber.org>,
 Thomas Monjalon <thomas@monjalon.net>
Cc: dev@dpdk.org, stable@dpdk.org
References: <20191222175551.17684-1-stephen@networkplumber.org>
 <ffa192d1-99d7-a636-c1bf-7f64dfde91b4@intel.com> <3101970.h16uAIiOU7@xps>
 <20200505171454.00274f10@hermes.lan>
From: Ferruh Yigit <ferruh.yigit@intel.com>
Message-ID: <da21a634-2a1a-7371-4233-97ee80da8248@intel.com>
Date: Mon, 27 Jul 2020 18:33:08 +0100
MIME-Version: 1.0
In-Reply-To: <20200505171454.00274f10@hermes.lan>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] kni: fix kernel deadlock when
 using mlx devices
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
Errors-To: stable-bounces@dpdk.org
Sender: "stable" <stable-bounces@dpdk.org>

On 5/6/2020 1:14 AM, Stephen Hemminger wrote:
> On Wed, 18 Mar 2020 16:17:57 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
>> 17/01/2020 17:43, Ferruh Yigit:
>>> On 12/22/2019 5:55 PM, Stephen Hemminger wrote:  
>>>> This fixes a deadlock when using KNI with bifurcated drivers.
>>>> Bringing kni device up always times out when using Mellanox
>>>> devices.
>>>>
>>>> The kernel KNI driver sends message to userspace to complete
>>>> the request. For the case of bifurcated driver, this may involve
>>>> an additional request to kernel to change state. This request
>>>> would deadlock because KNI was holding the RTNL mutex.
>>>>
>>>> This was a bad design which goes back to the original code.
>>>> A workaround is for KNI driver to drop RTNL while waiting.
>>>> To prevent the device from disappearing while the operation
>>>> is in progress, it needs to hold reference to network device
>>>> while waiting.
>>>>
>>>> As an added benefit, a useless error check can also be removed.
>>>>
>>>> Fixes: 3fc5ca2f6352 ("kni: initial import")
>>>> Cc: stable@dpdk.org
>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>> ---  
>>>
>>> This patch causes a hang on my server. I am not sure what exactly the problem
>>> was, but the kernel log was continuously printing "Cannot send to req_q". Will dig more.  
>>
>> Ferruh, did you have a chance to check what is hanging?
>> Stephen, is there any news on your side?
>>
>>
> 
> It did not hang when I tested it. The bug report is still open
> 

Sorry for the delay. Since I am working remotely, I was worried about losing the
connection to my server; I finally created a virtual environment to test again.

I confirm that the hang is reproducible 100% of the time when two different
processes update the KNI interface, for example when two different processes set
the MTU. Without this patch, this works fine.
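
A minimal sketch of this kind of test, for anyone who wants to try it: it starts
two separate `ip link set ... mtu` processes against the same interface and
reports whether each one finishes. The interface name "vEth0", the MTU values
and the 30 second timeout are illustrative assumptions, and it expects iproute2
to be installed and root privileges. Run it only on a machine you can afford to
hang.

```python
#!/usr/bin/env python3
"""Sketch: change the MTU of one interface from two processes at once."""
import subprocess
import sys


def mtu_command(ifname, mtu):
    """Build the iproute2 command that sets the MTU of ifname."""
    return ["ip", "link", "set", "dev", ifname, "mtu", str(mtu)]


def run_concurrently(commands, timeout=30.0):
    """Start every command as its own process, then wait for each one.

    Returns one entry per command: the exit code, or None if the process
    did not finish within `timeout` seconds (which suggests a hang).
    Processes that time out are left running; one stuck inside the kernel
    usually cannot be killed anyway.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    results = []
    for proc in procs:
        try:
            results.append(proc.wait(timeout=timeout))
        except subprocess.TimeoutExpired:
            results.append(None)
    return results


if __name__ == "__main__":
    # "vEth0" is an assumed KNI interface name; pass the real one as argv[1].
    ifname = sys.argv[1] if len(sys.argv) > 1 else "vEth0"
    # The MTU values are arbitrary; any two valid values should do.
    print(run_concurrently([mtu_command(ifname, 1400),
                            mtu_command(ifname, 1300)]))
```

A `None` entry in the printed list means that command never returned within the
timeout.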

I understand the motivation for the patch, but with this change it is possible
to hang the server, which we can't allow; we need to find another way. Can the
mlx interface update wait for the KNI interface operation to complete?