From: Thomas Monjalon
To: Gaëtan Rivet
Cc: Ferruh Yigit, Ophir Munk, dev@dpdk.org, Olga Shern, Matan Azrad
Date: Thu, 18 Jan 2018 23:22:51 +0100
Message-ID: <16685617.u0SECMPC3f@xps>
In-Reply-To: <32c10eb9-fa51-a409-4720-6a92c3b97398@intel.com>
References: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com> <20171023083612.GK3596@bidouze.vm.6wind.com> <32c10eb9-fa51-a409-4720-6a92c3b97398@intel.com>
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v3] net/failsafe: fix calling device during RMV events

29/11/2017 20:17, Ferruh Yigit:
> >>> On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote:
> >>>> This commit prevents control path operations from failing after a sub
> >>>> device removal.
> >>>>
> >>>> Following are the failure steps:
> >>>> 1. The physical device is removed due to change in one of PF
> >>>>    parameters (e.g. MTU)
> >>>> 2. The interrupt thread flags the device
> >>>> 3. Within 2 seconds Interrupt thread initializes the actual device
> >>>>    removal, then every 2 seconds it tries to re-sync (plug in) the
> >>>>    device. The trials fail as long as VF parameter mismatches the PF
> >>>>    parameter.
> >>>> 4. A control thread initiates a control operation on failsafe which
> >>>>    initiates this operation on the device.
> >>>> 5. A race condition occurs between the control thread and interrupt
> >>>>    thread when accessing the device data structures.
> >>>>
> >>>> This commit prevents the race condition in step 5. Before this commit
> >>>> if a device was removed and then a control thread operation was
> >>>> initiated on failsafe - in some cases failsafe called the sub device
> >>>> operation instead of avoiding it. Such cases could lead to operations
> >>>> failures.
[...]
> > Reminder of this patch remaining from previous release.

Gaetan, what is the decision for this possible race condition?
Can we try to fix it in 18.02?
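
To make the question concrete, here is a minimal sketch of one possible
guard for step 5 of the quoted description. It is not taken from the v3
patch or from the failsafe sources. Every name in it (struct sub_device,
the removed flag, sub_device_ctrl(), set_mtu()) is hypothetical, and it
assumes a per-sub-device mutex shared by the interrupt thread and the
control thread. Returning -ENODEV for a removed device is also only an
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical sub-device state; NOT the real failsafe structures. */
struct sub_device {
	pthread_mutex_t lock; /* shared by control and interrupt threads */
	bool removed;         /* set by the interrupt thread on an RMV event */
	int mtu;              /* example of device data touched by a control op */
};

/* Hypothetical control operation signature. */
typedef int (*ctrl_op_t)(struct sub_device *sdev, int arg);

static void sub_device_init(struct sub_device *sdev)
{
	pthread_mutex_init(&sdev->lock, NULL);
	sdev->removed = false;
	sdev->mtu = 1500;
}

/* Interrupt-thread side (step 2): flag the device while holding the lock. */
static void sub_device_mark_removed(struct sub_device *sdev)
{
	pthread_mutex_lock(&sdev->lock);
	sdev->removed = true;
	pthread_mutex_unlock(&sdev->lock);
}

/*
 * Control-thread side (step 4): test the flag and call the sub-device
 * operation under the same lock, so the removal cannot slip in between
 * the check and the call.
 */
static int sub_device_ctrl(struct sub_device *sdev, ctrl_op_t op, int arg)
{
	int ret;

	assert(op != NULL);
	pthread_mutex_lock(&sdev->lock);
	if (sdev->removed)
		ret = -ENODEV; /* assumed policy: avoid the sub-device call */
	else
		ret = op(sdev, arg);
	pthread_mutex_unlock(&sdev->lock);
	return ret;
}

/* Example control operation: change the MTU of the sub-device. */
static int set_mtu(struct sub_device *sdev, int mtu)
{
	sdev->mtu = mtu;
	return 0;
}
```

With this sketch, sub_device_ctrl(&sdev, set_mtu, 9000) reaches the device
only while it is not flagged. Once sub_device_mark_removed() has run, the
same call returns -ENODEV and leaves the device data untouched. Whether a
lock or a lighter state check is acceptable on the control path is part of
the decision I am asking about.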