Date: Fri, 19 Jan 2018 00:35:34 +0100
From: Gaëtan Rivet
To: Thomas Monjalon
Cc: Ferruh Yigit, Ophir Munk, dev@dpdk.org, Olga Shern, Matan Azrad
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v3] net/failsafe: fix calling device during RMV events
Message-ID: <20180118233534.ayzy64wzrhskkbdv@bidouze.vm.6wind.com>
In-Reply-To: <16685617.u0SECMPC3f@xps>
References: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com> <20171023083612.GK3596@bidouze.vm.6wind.com> <32c10eb9-fa51-a409-4720-6a92c3b97398@intel.com> <16685617.u0SECMPC3f@xps>
List-Id: DPDK patches and discussions

On Thu, Jan 18, 2018 at 11:22:51PM +0100, Thomas Monjalon wrote:
> 29/11/2017 20:17, Ferruh Yigit:
> > >>> On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote:
> > >>>> This commit prevents control path operations from failing after a sub
> > >>>> device removal.
> > >>>>
> > >>>> Following are the failure steps:
> > >>>> 1. The physical device is removed due to change in one of PF
> > >>>>    parameters (e.g. MTU)
> > >>>> 2. The interrupt thread flags the device
> > >>>> 3. Within 2 seconds Interrupt thread initializes the actual device
> > >>>>    removal, then every 2 seconds it tries to re-sync (plug in) the
> > >>>>    device. The trials fail as long as VF parameter mismatches the PF
> > >>>>    parameter.
> > >>>> 4. A control thread initiates a control operation on failsafe which
> > >>>>    initiates this operation on the device.
> > >>>> 5. A race condition occurs between the control thread and interrupt
> > >>>>    thread when accessing the device data structures.
> > >>>>
> > >>>> This commit prevents the race condition in step 5. Before this commit
> > >>>> if a device was removed and then a control thread operation was
> > >>>> initiated on failsafe - in some cases failsafe called the sub device
> > >>>> operation instead of avoiding it. Such cases could lead to operations
> > >>>> failures.
> [...]
> >
> > Reminder of this patch remaining from previous release.
>
> Gaetan, what is the decision for this possible race condition?

This patchset had several issues that I outlined.

> Can we try to fix it in 18.02?

These patches could go in with a rework.
If you feel like it I can review those fixes in the coming weeks
if new versions are submitted.

-- 
Gaëtan Rivet
6WIND