From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gaetan.rivet@6wind.com>
Received: from mail-wm0-f50.google.com (mail-wm0-f50.google.com [74.125.82.50])
 by dpdk.org (Postfix) with ESMTP id 374351B2DA
 for <dev@dpdk.org>; Fri, 19 Jan 2018 00:35:49 +0100 (CET)
Received: by mail-wm0-f50.google.com with SMTP id i186so141181wmi.4
 for <dev@dpdk.org>; Thu, 18 Jan 2018 15:35:49 -0800 (PST)
Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com.
 [62.23.145.78])
 by smtp.gmail.com with ESMTPSA id 194sm56728wmu.37.2018.01.18.15.35.47
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Thu, 18 Jan 2018 15:35:47 -0800 (PST)
Date: Fri, 19 Jan 2018 00:35:34 +0100
From: =?iso-8859-1?Q?Ga=EBtan?= Rivet <gaetan.rivet@6wind.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>,
 Ophir Munk <ophirmu@mellanox.com>, dev@dpdk.org,
 Olga Shern <olgas@mellanox.com>, Matan Azrad <matan@mellanox.com>
Message-ID: <20180118233534.ayzy64wzrhskkbdv@bidouze.vm.6wind.com>
References: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com>
 <20171023083612.GK3596@bidouze.vm.6wind.com>
 <32c10eb9-fa51-a409-4720-6a92c3b97398@intel.com>
 <16685617.u0SECMPC3f@xps>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <16685617.u0SECMPC3f@xps>
User-Agent: NeoMutt/20170113 (1.7.2)
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v3] net/failsafe: fix calling
 device during RMV events
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Jan 2018 23:35:49 -0000

On Thu, Jan 18, 2018 at 11:22:51PM +0100, Thomas Monjalon wrote:
> 29/11/2017 20:17, Ferruh Yigit:
> > >>> On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote:
> > >>>> This commit prevents control path operations from failing after a sub
> > >>>> device removal.
> > >>>>
> > >>>> Following are the failure steps:
> > >>>> 1. The physical device is removed due to change in one of PF
> > >>>> parameters (e.g. MTU).
> > >>>> 2. The interrupt thread flags the device.
> > >>>> 3. Within 2 seconds Interrupt thread initializes the actual device
> > >>>> removal, then every 2 seconds it tries to re-sync (plug in) the
> > >>>> device. The trials fail as long as VF parameter mismatches the PF
> > >>>> parameter.
> > >>>> 4. A control thread initiates a control operation on failsafe which
> > >>>> initiates this operation on the device.
> > >>>> 5. A race condition occurs between the control thread and interrupt
> > >>>> thread when accessing the device data structures.
> > >>>>
> > >>>> This commit prevents the race condition in step 5. Before this commit
> > >>>> if a device was removed and then a control thread operation was
> > >>>> initiated on failsafe - in some cases failsafe called the sub device
> > >>>> operation instead of avoiding it. Such cases could lead to operations
> > >>>> failures.
> [...]
> > 
> > Reminder of this patch remaining from previous release.
> 
> Gaetan, what is the decision for this possible race condition?

This patchset had several issues that I outlined.

> Can we try to fix it in 18.02?

These patches could go in after a rework. If you like, I can review
the fixes in the coming weeks, provided new versions are submitted.

-- 
Gaëtan Rivet
6WIND