From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gaetan.rivet@6wind.com>
Received: from mail-wm0-f45.google.com (mail-wm0-f45.google.com [74.125.82.45])
 by dpdk.org (Postfix) with ESMTP id 6C6F71D90
 for <dev@dpdk.org>; Thu, 14 Dec 2017 11:49:10 +0100 (CET)
Received: by mail-wm0-f45.google.com with SMTP id t8so10400317wmc.3
 for <dev@dpdk.org>; Thu, 14 Dec 2017 02:49:10 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=6wind-com.20150623.gappssmtp.com; s=20150623;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-disposition:content-transfer-encoding:in-reply-to
 :user-agent; bh=vRCUZnL7uAcs1txH24upe1YZj5js6/3NL1FXw6X6mEs=;
 b=ZyisbSCazRnvkJQwjZy2Lh8PKzlorJ/XFS1LalZA4RAw4Jwu4leoc20x7LFdGVZAhQ
 infIeuKfGuekj4xUDpwVBLN/AVRsydPxMdbR0sGPKhQG1YE5EmsEvyuALCfYl2Zp8ZGp
 NpwDcoP+jSBgScivcZdayr7FvinDuDPxoh7EtbqS7/dDHvBzQl9OIWR7uFATxH3BrhZP
 4FzoVBcqsKLRRCpwFyzGS++Vk5UWQgHADqB0WFYB9ooVbmiKzzEnSFq7K7ZLTblzTT02
 OOVh+rAmK/YMA5Xa6e9fOn8wQxk0h9SS39wKeC2YHpZkC5Ew8RoeTDQkbGdwwON1IBA0
 edRQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:references
 :mime-version:content-disposition:content-transfer-encoding
 :in-reply-to:user-agent;
 bh=vRCUZnL7uAcs1txH24upe1YZj5js6/3NL1FXw6X6mEs=;
 b=BOQppCkpHD66yCwaOdk3Ny67qLDeBS+U4fBzdQDyVLIEsuO4QprinojLFQTCCdidEb
 /SQqEiSs9H4tiV7Lz87Djrdgv5qzAlNOUtNXq6Hn4CfWB99c1QG9yiWllmWLsSThtrHg
 0DN5qmMhSS5AXgfy5PbXoYBoSXw+w19ZA8P99BnRehJVH7rxIXvhKNs5AUJbLTOd0F+7
 dUibCadbU0EBIhzQFWm3g2ok/XspRS5E1G7MWR+7n12dlijQJQWQtJiVe98HGQokZy0L
 tw0pgcEQAnH/Jewm3isfa+BU7rqv1GlPojJBlzDOjyS2HgBvRq+EcJjnMhcrc823cwx4
 eVrA==
X-Gm-Message-State: AKGB3mJDYKauSTeVAv7WMGPJZVf/AZZB3wu9vK4vaS4dQk4K6t5hKaRD
 4qDIt2FF5qC7NuHWFvcWFZguxw==
X-Google-Smtp-Source: ACJfBotWmzcWAiu8eZl40K7vcQ0Zs9lRW2ulcmf1TGUjgDIF2U3xClYWEm3pmEdly6OhDKpo30dfJg==
X-Received: by 10.80.140.33 with SMTP id p30mr11688983edp.19.1513248549769;
 Thu, 14 Dec 2017 02:49:09 -0800 (PST)
Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com.
 [62.23.145.78])
 by smtp.gmail.com with ESMTPSA id a38sm3030633edf.3.2017.12.14.02.49.08
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Thu, 14 Dec 2017 02:49:08 -0800 (PST)
Date: Thu, 14 Dec 2017 11:48:56 +0100
From: =?iso-8859-1?Q?Ga=EBtan?= Rivet <gaetan.rivet@6wind.com>
To: Matan Azrad <matan@mellanox.com>
Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>,
 Thomas Monjalon <thomas@monjalon.net>,
 "dev@dpdk.org" <dev@dpdk.org>, "stable@dpdk.org" <stable@dpdk.org>
Message-ID: <20171214104856.d5qgnawuzb54l36z@bidouze.vm.6wind.com>
References: <1509637324-13525-1-git-send-email-matan@mellanox.com>
 <1513175370-16583-1-git-send-email-matan@mellanox.com>
 <1513175370-16583-5-git-send-email-matan@mellanox.com>
 <20171213151641.g42zr7zupbsdgxsv@bidouze.vm.6wind.com>
 <HE1PR0502MB3659B4A6797E3A7A2FC78834D2350@HE1PR0502MB3659.eurprd05.prod.outlook.com>
 <20171213160916.e3rmxmhfhqz72wco@bidouze.vm.6wind.com>
 <20171213215545.kywwximn2g5xm5x5@bidouze.vm.6wind.com>
 <HE1PR0502MB36596023F82792ABB3D84C7CD20A0@HE1PR0502MB3659.eurprd05.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <HE1PR0502MB36596023F82792ABB3D84C7CD20A0@HE1PR0502MB3659.eurprd05.prod.outlook.com>
User-Agent: NeoMutt/20170113 (1.7.2)
Subject: Re: [dpdk-dev] [PATCH v2 4/4] net/failsafe: fix removed device
	handling
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 14 Dec 2017 10:49:10 -0000

On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> Hi Gaetan
> 

<snip>

> > >
> > > If you add this check in the iterator itself, you would skip removed
> > > devices before attempting operating upon them, right?
> > >
> > > Then it should probably help with your issue, unless you tested it and
> > > verified that it didnt?
> > >
> > > Something like this:
> > >
> > > ---8<---
> > >
> > > diff --git a/drivers/net/failsafe/failsafe_private.h
> > > b/drivers/net/failsafe/failsafe_private.h
> > > index d81cc3ca6..62ddc0689 100644
> > > --- a/drivers/net/failsafe/failsafe_private.h
> > > +++ b/drivers/net/failsafe/failsafe_private.h
> > > @@ -316,8 +316,12 @@ fs_find_next(struct rte_eth_dev *dev,
> > >         subs = PRIV(dev)->subs;
> > >         tail = PRIV(dev)->subs_tail;
> > >         while (sid < tail) {
> > > +               if (min_state > DEV_PROBED &&
> > > +                   fs_is_removed(&sub[sid]))
> > > +                       goto next;
> > >                 if (subs[sid].state >= min_state)
> > >                         break;
> > > +next:
> > >                 sid++;
> > >         }
> > >         *sid_out = sid;
> > >
> > > --->8---
> > >
> > > Only issue being that it is completely racy, but as this MT-unsafe
> > > property is inescapable we might as well ignore it and go for KISS.
> > >
> > > If that's enough, I would prefer instead of having this additional
> > > check added to all rte_eth operations.
> > >
> > 
> > Ok, actually you were right here to do it this way. The "is_removed"
> > check needs to happen after the operation attempt to effectively mitigate
> > the possible race. Checking before attempting the call will be much less
> > effective.
> > 
> > That being said, would it be cleaner to have eth_dev ops return -ENODEV
> > directly, and check against it within fail-safe?
> > 
> 
> I think that according to "is_removed" semantic we must return a Boolean value (Each value different from '0' means that the device is removed) like other functions in c library (for example isspace()).
> 

Sure, I wasn't discussing the interface proposed by rte_eth_dev_is_removed().

What I meant was to ask whether checking rte_eth_dev_is_removed() would
be more interesting in the ethdev layer, making the eth_dev_ops return
-ENODEV regardless of the previous error if this check is supported by
the driver and signal that the port is removed.

I think this information could be interesting to other systems, not just
fail-safe.

-- 
Gaƫtan Rivet
6WIND