From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <bruce.richardson@intel.com>
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id AB8CB2B9D
 for <dev@dpdk.org>; Fri,  5 Aug 2016 14:29:46 +0200 (CEST)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by fmsmga101.fm.intel.com with ESMTP; 05 Aug 2016 05:29:45 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,474,1464678000"; d="scan'208";a="1020221916"
Received: from bricha3-mobl3.ger.corp.intel.com ([10.237.220.44])
 by fmsmga001.fm.intel.com with SMTP; 05 Aug 2016 05:29:43 -0700
Received: by  (sSMTP sendmail emulation); Fri, 05 Aug 2016 13:29:42 +0025
Date: Fri, 5 Aug 2016 13:29:42 +0100
From: Bruce Richardson <bruce.richardson@intel.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Igor Ryzhov <iryzhov@nfware.com>, dev@dpdk.org,
 David Marchand <david.marchand@6wind.com>,
 "Liu, Yuanhan" <yuanhan.liu@intel.com>
Message-ID: <20160805122942.GA33788@bricha3-MOBL3>
References: <E1025756-36BA-4BC3-AA5D-279AE1025530@nfware.com>
 <57A32814.1000404@intel.com>
 <0AB336DF-9C66-4826-BA17-EDE1F8D6A2EA@nfware.com>
 <57A34175.1040204@intel.com>
 <9C1AE4BF-9530-4596-BCF6-09E9AF7E55F3@nfware.com>
 <57A3638D.2060300@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <57A3638D.2060300@intel.com>
Organization: Intel Research and =?iso-8859-1?Q?De=ACvel?=
 =?iso-8859-1?Q?opment?= Ireland Ltd.
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: Re: [dpdk-dev] rte_eth_dev_attach returns 0,
 although device is not attached
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 05 Aug 2016 12:29:47 -0000

On Thu, Aug 04, 2016 at 04:47:25PM +0100, Ferruh Yigit wrote:
> On 8/4/2016 3:54 PM, Igor Ryzhov wrote:
> > 
> >> On 4 Aug 2016, at 16:21, Ferruh Yigit <ferruh.yigit@intel.com
> >> <mailto:ferruh.yigit@intel.com>> wrote:
> >>
> >> On 8/4/2016 12:51 PM, Igor Ryzhov wrote:
> >>> Hello Ferruh,
> >>>
> >>>> On 4 Aug 2016, at 14:33, Ferruh Yigit <ferruh.yigit@intel.com
> >>>> <mailto:ferruh.yigit@intel.com>> wrote:
> >>>>
> >>>> Hi Igor,
> >>>>
> >>>> On 8/3/2016 5:58 PM, Igor Ryzhov wrote:
> >>>>> Hello.
> >>>>>
> >>>>> The function rte_eth_dev_attach can return a false positive result.
> >>>>> This happens because rte_eal_pci_probe_one returns zero if no driver
> >>>>> is found for the device:
> >>>>> ret = pci_probe_all_drivers(dev);
> >>>>> if (ret < 0)
> >>>>>         goto err_return;
> >>>>> return 0;
> >>>>> (pci_probe_all_drivers returns 1 in that case)
> >>>>>
> >>>>> For example, it can be easily reproduced by trying to attach a virtio
> >>>>> device managed by a kernel driver.
> >>>>
> >>>> You are right, and I was able to reproduce this issue with virtio as
> >>>> you suggested.
> >>>>
> >>>> But I wonder why rte_eth_dev_get_port_by_addr() is not catching this.
> >>>> Perhaps a dev->attached check needs to be added into this function.
> >>
> >> With a second check, rte_eth_dev_get_port_by_addr() catches it if the
> >> driver is missing.
> >>
> >> But for the virtio case, the problem is not a missing driver.
> >> The problem is that eth_virtio_dev_init() returns a positive value on
> >> failure.
> >>
> >> Call stack is:
> >> rte_eal_pci_probe_one
> >>    pci_probe_all_drivers
> >>        rte_eal_pci_probe_one_driver
> >>            rte_eth_dev_init
> >>               eth_virtio_dev_init
> >>
> >> So rte_eal_pci_probe_one_driver() also returns a positive value, which
> >> means no driver was found, and rte_eth_dev_get_port_by_addr() returns a
> >> valid port_id, since rte_eth_dev_init() allocated an eth_dev.
> >>
> >> In short, this can be fixed in the virtio PMD instead of in EAL PCI.
> >>
> >>>>
> >>>>>
> >>>>> I think it should be:
> >>>>> ret = pci_probe_all_drivers(dev);
> >>>>> if (ret)
> >>>>>         goto err_return;
> >>>>> return 0;
> >>>>
> >>>> Your proposal looks good to me. Will you send a patch?
> >>>
> >>
> >> The original code silently ignores the case where the driver is missing
> >> for that dev. Although that is still questionable, I think we can keep
> >> it as it is.
> >>
> >>> Patch sent.
> >>
> >> Sorry for this, but can you please test with following modification in
> >> virtio:
> >> index 07d6449..c74eeee 100644
> >> --- a/drivers/net/virtio/virtio_ethdev.c
> >> +++ b/drivers/net/virtio/virtio_ethdev.c
> >> @@ -1156,7 +1156,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
> >>        if (pci_dev) {
> >>                ret = vtpci_init(pci_dev, hw, &dev_flags);
> >>                if (ret)
> >> -                       return ret;
> >> +                       return -1;
> >>        }
> >>
> >>        /* Reset the device although not necessary at startup */
> > 
> > I think it's not a good change, because it will break the idea of this
> > patch - http://dpdk.org/browse/dpdk/commit/?id=ac5e1d83
> 
> Yes, it breaks that one; I wasn't aware of that patch. But its commit
> log says: "return 1 to tell the upper layer we don't take over this
> device." I am not sure the upper layer was designed for this.
> 
> > 
> > Also, with your patch the application will not start, because
> > rte_eal_pci_probe will fail:
> > 
> > if (ret < 0)
> > rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
> >  " cannot be used\n", dev->addr.domain, dev->addr.bus,
> >  dev->addr.devid, dev->addr.function);
> 
> Yes, it fails, and this looks like the intended behavior. This failure
> is correct according to the code.
> 
> > 
> > And now I think that maybe we should change the way rte_eal_pci_probe works.
> > I think we shouldn't stop the application if just one of PCI devices is
> > not probed successfully.
> 
> Agreed. Overall rte_exit() usage has already been discussed a few times.
> 
> I think the best option is:
> - don't exit app if rte_eal_pci_probe() fails, only print an error.
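
To make the return-value convention in the quoted discussion concrete, here
is a minimal sketch. The enum and both helper names are hypothetical, not
DPDK API; they only model the existing "ret < 0" check against the proposed
"ret" check, assuming a probe returns <0 on error, 0 on success, and >0 when
no driver takes the device.

```c
/* Hypothetical model of the probe result convention from the thread:
 * negative = error, zero = driver bound, positive = no driver took it. */
enum probe_result {
	PROBE_ERROR = -1,
	PROBE_OK = 0,
	PROBE_NO_DRIVER = 1
};

/* Models the current check: only negative values count as failure,
 * so a positive "no driver" result is reported to the caller as 0. */
static int
probe_one_lenient(int probe_ret)
{
	if (probe_ret < 0)
		return -1;
	return 0;
}

/* Models the proposed check: any non-zero value counts as failure. */
static int
probe_one_strict(int probe_ret)
{
	if (probe_ret)
		return -1;
	return 0;
}
```

With these helpers, probe_one_lenient(PROBE_NO_DRIVER) yields 0 (the false
positive reported in the thread), while probe_one_strict(PROBE_NO_DRIVER)
yields -1.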

Whether or not the PCI probe exits the app, I think it should signal a
serious error if the probe fails and the device was explicitly whitelisted
on the command line. Given that the user explicitly requested the device, a
failure to use it is probably a problem that the user needs to fix before
running the app.
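
As a rough sketch of that policy (all names here are hypothetical, not
existing EAL code): the caller classifies each probe result, and any failure
on a whitelisted device is treated as serious. I am assuming here that a
positive "no driver" result also counts as a failure for a whitelisted
device, and that it is silently skipped otherwise.

```c
#include <stdbool.h>

/* Hypothetical outcome the caller acts on after probing one device. */
enum probe_action {
	PROBE_CONTINUE,	/* nothing to report, carry on */
	PROBE_WARN,	/* log an error, keep probing other devices */
	PROBE_SERIOUS	/* explicitly requested device is unusable */
};

/*
 * probe_ret follows the convention discussed in this thread:
 * <0 error, 0 success, >0 no driver took the device.
 * whitelisted is true if the user named the device on the command line.
 */
static enum probe_action
classify_probe(int probe_ret, bool whitelisted)
{
	if (probe_ret == 0)
		return PROBE_CONTINUE;
	if (whitelisted)
		return PROBE_SERIOUS;	/* assumption: any non-zero result */
	if (probe_ret < 0)
		return PROBE_WARN;
	return PROBE_CONTINUE;		/* not requested, no driver: skip */
}
```

Whether PROBE_SERIOUS ends in rte_exit() or in an error returned to the
application is a separate decision; the point is only that it is not
reported as success.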

/Bruce