From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f171.google.com (mail-pd0-f171.google.com [209.85.192.171]) by dpdk.org (Postfix) with ESMTP id E1BCDB46D for ; Wed, 18 Feb 2015 13:20:12 +0100 (CET) Received: by pdev10 with SMTP id v10so761983pde.10 for ; Wed, 18 Feb 2015 04:20:12 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type :content-transfer-encoding; bh=fawoEe1Sd/WY1hqygqppADAe4n14jluLQDBEQCeQ0cc=; b=Lr/aQZWIKuRfxi0yuBn9QC/lCmnclb75xOLNSCeRHAArR/Cyr9rdwLmKf6mRr3uvD/ 2BhXdJh62WoI+J21Ql5hLC0O5IDE8k4yMzAOFzn+osi6um296SvbfH++DsxHzWkVVh6/ 98HnHKL6J1vAFe1YzzqkHwg1ctBqsQADlnxynjuRvl4Mo3+zBR/XhWktUc+7916V6oXT pLe2JPgEeb9F0Eih/hZ5dl3lbBznj9SNEksMFHJWO2L9jjt28Ra8U8EH5mmmz2VGcF8I 3QoLcPEr0jxgoezqcCFvHnIvUWABGs5ugFggDSF/27EQ4qOgLNWTHMwmEkdnH4511+uG x5HA== X-Gm-Message-State: ALoCoQkeIoyGxHRs3NY9c5lucCLwANr8R1YXk6xRi+0qGApJtfIZiZRfZE+Rkn8NfiuNoHAJus1f X-Received: by 10.68.252.198 with SMTP id zu6mr57047529pbc.102.1424262012074; Wed, 18 Feb 2015 04:20:12 -0800 (PST) Received: from [10.16.129.101] (napt.igel.co.jp. [219.106.231.132]) by mx.google.com with ESMTPSA id k3sm20560291pde.79.2015.02.18.04.20.10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 18 Feb 2015 04:20:11 -0800 (PST) Message-ID: <54E48378.8050707@igel.co.jp> Date: Wed, 18 Feb 2015 21:20:08 +0900 From: Tetsuya Mukawa User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: "Iremonger, Bernard" , Thomas Monjalon References: <1423470639-15744-2-git-send-email-mukawa@igel.co.jp> <54E3F10A.4090303@igel.co.jp> <8CEF83825BEC744B83065625E567D7C2049EAA9F@IRSMSX108.ger.corp.intel.com> <2590241.Y3C96lux3o@xps13> <8CEF83825BEC744B83065625E567D7C2049EAB13@IRSMSX108.ger.corp.intel.com> In-Reply-To: <8CEF83825BEC744B83065625E567D7C2049EAB13@IRSMSX108.ger.corp.intel.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci address comparison APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 Feb 2015 12:20:13 -0000 On 2015/02/18 20:39, Iremonger, Bernard wrote: > >> -----Original Message----- >> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] >> Sent: Wednesday, February 18, 2015 10:33 AM >> To: Iremonger, Bernard >> Cc: Tetsuya Mukawa; dev@dpdk.org; ivan.boule@6wind.com >> Subject: Re: [dpdk-dev] [PATCH v8 04/14] eal/pci: Consolidate pci addr= ess comparison APIs >> >> Hi Bernard, >> >> 2015-02-18 10:26, Iremonger, Bernard: >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tetsuya Mukawa >>>> On 2015/02/18 10:02, Thomas Monjalon wrote: >>>>> 2015-02-17 15:14, Tetsuya Mukawa: >>>>>> On 2015/02/17 9:44, Thomas Monjalon wrote: >>>>>>> 2015-02-16 13:14, Tetsuya Mukawa: >>>>>>>> @@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_co= nf *conf) >>>>>>>> } >>>>>>>> else { >>>>>>>> struct rte_pci_device *dev2 =3D NULL; >>>>>>>> + int ret; >>>>>>>> >>>>>>>> TAILQ_FOREACH(dev2, &pci_device_list, next) { >>>>>>>> - if (pci_addr_comparison(&dev->addr, &dev2->addr)) >>>>>>>> + ret =3D eal_compare_pci_addr(&dev->addr, &dev2->addr); >>>>>>>> + if (ret > 0) >>>>>>>> continue; >>>>>>>> - else { >>>>>>>> + else if (ret < 0) { >>>>>>>> TAILQ_INSERT_BEFORE(dev2, dev, next); >>>>>>>> return 0; >>>>>>>> + } else { /* already registered */ >>>>>>>> + /* update pt_driver */ >>>>>>>> + dev2->pt_driver =3D dev->pt_driver; >>>>>>>> + dev2->max_vfs =3D dev->max_vfs; >>>>>>>> + memmove(dev2->mem_resource, >>>>>>>> + dev->mem_resource, >>>>>>>> + sizeof(dev->mem_resource)); >>>>>>>> + free(dev); >>>>>>>> + return 0; >>>>>>> Could you comment this "else part" please? >>>>>> PCI device list is allocated when rte_eal_init() is called. At >>>>>> the time, to fill pci device information, sysfs value is used. >>>>>> If sysfs values written by kernel device driver will not be >>>>>> changed by igb_uio, vfio or pci_uio_genereic, above code isn't nee= ded. >>>>>> But actually above values are changed or added by them. >>>>>> >>>>>> Here is a step to cause issue. >>>>>> 1. Boot linux. >>>>>> 2. Start DPDK application without any physical NIC ports. >>>>>> - Here, some sysfs values are read, and store to pci device list.= >>>>>> 3. igb_uio starts managing a port. >>>>>> - Here, some sysfs values are changed. >>>>>> 4. Add a NIC port to DPDK application using hotplug functions. >>>>>> - Here, we need to replace some values. >>>>> I think that you are showing that something is wrongly designed in >>>>> these EAL structures. I suggest to try cleaning this mess instead o= f workarounding. >>> Hi Tetsuya, Thomas, >>> I think that redesigning the EAL structures is beyond the scope of th= is patchset and should be >> undertaken as a separate task. >> >> I strongly disagrees this opinion. We should never workaround design p= roblems and add more >> complex/weird code. >> I think that this kind of consideration is the heart of some design pr= oblems we have to face today. >> Please let's stop adding some code which just works without thinking t= he whole design. >> >>> I suspect there may be a problem in the original code when a device = which was using a kernel driver >> is bound to igb_uio. The igb_uio driver adds /sys/bus/pci/devices/00= 00\:05\:00.0/max_vfs. > Hi Tomas, Tetsuya, > > In general, I agree that we should not workaround design problems. > In this case I don't think there is a problem with the rte_pci_device a= nd pci_device_list structures. I agree with it. > The "already registered" device has been replaced. It would probably be= cleaner to remove the "already registered" device from the list and then= add the new device to the list rather than update the "already registere= d" device. > I guess "replacing" will not work, because rte_pci_device structure is also registered in rte_eth_dev structure. If we remove and free the pci device, I guess something goes wrong in ethdev library. Just removing is one more option, but it means there is a working pci device that is not registered in the pci_device_list. I guess it's weird.= I still think updating may be correct behavior. The pci_device_list is used like below when rte_eal_init() is called. 1. When rte_eal_pci_init() is called, all pci devices are registered in the pci_device_list. 2. When rte_eal_dev_init() is called, dev_driver_list is traversed, and if a PCI device for a driver is found in the pci_device_list, init() of the driver is called. I guess it's not so strange design. But this design assumes pci_device_list is latest while matching a driver registered in dev_driver_list. Now, we have hotplug patch. I guess we should do same thing. Before matching, we should update the pci_device_list. Thanks, Tetsuya > Regards, > > Bernard. >