From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) by dpdk.org (Postfix) with ESMTP id 112165A54 for ; Wed, 11 Feb 2015 09:14:30 +0100 (CET) Received: by mail-pa0-f52.google.com with SMTP id ey11so2436969pad.11 for ; Wed, 11 Feb 2015 00:14:29 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=lX4Y/JNNxC82y5hq28AZeaTJgUZyvfkEu1o1c5XPE70=; b=DaQjh/5kNxBm3dztWB0odSFdzdjuZPOlF2Nalb0vTpDkYEYaEyzie8dSJ7CLAGnS7O H2ozkft8IOjh8DqUuNpt4NtFIcPvL1x5jLkSixRFOnCKJTTPHuGRIt/1PdDEhybT7UTN ESS5qCSaQlvbZvZLv8gblMG9XmGfZ8UQ0JTbzKtHc5qMqQti5CnJagZydcC5ooOE2K26 MT5CaXxFyvD3DRAdgwwZGkCqM8dxhJXQqM4cAiorbgN4LvOHDVuJ1Egf4ym3otA1I9fa P8kUx4/Bh04g6iadMvy7nVnz5dJ/a1ClarmxLL6c4tbtMIAUID6JsAMk9t8R5dM3oRJZ mMYw== X-Gm-Message-State: ALoCoQkwtzjbX3EviW+k/3grkSY1th+RqKWO+/4HrUl+mHkVF1eSpfgpzd8cOmC9JOqHBgbhVcnc X-Received: by 10.70.47.104 with SMTP id c8mr44305511pdn.42.1423642469342; Wed, 11 Feb 2015 00:14:29 -0800 (PST) Received: from [10.16.129.101] (napt.igel.co.jp. [219.106.231.132]) by mx.google.com with ESMTPSA id tu7sm166697pac.6.2015.02.11.00.14.27 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 11 Feb 2015 00:14:28 -0800 (PST) Message-ID: <54DB0F5F.90806@igel.co.jp> Date: Wed, 11 Feb 2015 17:14:23 +0900 From: Tetsuya Mukawa User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: "Qiu, Michael" , "Iremonger, Bernard" , "dev@dpdk.org" References: <1422763322-13742-4-git-send-email-mukawa@igel.co.jp> <1423470639-15744-1-git-send-email-mukawa@igel.co.jp> <1423470639-15744-5-git-send-email-mukawa@igel.co.jp> <533710CFB86FA344BFBF2D6802E60286CE71EA@SHSMSX101.ccr.corp.intel.com> <8CEF83825BEC744B83065625E567D7C2049DF5CA@IRSMSX108.ger.corp.intel.com> <533710CFB86FA344BFBF2D6802E60286CE7D25@SHSMSX101.ccr.corp.intel.com> <54DAE045.6000208@igel.co.jp> <54DAE142.6090204@igel.co.jp> <533710CFB86FA344BFBF2D6802E60286CE7E63@SHSMSX101.ccr.corp.intel.com> In-Reply-To: <533710CFB86FA344BFBF2D6802E60286CE7E63@SHSMSX101.ccr.corp.intel.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH v7 04/14] eal/pci: Consolidate pci address comparison APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Feb 2015 08:14:30 -0000 On 2015/02/11 15:29, Qiu, Michael wrote: > On 2/11/2015 12:57 PM, Tetsuya Mukawa wrote: >> On 2015/02/11 13:53, Tetsuya Mukawa wrote: >>> On 2015/02/11 12:27, Qiu, Michael wrote: >>>> On 2/10/2015 11:11 PM, Iremonger, Bernard wrote: >>>>>> -----Original Message----- >>>>>> From: Qiu, Michael >>>>>> Sent: Monday, February 9, 2015 1:10 PM >>>>>> To: Tetsuya Mukawa; dev@dpdk.org >>>>>> Cc: Iremonger, Bernard >>>>>> Subject: Re: [PATCH v7 04/14] eal/pci: Consolidate pci address com= parison APIs >>>>>> >>>>>> On 2/9/2015 4:31 PM, Tetsuya Mukawa wrote: >>>>>>> This patch replaces pci_addr_comparison() and memcmp() of pci >>>>>>> addresses by eal_compare_pci_addr(). >>>>>>> >>>>>>> v5: >>>>>>> - Fix pci_scan_one to handle pt_driver correctly. >>>>>>> v4: >>>>>>> - Fix calculation method of eal_compare_pci_addr(). >>>>>>> - Add parameter checking. >>>>>>> >>>>>>> Signed-off-by: Tetsuya Mukawa >>>>>>> --- >>>>>>> lib/librte_eal/bsdapp/eal/eal_pci.c | 25 ++++++++---------= ------ >>>>>>> lib/librte_eal/common/eal_common_pci.c | 2 +- >>>>>>> lib/librte_eal/common/include/rte_pci.h | 34 +++++++++++++++++= ++++++++++++++ >>>>>>> lib/librte_eal/linuxapp/eal/eal_pci.c | 25 ++++++++---------= ------ >>>>>>> lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 2 +- >>>>>>> 5 files changed, 54 insertions(+), 34 deletions(-) >>>>>>> >>>>>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c >>>>>>> b/lib/librte_eal/bsdapp/eal/eal_pci.c >>>>>>> index 74ecce7..c844d58 100644 >>>>>>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c >>>>>>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c >>>>>>> @@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *= dev) >>>>>>> return (0); >>>>>>> } >>>>>>> >>>>>>> -/* Compare two PCI device addresses. */ -static int >>>>>>> -pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_ad= dr >>>>>>> *addr2) -{ >>>>>>> - uint64_t dev_addr =3D (addr->domain << 24) + (addr->bus << 16) = + (addr->devid << 8) + addr- >>>>>>> function; >>>>>>> - uint64_t dev_addr2 =3D (addr2->domain << 24) + (addr2->bus << 1= 6) + (addr2->devid << 8) + >>>>>> addr2->function; >>>>>>> - >>>>>>> - if (dev_addr > dev_addr2) >>>>>>> - return 1; >>>>>>> - else >>>>>>> - return 0; >>>>>>> -} >>>>>>> - >>>>>>> - >>>>>>> /* Scan one pci sysfs entry, and fill the devices list from it. = */ >>>>>>> static int pci_scan_one(int dev_pci_fd, struct pci_conf *conf) @= @ >>>>>>> -356,13 +342,20 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *= conf) >>>>>>> } >>>>>>> else { >>>>>>> struct rte_pci_device *dev2 =3D NULL; >>>>>>> + int ret; >>>>>>> >>>>>>> TAILQ_FOREACH(dev2, &pci_device_list, next) { >>>>>>> - if (pci_addr_comparison(&dev->addr, &dev2->addr)) >>>>>>> + ret =3D eal_compare_pci_addr(&dev->addr, &dev2->addr); >>>>>>> + if (ret > 0) >>>>>>> continue; >>>>>>> - else { >>>>>>> + else if (ret < 0) { >>>>>>> TAILQ_INSERT_BEFORE(dev2, dev, next); >>>>>>> return 0; >>>>>>> + } else { /* already registered */ >>>>>>> + /* update pt_driver */ >>>>>>> + dev2->pt_driver =3D dev->pt_driver; >>>>>>> + free(dev); >>>>>>> + return 0; >>>>>>> } >>>>>>> } >>>>>>> TAILQ_INSERT_TAIL(&pci_device_list, dev, next); diff --git >>>>>>> a/lib/librte_eal/common/eal_common_pci.c >>>>>>> b/lib/librte_eal/common/eal_common_pci.c >>>>>>> index f3c7f71..a89f5c3 100644 >>>>>>> --- a/lib/librte_eal/common/eal_common_pci.c >>>>>>> +++ b/lib/librte_eal/common/eal_common_pci.c >>>>>>> @@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(s= truct rte_pci_device *dev) >>>>>>> if (devargs->type !=3D RTE_DEVTYPE_BLACKLISTED_PCI && >>>>>>> devargs->type !=3D RTE_DEVTYPE_WHITELISTED_PCI) >>>>>>> continue; >>>>>>> - if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr))= ) >>>>>>> + if (!eal_compare_pci_addr(&dev->addr, &devargs->pci.addr)) >>>>>>> return devargs; >>>>>>> } >>>>>>> return NULL; >>>>>>> diff --git a/lib/librte_eal/common/include/rte_pci.h >>>>>>> b/lib/librte_eal/common/include/rte_pci.h >>>>>>> index 7f2d699..4814cd7 100644 >>>>>>> --- a/lib/librte_eal/common/include/rte_pci.h >>>>>>> +++ b/lib/librte_eal/common/include/rte_pci.h >>>>>>> @@ -269,6 +269,40 @@ eal_parse_pci_DomBDF(const char *input, stru= ct >>>>>>> rte_pci_addr *dev_addr) } #undef GET_PCIADDR_FIELD >>>>>>> >>>>>>> +/* Compare two PCI device addresses. */ >>>>>>> +/** >>>>>>> + * Utility function to compare two PCI device addresses. >>>>>>> + * >>>>>>> + * @param addr >>>>>>> + * The PCI Bus-Device-Function address to compare >>>>>>> + * @param addr2 >>>>>>> + * The PCI Bus-Device-Function address to compare >>>>>>> + * @return >>>>>>> + * 0 on equal PCI address. >>>>>>> + * Positive on addr is greater than addr2. >>>>>>> + * Negative on addr is less than addr2, or error. >>>>>>> + */ >>>>>>> +static inline int >>>>>>> +eal_compare_pci_addr(struct rte_pci_addr *addr, struct rte_pci_a= ddr >>>>>>> +*addr2) { >>>>>>> + uint64_t dev_addr, dev_addr2; >>>>>>> + >>>>>>> + if ((addr =3D=3D NULL) || (addr2 =3D=3D NULL)) >>>>>>> + return -1; >>>>>>> + >>>>>>> + dev_addr =3D (addr->domain << 24) | (addr->bus << 16) | >>>>>>> + (addr->devid << 8) | addr->function; >>>>>>> + dev_addr2 =3D (addr2->domain << 24) | (addr2->bus << 16) | >>>>>>> + (addr2->devid << 8) | addr2->function; >>>>>>> + >>>>>>> + if (dev_addr > dev_addr2) >>>>>>> + return 1; >>>>>>> + else if (dev_addr < dev_addr2) >>>>>>> + return -1; >>>>>>> + else >>>>>>> + return 0; >>>>>>> +} >>>>>>> + >>>>>>> /** >>>>>>> * Probe the PCI bus for registered drivers. >>>>>>> * >>>>>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c >>>>>>> b/lib/librte_eal/linuxapp/eal/eal_pci.c >>>>>>> index c0ca5a5..d847102 100644 >>>>>>> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c >>>>>>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c >>>>>>> @@ -229,20 +229,6 @@ error: >>>>>>> return -1; >>>>>>> } >>>>>>> >>>>>>> -/* Compare two PCI device addresses. */ -static int >>>>>>> -pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_ad= dr >>>>>>> *addr2) -{ >>>>>>> - uint64_t dev_addr =3D (addr->domain << 24) + (addr->bus << 16) = + (addr->devid << 8) + addr- >>>>>>> function; >>>>>>> - uint64_t dev_addr2 =3D (addr2->domain << 24) + (addr2->bus << 1= 6) + (addr2->devid << 8) + >>>>>> addr2->function; >>>>>>> - >>>>>>> - if (dev_addr > dev_addr2) >>>>>>> - return 1; >>>>>>> - else >>>>>>> - return 0; >>>>>>> -} >>>>>>> - >>>>>>> - >>>>>>> /* Scan one pci sysfs entry, and fill the devices list from it. = */ >>>>>>> static int pci_scan_one(const char *dirname, uint16_t domain, ui= nt8_t >>>>>>> bus, @@ -353,13 +339,20 @@ pci_scan_one(const char *dirname, uint= 16_t >>>>>>> domain, uint8_t bus, >>>>>>> } >>>>>>> else { >>>>>>> struct rte_pci_device *dev2 =3D NULL; >>>>>>> + int ret; >>>>>>> >>>>>>> TAILQ_FOREACH(dev2, &pci_device_list, next) { >>>>>>> - if (pci_addr_comparison(&dev->addr, &dev2->addr)) >>>>>>> + ret =3D eal_compare_pci_addr(&dev->addr, &dev2->addr); >>>>>>> + if (ret > 0) >>>>>>> continue; >>>>>>> - else { >>>>>>> + else if (ret < 0) { >>>>>>> TAILQ_INSERT_BEFORE(dev2, dev, next); >>>>>>> return 0; >>>>>>> + } else { /* already registered */ >>>>>>> + /* update pt_driver */ >>>>>>> + dev2->pt_driver =3D dev->pt_driver; >>>>> Hi Tetsuya, >>>>> >>>>> I am seeing a problem with the librte_pmd_ixgbe code where dev->max= _vfs is being lost in some scenarios. >>>>> The following line should be added here: >>>>> dev2->max_vfs =3D dev->max_vfs; >>>>> >>>>> numa_mode should probably be updated too (although it is not causin= g a problem at present). >>>>> dev2->numa_mode =3D dev->numa_mode; >>>> I'm very curious, why those field miss? I haven't see any places cle= ar >>>> this field. >>>> >>>> What is the root cause? >>> Hi Michael, >>> >>> Here is my guess. >>> The above function creates pci device list. >> I am sorry. I forgot to add below information. >> >> "max_vfs" or "numa_node" value is came from sysfs when the above >> function is processed. > Yes, but it has already been registered, why it missed? Yes, it has been registered already, but probably should be updated. I guess sysfs value will be changed when igb_uio starts managing the devi= ce. ex) 1. Boot linux 2. start a dpdk application with no port. 3. pci device list is registered. - Here, "max_vfs" is came from sysfs. Or there is no such a entry. 4. igb_uio binds the device. 5. I guess max_vfs value of sysfs is changed. Or max_vfs entry is create= d. 6. The dpdk application calls hotplug function. - Here, I guess we need to update "max_vfs" value. Above is a just my assumption. It may be good to wait for Bernard's reply. Thanks, Tetsuya > Thanks, > Michael >>> And current DPDK implementation assumes all devices needed to be mana= ged >>> are under igb_uio or vfio when above code is processed. >>> To add hotplug function, we also need to think some devices will star= t >>> to be managed under igb_uio or vfio after initializing pci device lis= t. >>> Anyway, I guess "max_vfs" value will be changed when igb_uio or vfio >>> manages the device. >>> >>> Hi Bernard, >>> >>> Could you please check "max_vfs" and "num_node" values, then check th= e >>> values again after the device is managed by igb_uio or vfio? >>> In my environment, it seems max_vfs is created by igb_uio. >>> But my NIC doesn't have VF, so behavior might be different in your >>> environment. >>> I guess "numa_node" should not be changed theoretically. >>> >>> If my guess is correct, how about replacing following values? >>> - driver >>> - max_vfs >>> - resource >>> - (numa_node) >>> Except for above value, I guess other value shouldn't be changed even= >>> after the device is managed by igb_uio or vfio. >>> >>> Thanks, >>> Tetsuya >>> >>>> Thanks, >>>> Michael >>>> >>>>> Regards, >>>>> >>>>> Bernard. >>>>> >>>>> >>>>> >>>>> >>>>>>> + free(dev); >>>>>>> + return 0; >>>>>>> } >>>>>>> } >>>>>>> TAILQ_INSERT_TAIL(&pci_device_list, dev, next); diff --git >>>>>>> a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>>>>> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>>>>> index e53f06b..1da3507 100644 >>>>>>> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>>>>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>>>>> @@ -123,7 +123,7 @@ pci_uio_map_secondary(struct rte_pci_device *= dev) >>>>>>> TAILQ_FOREACH(uio_res, pci_res_list, next) { >>>>>>> >>>>>>> /* skip this element if it doesn't match our PCI address */ >>>>>>> - if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))= >>>>>>> + if (eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr)) >>>>>>> continue; >>>>>>> >>>>>>> for (i =3D 0; i !=3D uio_res->nb_maps; i++) { >>>>>> Acked-by: Michael Qiu