From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f180.google.com (mail-pd0-f180.google.com [209.85.192.180]) by dpdk.org (Postfix) with ESMTP id 723E45A17 for ; Wed, 11 Feb 2015 05:53:31 +0100 (CET) Received: by pdjy10 with SMTP id y10so1755993pdj.13 for ; Tue, 10 Feb 2015 20:53:30 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=kNq1tRfZJOzIej4chJk1IJWGSsMYlHDJi6Ilf/sZc8g=; b=DkqhGjOtWTAnPrzyPrnbdQUDEyrADM0GsGPSFV7udOQdiyNuax2FkI94Qv+NMgzmWp mhYfFv0eg76icM8/UUHZj62avqlfeOtgfWYIaoNe31XxN2Ci9bfsjvklRMd/Kc40KsfC aKnVHl3yMRrrjrR+42Ml1fEZGMTn3eNmeTl8npS+/V4A/tpSqW0SWGdo8FED+SkcOda5 31wNjsGoa5HJgv1c0d7+HyI2+qijymAXIuibbmG/mdNGsJ1F3hSsvvLrooKIfRa22+ZP J0klQB4zTOicNmdJJDlLAQYwyT6Tsig2iSY/s25P+W9scs9x3b/BpYt1cETdvCYFtnUR ELYA== X-Gm-Message-State: ALoCoQmUJBSsFtVD8y/bt9xJBudBBkODOLwsQv0oZ8ILvwUeqzS/aOB1x6MmQ/u7N5v2LqD+zHy9 X-Received: by 10.66.141.109 with SMTP id rn13mr43187356pab.113.1423630410666; Tue, 10 Feb 2015 20:53:30 -0800 (PST) Received: from [10.16.129.101] (napt.igel.co.jp. [219.106.231.132]) by mx.google.com with ESMTPSA id nk6sm20912188pdb.89.2015.02.10.20.53.29 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Feb 2015 20:53:30 -0800 (PST) Message-ID: <54DAE045.6000208@igel.co.jp> Date: Wed, 11 Feb 2015 13:53:25 +0900 From: Tetsuya Mukawa User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: "Qiu, Michael" , "Iremonger, Bernard" , "dev@dpdk.org" References: <1422763322-13742-4-git-send-email-mukawa@igel.co.jp> <1423470639-15744-1-git-send-email-mukawa@igel.co.jp> <1423470639-15744-5-git-send-email-mukawa@igel.co.jp> <533710CFB86FA344BFBF2D6802E60286CE71EA@SHSMSX101.ccr.corp.intel.com> <8CEF83825BEC744B83065625E567D7C2049DF5CA@IRSMSX108.ger.corp.intel.com> <533710CFB86FA344BFBF2D6802E60286CE7D25@SHSMSX101.ccr.corp.intel.com> In-Reply-To: <533710CFB86FA344BFBF2D6802E60286CE7D25@SHSMSX101.ccr.corp.intel.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH v7 04/14] eal/pci: Consolidate pci address comparison APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Feb 2015 04:53:31 -0000 On 2015/02/11 12:27, Qiu, Michael wrote: > On 2/10/2015 11:11 PM, Iremonger, Bernard wrote: >>> -----Original Message----- >>> From: Qiu, Michael >>> Sent: Monday, February 9, 2015 1:10 PM >>> To: Tetsuya Mukawa; dev@dpdk.org >>> Cc: Iremonger, Bernard >>> Subject: Re: [PATCH v7 04/14] eal/pci: Consolidate pci address compar= ison APIs >>> >>> On 2/9/2015 4:31 PM, Tetsuya Mukawa wrote: >>>> This patch replaces pci_addr_comparison() and memcmp() of pci >>>> addresses by eal_compare_pci_addr(). >>>> >>>> v5: >>>> - Fix pci_scan_one to handle pt_driver correctly. >>>> v4: >>>> - Fix calculation method of eal_compare_pci_addr(). >>>> - Add parameter checking. >>>> >>>> Signed-off-by: Tetsuya Mukawa >>>> --- >>>> lib/librte_eal/bsdapp/eal/eal_pci.c | 25 ++++++++------------= --- >>>> lib/librte_eal/common/eal_common_pci.c | 2 +- >>>> lib/librte_eal/common/include/rte_pci.h | 34 ++++++++++++++++++++= +++++++++++ >>>> lib/librte_eal/linuxapp/eal/eal_pci.c | 25 ++++++++------------= --- >>>> lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 2 +- >>>> 5 files changed, 54 insertions(+), 34 deletions(-) >>>> >>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c >>>> b/lib/librte_eal/bsdapp/eal/eal_pci.c >>>> index 74ecce7..c844d58 100644 >>>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c >>>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c >>>> @@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *dev= ) >>>> return (0); >>>> } >>>> >>>> -/* Compare two PCI device addresses. */ -static int >>>> -pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr >>>> *addr2) -{ >>>> - uint64_t dev_addr =3D (addr->domain << 24) + (addr->bus << 16) + (= addr->devid << 8) + addr- >>>> function; >>>> - uint64_t dev_addr2 =3D (addr2->domain << 24) + (addr2->bus << 16) = + (addr2->devid << 8) + >>> addr2->function; >>>> - >>>> - if (dev_addr > dev_addr2) >>>> - return 1; >>>> - else >>>> - return 0; >>>> -} >>>> - >>>> - >>>> /* Scan one pci sysfs entry, and fill the devices list from it. */ >>>> static int pci_scan_one(int dev_pci_fd, struct pci_conf *conf) @@ >>>> -356,13 +342,20 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *con= f) >>>> } >>>> else { >>>> struct rte_pci_device *dev2 =3D NULL; >>>> + int ret; >>>> >>>> TAILQ_FOREACH(dev2, &pci_device_list, next) { >>>> - if (pci_addr_comparison(&dev->addr, &dev2->addr)) >>>> + ret =3D eal_compare_pci_addr(&dev->addr, &dev2->addr); >>>> + if (ret > 0) >>>> continue; >>>> - else { >>>> + else if (ret < 0) { >>>> TAILQ_INSERT_BEFORE(dev2, dev, next); >>>> return 0; >>>> + } else { /* already registered */ >>>> + /* update pt_driver */ >>>> + dev2->pt_driver =3D dev->pt_driver; >>>> + free(dev); >>>> + return 0; >>>> } >>>> } >>>> TAILQ_INSERT_TAIL(&pci_device_list, dev, next); diff --git >>>> a/lib/librte_eal/common/eal_common_pci.c >>>> b/lib/librte_eal/common/eal_common_pci.c >>>> index f3c7f71..a89f5c3 100644 >>>> --- a/lib/librte_eal/common/eal_common_pci.c >>>> +++ b/lib/librte_eal/common/eal_common_pci.c >>>> @@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(stru= ct rte_pci_device *dev) >>>> if (devargs->type !=3D RTE_DEVTYPE_BLACKLISTED_PCI && >>>> devargs->type !=3D RTE_DEVTYPE_WHITELISTED_PCI) >>>> continue; >>>> - if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr))) >>>> + if (!eal_compare_pci_addr(&dev->addr, &devargs->pci.addr)) >>>> return devargs; >>>> } >>>> return NULL; >>>> diff --git a/lib/librte_eal/common/include/rte_pci.h >>>> b/lib/librte_eal/common/include/rte_pci.h >>>> index 7f2d699..4814cd7 100644 >>>> --- a/lib/librte_eal/common/include/rte_pci.h >>>> +++ b/lib/librte_eal/common/include/rte_pci.h >>>> @@ -269,6 +269,40 @@ eal_parse_pci_DomBDF(const char *input, struct >>>> rte_pci_addr *dev_addr) } #undef GET_PCIADDR_FIELD >>>> >>>> +/* Compare two PCI device addresses. */ >>>> +/** >>>> + * Utility function to compare two PCI device addresses. >>>> + * >>>> + * @param addr >>>> + * The PCI Bus-Device-Function address to compare >>>> + * @param addr2 >>>> + * The PCI Bus-Device-Function address to compare >>>> + * @return >>>> + * 0 on equal PCI address. >>>> + * Positive on addr is greater than addr2. >>>> + * Negative on addr is less than addr2, or error. >>>> + */ >>>> +static inline int >>>> +eal_compare_pci_addr(struct rte_pci_addr *addr, struct rte_pci_addr= >>>> +*addr2) { >>>> + uint64_t dev_addr, dev_addr2; >>>> + >>>> + if ((addr =3D=3D NULL) || (addr2 =3D=3D NULL)) >>>> + return -1; >>>> + >>>> + dev_addr =3D (addr->domain << 24) | (addr->bus << 16) | >>>> + (addr->devid << 8) | addr->function; >>>> + dev_addr2 =3D (addr2->domain << 24) | (addr2->bus << 16) | >>>> + (addr2->devid << 8) | addr2->function; >>>> + >>>> + if (dev_addr > dev_addr2) >>>> + return 1; >>>> + else if (dev_addr < dev_addr2) >>>> + return -1; >>>> + else >>>> + return 0; >>>> +} >>>> + >>>> /** >>>> * Probe the PCI bus for registered drivers. >>>> * >>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c >>>> b/lib/librte_eal/linuxapp/eal/eal_pci.c >>>> index c0ca5a5..d847102 100644 >>>> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c >>>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c >>>> @@ -229,20 +229,6 @@ error: >>>> return -1; >>>> } >>>> >>>> -/* Compare two PCI device addresses. */ -static int >>>> -pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr >>>> *addr2) -{ >>>> - uint64_t dev_addr =3D (addr->domain << 24) + (addr->bus << 16) + (= addr->devid << 8) + addr- >>>> function; >>>> - uint64_t dev_addr2 =3D (addr2->domain << 24) + (addr2->bus << 16) = + (addr2->devid << 8) + >>> addr2->function; >>>> - >>>> - if (dev_addr > dev_addr2) >>>> - return 1; >>>> - else >>>> - return 0; >>>> -} >>>> - >>>> - >>>> /* Scan one pci sysfs entry, and fill the devices list from it. */ >>>> static int pci_scan_one(const char *dirname, uint16_t domain, uint8= _t >>>> bus, @@ -353,13 +339,20 @@ pci_scan_one(const char *dirname, uint16_= t >>>> domain, uint8_t bus, >>>> } >>>> else { >>>> struct rte_pci_device *dev2 =3D NULL; >>>> + int ret; >>>> >>>> TAILQ_FOREACH(dev2, &pci_device_list, next) { >>>> - if (pci_addr_comparison(&dev->addr, &dev2->addr)) >>>> + ret =3D eal_compare_pci_addr(&dev->addr, &dev2->addr); >>>> + if (ret > 0) >>>> continue; >>>> - else { >>>> + else if (ret < 0) { >>>> TAILQ_INSERT_BEFORE(dev2, dev, next); >>>> return 0; >>>> + } else { /* already registered */ >>>> + /* update pt_driver */ >>>> + dev2->pt_driver =3D dev->pt_driver; >> Hi Tetsuya, >> >> I am seeing a problem with the librte_pmd_ixgbe code where dev->max_vf= s is being lost in some scenarios. >> The following line should be added here: >> dev2->max_vfs =3D dev->max_vfs; >> >> numa_mode should probably be updated too (although it is not causing a= problem at present). >> dev2->numa_mode =3D dev->numa_mode; > I'm very curious, why those field miss? I haven't see any places clear > this field. > > What is the root cause? Hi Michael, Here is my guess. The above function creates pci device list. And current DPDK implementation assumes all devices needed to be managed are under igb_uio or vfio when above code is processed. To add hotplug function, we also need to think some devices will start to be managed under igb_uio or vfio after initializing pci device list. Anyway, I guess "max_vfs" value will be changed when igb_uio or vfio manages the device. Hi Bernard, Could you please check "max_vfs" and "num_node" values, then check the values again after the device is managed by igb_uio or vfio? In my environment, it seems max_vfs is created by igb_uio. But my NIC doesn't have VF, so behavior might be different in your environment. I guess "numa_node" should not be changed theoretically. If my guess is correct, how about replacing following values? - driver - max_vfs - resource - (numa_node) Except for above value, I guess other value shouldn't be changed even after the device is managed by igb_uio or vfio. Thanks, Tetsuya > Thanks, > Michael > >> Regards, >> >> Bernard. >> >> >> >> >>>> + free(dev); >>>> + return 0; >>>> } >>>> } >>>> TAILQ_INSERT_TAIL(&pci_device_list, dev, next); diff --git >>>> a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>> index e53f06b..1da3507 100644 >>>> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c >>>> @@ -123,7 +123,7 @@ pci_uio_map_secondary(struct rte_pci_device *dev= ) >>>> TAILQ_FOREACH(uio_res, pci_res_list, next) { >>>> >>>> /* skip this element if it doesn't match our PCI address */ >>>> - if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr))) >>>> + if (eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr)) >>>> continue; >>>> >>>> for (i =3D 0; i !=3D uio_res->nb_maps; i++) { >>> Acked-by: Michael Qiu