From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0313AA051C; Tue, 11 Feb 2020 09:38:46 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B882A2BFA; Tue, 11 Feb 2020 09:38:45 +0100 (CET) Received: from relay2-d.mail.gandi.net (relay2-d.mail.gandi.net [217.70.183.194]) by dpdk.org (Postfix) with ESMTP id 369822BAB for ; Tue, 11 Feb 2020 09:38:44 +0100 (CET) X-Originating-IP: 37.58.153.206 Received: from [10.0.3.185] (bny206.haproxy.com [37.58.153.206]) (Authenticated sender: grive@u256.net) by relay2-d.mail.gandi.net (Postfix) with ESMTPSA id D095B4001A; Tue, 11 Feb 2020 08:38:43 +0000 (UTC) To: Somnath Kotur Cc: dev References: <20200204091548.2861-1-somnath.kotur@broadcom.com> <83f7e3a5-c36f-a755-1c71-0c248fe48d0e@u256.net> From: Gaetan Rivet Message-ID: <926ec7c7-b23d-267a-ed43-6ef4d7bb916a@u256.net> Date: Tue, 11 Feb 2020 09:38:44 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH] eal: fix to set the rte_device ptr's device args before hotplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" [...] >>> (gdb) p dev2 >>> $5 = (struct rte_pci_device *) 0x54de5e0 >>> (gdb) p /x *dev2 >>> $6 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device = >>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver = >>> 0x3a6b7f0, >>> bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = >>> 0x0, bus = 0x6, devid = 0x2, function = 0x0} >>> >>> we hit the pci_device ( 0x54de5e0) that was created during the first >>> time probe of the parent device ? >>> Since it is already probed, it goes to line 381 where it does frees >>> the just allocated 'pci_device' inside pci_scan_one() via 'free(dev)' >>> >>> As you can see this pci_device does not have it's devargs set (rightly >>> so as initially there were no devargs for this device) >>> >>> But as shown in the stack trace in my previous reply, when >>> pci_find_device() walks the rte_pci_devices list , it finds this >>> earlier probed device (without devargs) >>> >> >> Alright, I think you are right, this is what is happening. >> The issue IMO, is that the PCI scan is thus hitting an already existing device, but we have >> missed the case where the new device has more info that the previous one (linked devargs). >> > That is correct > >>> pci_find_device (start=0x0, cmp=0x4b92d8 , >>> data=0x55a4af8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426 >>> 426 return &pdev->device; >>> (gdb) p pdev >>> $15 = (struct rte_pci_device *) 0x54de5e0 >>> >>> (gdb) p /x *pdev >>> $16 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device = >>> {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver = >>> 0x3a6b7f0, >>> bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = >>> 0x0, bus = 0x6, devid = 0x2, function = 0x0}, >>> >>> Hope this explains why the pci_device has NULL devargs at this point >>> and how my fix to set it at this point helps solve the problem? >>> >>> Please let me know your thoughts >>> >> >> I think the proper fix is instead to have a clean update during the PCI scan. >> The proper way is to keep the old device, but override its fields as new info was found. > > Agreed >> >> Something like calling pci_name_set(dev2); line 359 maybe. BSD should also be updated for consistency. >> >> My issue with your patch is that I think this issue is very specific to the PCI bus and the capabilities >> of some devices on it. It would be better to have a fully compliant scan + probe ops considering >> the supported capabilities, instead of forcing this on all buses. > Sure, so i'm guessing you meant something like this ? > > dex 740a2cd..92a41c2 100644 > --- a/drivers/bus/pci/linux/pci.c > +++ b/drivers/bus/pci/linux/pci.c > @@ -377,6 +377,8 @@ > */ > RTE_LOG(ERR, EAL, > "Unexpected device scan at %s!\n", > filename); > + else if (dev2->device.devargs > != dev->device.devargs) > + pci_name_set(dev2); > } > free(dev); > } > > Which is basically checking for devargs mismatch between the 'new' > device and a match with an > already probed device and return the old device while adjusting the > new info (devargs) > Is that OK? > >> >> I'm wondering whether this can happen with an already existing devargs? If so, is it more correct to keep >> the new one, or ignore the probe, free the new devargs and report an error? If the former, >> please clean up properly the devargs using rte_devargs_remove() (before calling pci_name_set() of course). Hello Somnath, Yes, this is basically it, however you also need to take care of a potential already existing devargs here, because the PCI bus does not guarantee that all devargs of a device will be named the same.