From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 17BDBA0541; Fri, 7 Feb 2020 06:53:18 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id E88ED1C07B; Fri, 7 Feb 2020 06:53:16 +0100 (CET) Received: from mail-vk1-f195.google.com (mail-vk1-f195.google.com [209.85.221.195]) by dpdk.org (Postfix) with ESMTP id 301141C069 for ; Fri, 7 Feb 2020 06:53:15 +0100 (CET) Received: by mail-vk1-f195.google.com with SMTP id b69so282243vke.9 for ; Thu, 06 Feb 2020 21:53:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=P3AQyI8fgf4YkFzaCMt1dTakc0v7BHsKTwA1LWtUF+4=; b=OUoAA7EFhXG6IeF/3vdIn4RoDiE1OS/pf8IE615A1AxehMNmSCZl9AskvxOiwH3Hri OWrIgx/su5623qPaQp5VGMGDTH8OFaD9rVXXSRgpG2zmnf6ncJyjmEhlILBstwQ4tpzZ JTN+LndcE5ytEb7+IIdxW03gPyRhPzMPjRv+o= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=P3AQyI8fgf4YkFzaCMt1dTakc0v7BHsKTwA1LWtUF+4=; b=qqjbBb5Xz5btOl7Sv5xj2DC4ajLAMrEZY7uiHG7BaAfc4v2xS4ouTb544gQIZIfXh4 9Xhw/ViFj0Tj97F2qE0hCVWb2Nc2R9c6+I+alYIgg1eC/yEJszHkerNL8fcHjG4mxf6n x3i1R5fcLVSl78qX7BE8BuZv1/uWAKzapsKSLcx8sXMCfFmJ00Ed/YJ0ShErfhcg90zE EPVlS2u5ZKu+3DV87VMdny721i8L4ZBS+y+wsn4XVBuzx2w2PM1/F8DEusrjj3wkQe0r RnOWVwqE2GCWEJd1HWV6Is4/fSTNvtLQGGEyyFjVwf3/oKqJ7sPQbex0l0K4NtlfgZ4i EArw== X-Gm-Message-State: APjAAAU2BJC7fxaOUyc2+SsjSrdjwDAqemBstn0uAq+fkzlHs+NCmDUN j8nTOtEDh3V0BZEtMfWAl5O65V9PYJH6qMTYjWA0eqbD X-Google-Smtp-Source: APXvYqyCJ0zBkxLXyyitTYo+GZ6d11XhYgmwVmlh9gkGMVR+1ujSXS4pTil5uMGoPE7MpkHrbIvV93cfSLXVxHzORJA= X-Received: by 2002:a1f:2753:: with SMTP id n80mr4121175vkn.24.1581054794086; Thu, 06 Feb 2020 21:53:14 -0800 (PST) MIME-Version: 1.0 References: <20200204091548.2861-1-somnath.kotur@broadcom.com> In-Reply-To: From: Somnath Kotur Date: Fri, 7 Feb 2020 11:22:50 +0530 Message-ID: To: Gaetan Rivet Cc: dev Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH] eal: fix to set the rte_device ptr's device args before hotplug X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" HI Gaetan, On Thu, Feb 6, 2020 at 9:36 AM Somnath Kotur w= rote: > > Hello Gaetan, > Sorry my bad, i think i captured it for the wrong BDF ... > So the overarching goal here is that i want to create/add a > representor port for 06:02.01 using 06:02:00 as the backing/parent > device on the vswitch. > > I recaptured the debug with 06:02:00 > > Breakpoint 1, pci_name_set (dev=3D0x5d44e00) at > /root/dpdk-19.11/drivers/bus/pci/pci_common.c:65 > 65 dev->name, sizeof(dev->name)); > (gdb) p /x *dev > $3 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, device =3D {next = =3D > {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x0, driver =3D 0x0, bus = =3D > 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x0}, addr =3D {domain =3D 0x0,= bus > =3D 0x6, devid =3D 0x2, function =3D 0x0}, id =3D {class_id =3D 0x20000, > vendor_id =3D 0x14e4, > device_id =3D 0x16dc, subsystem_vendor_id =3D 0x14e4, > subsystem_device_id =3D 0x16d7} > > devargs =3D pci_devargs_lookup(dev); > > p *devargs > $5 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x3a67970 = }, > type =3D RTE_DEVTYPE_WHITELISTED_PCI, policy =3D RTE_DEV_WHITELISTED, nam= e > =3D "0000:06:02.0", '\000' , {args =3D 0x5d1c8a0 > "representor=3D[1]", drv_str =3D 0x5d1c8a0 "representor=3D[1]"}, > bus =3D 0x3a59a00 , cls =3D 0x0, bus_str =3D 0x0, cls_str = =3D > 0x0, data =3D 0x0} > > So as you can see here , this time the devargs is *not NULL* and has > the value of 'representor=3D[1]' > > local_dev_probe (devargs=3D0x5cefec0 "0000:06:02.0,representor=3D[1]", > new_dev=3D0x7ffd3e872758) at > /root/dpdk-19.11/lib/librte_eal/common/eal_common_dev.c:162 > 162 dev =3D da->bus->find_device(NULL, cmp_dev_name, da->name= ); > (gdb) p *da > $7 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x3a67970 = }, > type =3D RTE_DEVTYPE_WHITELISTED_PCI, policy =3D RTE_DEV_WHITELISTED, nam= e > =3D "0000:06:02.0", '\000' , {args =3D 0x5d1c8a0 > "representor=3D[1]", drv_str =3D 0x5d1c8a0 "representor=3D[1]"}, > bus =3D 0x3a59a00 , cls =3D 0x0, bus_str =3D 0x0, cls_str = =3D > 0x0, data =3D 0x0} > > (gdb) p *dev > $8 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x5c6df00 > "0000:06:02.0", driver =3D 0x3a6b7f0 , bus =3D 0x3a59a00 > , numa_node =3D 0, devargs =3D 0x0} > > So it appears that 'find_device' is returning NULL devargs ? > > I found that indeed is the case in pci_find_device() > > pci_find_device (start=3D0x0, cmp=3D0x4b92d8 , > data=3D0x5bb53f8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426 > 426 return &pdev->device; > > gdb) p pdev->device > $2 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x5b14f00 > "0000:06:02.0", driver =3D 0x3a6b7f0 , bus =3D 0x3a59a00 > , numa_node =3D 0, devargs =3D 0x0} > Any updates on this ? My thoughts on this are just as I'd suspected / suggested earlier the parent device (06:02:0) gets probed once during ovs-vswitchd bring up right after these 2 cmds ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=3Dtrue # Now start ovs-vswitchd (--no-ovsdb-server, since it is already started) ovs-ctl --no-ovsdb-server --db-sock=3D"$DB_SOCK" start After this point the pci probe was invoked on the parent device, but without any devargs hence it is NULL. Now when the 'ovs add-port' cmd is issued to add a representor port for example say 06:02:01 (using 06:02:00 as parent device) ovs-vsctl add-port ovsbr0 vfrep1 -- set Interface vfrep1 type=3Ddpdk options:dpdk-devargs=3D0000:06:02.0,representor=3D[1] What happens is *before* pci_name_set() p dev $1 =3D (struct rte_pci_device *) 0x55bd6b0 (gdb) p /x *dev $2 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, device =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x0, driver =3D 0x0, bus =3D 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x0}, addr =3D {domain =3D 0x0, bus =3D 0x6, devid =3D 0x2, function =3D 0x0}.......} *after* pci_name_set() p /x *dev $3 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, device =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x0, driver =3D 0x0, bus =3D 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x55a4ae0},addr =3D {domain =3D 0x0, bus =3D 0x6, devid =3D= 0x2, function =3D 0x0}......} As you can see, 'devargs' for the pci_device is now populated .. But if you go further down in pci_scan_one() pci_scan_one (dirname=3D0x7ffea330a5e0 "/sys/bus/pci/devices/0000:06:02.0", addr=3D0x7ffea330a5d0) at /root/dpdk-int_nxt/drivers/bus/pci/linux/pci.c:351 351 if (!rte_dev_is_probed(&dev2->devic= e)) { (gdb) p dev2 $5 =3D (struct rte_pci_device *) 0x54de5e0 (gdb) p /x *dev2 $6 =3D {next =3D {tqe_next =3D 0x5307460, tqe_prev =3D 0x5539f80}, device = =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x54e4f00, driver = =3D 0x3a6b7f0, bus =3D 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x0}, addr =3D {domai= n =3D 0x0, bus =3D 0x6, devid =3D 0x2, function =3D 0x0} we hit the pci_device ( 0x54de5e0) that was created during the first time probe of the parent device ? Since it is already probed, it goes to line 381 where it does frees the just allocated 'pci_device' inside pci_scan_one() via 'free(dev)' As you can see this pci_device does not have it's devargs set (rightly so as initially there were no devargs for this device) But as shown in the stack trace in my previous reply, when pci_find_device() walks the rte_pci_devices list , it finds this earlier probed device (without devargs) pci_find_device (start=3D0x0, cmp=3D0x4b92d8 , data=3D0x55a4af8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426 426 return &pdev->device; (gdb) p pdev $15 =3D (struct rte_pci_device *) 0x54de5e0 (gdb) p /x *pdev $16 =3D {next =3D {tqe_next =3D 0x5307460, tqe_prev =3D 0x5539f80}, device = =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x54e4f00, driver = =3D 0x3a6b7f0, bus =3D 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x0}, addr =3D {domai= n =3D 0x0, bus =3D 0x6, devid =3D 0x2, function =3D 0x0}, Hope this explains why the pci_device has NULL devargs at this point and how my fix to set it at this point helps solve the problem? Please let me know your thoughts Thanks Som > Som > > On Wed, Feb 5, 2020 at 2:47 PM Gaetan Rivet wrote: > > > > On 05/02/2020 09:52, Somnath Kotur wrote: > > > Hello Gaetan, > > > > > > > > > On Tue, Feb 4, 2020 at 3:19 PM Gaetan Rivet wrote: > > >> > > >> On 04/02/2020 10:15, Somnath Kotur wrote: > > >>> As per the comments in this code section, "since there is a matchin= g device, > > >>> it is now its responsibility to manage the devargs we've just inser= ted." > > >>> But the matching device ptr's devargs is still uninitialized or not= pointing > > >>> to the newest dev_args that were passed as a parameter to local_dev= _probe(). > > >>> This is needed particularly in the case when *probe is called again= * on an > > >>> already probed device(the parent device for the representor) as par= t of adding > > >>> a representor port to an OVS switch(OVS-DPDK) like so: > > >>> ovs-vsctl add-port ovsbr0 vfrep1 -- set Interface vfrep1 type=3Ddpd= k \ > > >>> options:dpdk-devargs=3D0000:06:02.0,representor=3D[1] > > >>> > > >>> Fixes: 7e8b26650146 ("eal: fix hotplug add / remove") > > >>> > > >>> Signed-off-by: Somnath Kotur > > >>> --- > > >>> lib/librte_eal/common/eal_common_dev.c | 1 + > > >>> 1 file changed, 1 insertion(+) > > >>> > > >>> diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_ea= l/common/eal_common_dev.c > > >>> index 9e4f09d..311eef5 100644 > > >>> --- a/lib/librte_eal/common/eal_common_dev.c > > >>> +++ b/lib/librte_eal/common/eal_common_dev.c > > >>> @@ -171,6 +171,7 @@ static int cmp_dev_name(const struct rte_device= *dev, const void *_name) > > >>> * those devargs shouldn't be removed manually anymore. > > >>> */ > > >>> > > >>> + dev->devargs =3D da; > > >>> ret =3D dev->bus->plug(dev); > > >>> if (ret > 0) > > >>> ret =3D -ENOTSUP; > > >>> > > >> > > >> Hello Somnath, > > >> > > >> On a surface level, the fix does not seem correct. > > >> The comment > > >> > > >> "Since there is a matching device, it is now its responsibility > > >> to manage the devargs we've just inserted. From this point, > > >> those devargs should'nt be removed manually anymore." > > >> > > >> means that the err_devarg label is not correct, on further error in = the > > >> function, returning the error code and cleaning up the device is suf= ficient. > > >> > > >> Setting the devargs for a device is the responsibility of the bus sc= an function. > > >> In the PCI bus for example, this is done in pci_name_set(), called o= nce a device name is fully qualified > > >> after scanning the system and thus being able to match a devargs to = the device name. > > >> > > >> Can you please give more information about the device bus, and maybe= trace the path taken > > >> by the line "ret =3D da->bus->scan();" a few lines above your edit? = If your dev->devargs is not set > > >> afterward, it seems the bug would be there. > > >> > > > Sure, here is the stack trace from the pci_name_set() > > > pci_name_set (dev=3D0x570eab0) at > > > /root/dpdk-int_nxt/drivers/bus/pci/pci_common.c:65 > > > 65 dev->name, sizeof(dev->name)); > > > (gdb) p /x *dev > > > $1 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, device =3D {ne= xt =3D > > > {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x0, driver =3D 0x0, b= us =3D > > > 0x3a59a00, numa_node =3D 0x0, devargs =3D 0x0}, addr =3D {domain =3D = 0x0, bus > > > =3D 0x6, devid =3D 0x2, function =3D 0x1}, id =3D {class_id =3D 0x200= 00, > > > vendor_id =3D 0x14e4, > > > device_id =3D 0x16dc, subsystem_vendor_id =3D 0x14e4, > > > subsystem_device_id =3D 0x16d7}......... > > > (gdb) bt > > > #0 pci_name_set (dev=3D0x570eab0) at > > > /root/dpdk19.11/drivers/bus/pci/pci_common.c:65 > > > #1 0x0000000000457b69 in pci_scan_one (dirname=3D0x7ffdb46ef230 > > > "/sys/bus/pci/devices/0000:06:02.1", addr=3D0x7ffdb46ef220) at > > > /root/dpdk-19.11/drivers/bus/pci/linux/pci.c:305 > > > #2 0x0000000000458188 in rte_pci_scan () at > > > /root/dpdk-int_nxt/drivers/bus/pci/linux/pci.c:488 > > > #3 0x00000000004b9530 in local_dev_probe (devargs=3D0x527cc10 > > > "0000:06:02.0,representor=3D[1]", new_dev=3D0x7ffdb46f02b8) at > > > /root/dpdk-19.11/lib/librte_eal/common/eal_common_dev.c:158 > > > #4 0x00000000004b9726 in rte_dev_probe (devargs=3D0x527cc10 > > > "0000:06:02.0,representor=3D[1]") at > > > /root/dpdk-int_nxt/lib/librte_eal/common/eal_common_dev.c:227 > > > #5 0x00000000033d196a in netdev_dpdk_process_devargs > > > (dev=3D0x2000e4800, devargs=3D0x527cc10 "0000:06:02.0,representor=3D[= 1]", > > > errp=3D0x7ffdb46f0478) at lib/netdev-dpdk.c:1798 > > > #6 0x00000000033d1d94 in netdev_dpdk_set_config (netdev=3D0x2000e488= 0, > > > args=3D0x56edf10, errp=3D0x7ffdb46f0478) at lib/netdev-dpdk.c:1921 > > > #7 0x00000000032c2e48 in netdev_set_config (netdev=3D0x2000e4880, > > > args=3D0x56edf10, errp=3D0x7ffdb46f0598) at lib/netdev.c:494 > > > #8 0x00000000031e9a4e in iface_set_netdev_config > > > (iface_cfg=3D0x56edc90, netdev=3D0x2000e4880, errp=3D0x7ffdb46f0598) = at > > > vswitchd/bridge.c:2015 > > > #9 0x00000000031e9bd1 in iface_do_create (br=3D0x52801d0, > > > iface_cfg=3D0x56edc90, ofp_portp=3D0x7ffdb46f05a4, netdevp=3D0x7ffdb4= 6f05a8, > > > errp=3D0x7ffdb46f0598) at vswitchd/bridge.c:2049 > > > #10 0x00000000031e9da6 in iface_create (br=3D0x52801d0, > > > iface_cfg=3D0x56edc90, port_cfg=3D0x56bae30) at vswitchd/bridge.c:210= 0 > > > #11 0x00000000031e7494 in bridge_add_ports__ (br=3D0x52801d0, > > > wanted_ports=3D0x52802b0, with_requested_port=3Dfalse) at > > > vswitchd/bridge.c:1164 > > > #12 0x00000000031e7525 in bridge_add_ports (br=3D0x52801d0, > > > wanted_ports=3D0x52802b0) at vswitchd/bridge.c:1180 > > > #13 0x00000000031e6a4f in bridge_reconfigure (ovs_cfg=3D0x52b2c40) at > > > vswitchd/bridge.c:893 > > > #14 0x00000000031ed37c in bridge_run () at vswitchd/bridge.c:3324 > > > #15 0x00000000031f2977 in main (argc=3D11, argv=3D0x7ffdb46f0878) at > > > vswitchd/ovs-vswitchd.c:127 > > > > > >> It could be the bus that is not properly implemented, the devargs na= me not fully qualified, a special path taken by a vdev (I'm not sure what t= he ovsbr0 device is for example), or a specific representor pluging functio= n, difficult to pinpoint exactly right now. > > > As you can see , the DBDF along with the 'representor=3D[1]' is passe= d > > > all the way down to local_dev_probe(). However when i traced further > > > inside pci_name_set() =3D>pci_devargs_lookup() > > > > > > static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *= dev) > > > { > > > struct rte_devargs *devargs; > > > struct rte_pci_addr addr; > > > > > > RTE_EAL_DEVARGS_FOREACH("pci", devargs) { > > > devargs->bus->parse(devargs->name, &addr); > > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D> pci_parse() [1] > > > > > > if (!rte_pci_addr_cmp(&dev->addr, &addr)) > > > return devargs; > > > } > > > return NULL; > > > } > > > > > > [1]: This is the value of dev_args at this point in time - > > > p *devargs > > > $2 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x3a67970 }, > > > type =3D RTE_DEVTYPE_WHITELISTED_PCI, policy =3D RTE_DEV_WHITELISTED,= name > > > =3D "0000:06:02.0", '\000' , {args =3D 0x56eec00 > > > "representor=3D[1]", drv_str =3D 0x56eec00 "representor=3D[1]"}, > > > bus =3D 0x3a59a00 , cls =3D 0x0, bus_str =3D 0x0, cls= _str =3D > > > 0x0, data =3D 0x0} > > > > > > It has the 'representor' keyword still intact, but what gets passed t= o > > > pci_parse() is pci_parse (name=3D0x56d76b8 "0000:06:02.0", > > > addr=3D0x7ffdb46ed180) i.e just the original DBDF of the function > > > that was already probed once during OVS-DPDK init... I'm not sure if > > > that is the reason , but eventually it walks the entire for_each loop > > > and returns NULL ...i.e pci_devargs_lookup() returns NULL ... So this > > > is where the 'devargs' value passed is getting lost > > > Hope that gives what you were looking for ? > > > > > > > > > Thanks > > > Som > > > > > > > Hello, > > > > Yes, thank you for the thorough response. > > Your second device has the BDF 06:02.01 while its parent has 06:02.00. = The parent device already matched the devargs and its rte_device should alr= eady point to it. > > > > I think it is not correct to force link a parent device devargs to virt= ual functions spawned from it. > > The devargs is NULL, because there are none that were used to spawn the= virtual functions. > > > > Now, can you describe your overarching issue more precisely? You are ab= le to call rte_dev_probe twice on the same virtual function, because the ch= eck against it does not work without the devargs being linked? If that's co= rrect, this is the real bug IMO, and there could be other ways to mitigate = this. > > > > Thanks > > Gaetan