From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 96E0A42CBF; Thu, 15 Jun 2023 04:21:50 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1F39B42DAD; Thu, 15 Jun 2023 04:21:50 +0200 (CEST) Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by mails.dpdk.org (Postfix) with ESMTP id 43BF042D82 for ; Thu, 15 Jun 2023 04:21:47 +0200 (CEST) Received: from kwepemm600004.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4QhQyJ342VzTlc0; Thu, 15 Jun 2023 10:21:12 +0800 (CST) Received: from [10.67.103.231] (10.67.103.231) by kwepemm600004.china.huawei.com (7.193.23.242) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 15 Jun 2023 10:21:44 +0800 Message-ID: Date: Thu, 15 Jun 2023 10:21:40 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: [PATCH V4 1/5] drivers/bus: restore driver assignment at front of probing To: Ferruh Yigit , , CC: , , , liuyonglong References: <20220825024425.10534-1-lihuisong@huawei.com> <20221206092649.8287-1-lihuisong@huawei.com> <20221206092649.8287-2-lihuisong@huawei.com> <79192ffa-eb80-e56c-5c64-4a22374c7a1a@amd.com> <461a4536-fbbd-3848-af8d-2aa1ebaf8ccb@huawei.com> <5d06f478-6142-651b-d54b-0a3c75e9e7a0@amd.com> <80f82803-0ea0-adf6-4dd6-ccafc6643fbd@huawei.com> From: "lihuisong (C)" In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.103.231] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600004.china.huawei.com (7.193.23.242) X-CFilter-Loop: Reflected X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Ferruh, add the call stack as belows. 在 2023/6/7 18:11, lihuisong (C) 写道: > > 在 2023/6/7 0:12, Ferruh Yigit 写道: >> On 2/28/2023 2:21 AM, lihuisong (C) wrote: >>> 在 2023/2/16 0:09, Ferruh Yigit 写道: >>>> On 1/12/2023 2:44 AM, lihuisong (C) wrote: >>>>> 在 2023/1/11 20:51, Ferruh Yigit 写道: >>>>>> On 12/6/2022 9:26 AM, Huisong Li wrote: >>>>>>> The driver assignment was moved back at the end of the device >>>>>>> probing >>>>>>> because there is no something to use rte_driver during the phase of >>>>>>> probing. See commit 391797f04208 ("drivers/bus: move driver >>>>>>> assignment >>>>>>> to end of probing") >>>>>>> >>>>>>> However, it is necessary for probing callback to reference >>>>>>> rte_driver >>>>>>> before probing. For example, probing callback may call some APIs >>>>>>> which >>>>>>> access the rte_pci_driver::driver by the device::driver pointer >>>>>>> to get >>>>>>> driver information. In this case, a segment fault will occur in >>>>>>> probing >>>>>>> callback if there is not this assignment. >>>>>>> >>>>>> Probing callback gets driver as parameter, so callback function can >>>>>> access it via 'drv->driver', is there a specific usecase that >>>>>> 'dev->device->driver' needs to be accessed explicitly? >>>>>> >>>>>> I assume this is related to coming patches that setting up device in >>>>>> testpmd event callback, but can you please clarify exact need. >>>>> For example, rte_eth_dev_info_get is called in this event callback >>>>> to get >>>>> driver name. >>>>> >>> Hi Ferruh, >>> >>> Sorry for the delay. I missed this email. >>> Thanks for your review. >>>> Why 'rte_eth_dev_info_get()' is called in the event called at first >>>> place? >>> There is no limitation on rte_eth_dev_info_get() in event callback. >>> The upper layer is entirely possible to use. >>> After all, it is more convenient for app to use this pointer to get >>> driver information in a probing callback. >>> >> I think there is a logical limit in calling 'rte_eth_dev_info_get()' in >> NEW event callback, we may document this if it is missing. >> >> The NEW event callback is to let application do proper tasks before port >> addition finalized in the ethdev layer, but you are trying to call >> 'rte_eth_dev_info_get()' in the callback when it is not ready yet. > We probably shouldn't add this logical limit. > In terms of application perception, app may need to do something when > a new or destory event is received, instead of just for printing a > message. In addition, if no this patch(1/5), a segment fault will occur when attach a new device. The call stack is as belows: --> testpmd> port attach 0000:7d:00.0 Attaching a new port... EAL: Using IOMMU type 1 (Type 1) EAL: Ignore mapping IO port bar(1) EAL: Ignore mapping IO port bar(3) EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0) Thread 1 "dpdk-testpmd" received signal SIGSEGV, Segmentation fault. rte_eth_dev_info_get (port_id=0, dev_info=0x17f95ba80) at ../lib/ethdev/rte_ethdev.c:3708 3708        dev_info->driver_name = dev->device->driver->name; Missing separate debuginfos, use: dnf debuginfo-install libatomic-10.3.1-10.oe2203.aarch64 libnl3-3.5.0-4.oe2203.aarch64 openssl-libs-1.1.1m-1.oe2203.aarch64 zlib-1.2.11-19.oe2203.aarch64 (gdb) bt #0  rte_eth_dev_info_get (port_id=0, dev_info=0x17f95ba80)     at ../lib/ethdev/rte_ethdev.c:3708 #1  0x00000000005e6038 in eth_dev_info_get_print_err (port_id=0, dev_info=0x17f95ba80)     at ../app/test-pmd/util.c:441 #2  0x00000000005d5654 in init_config_port_offloads (pid=0, socket_id=0)     at ../app/test-pmd/testpmd.c:1632 #3  0x00000000005d5ec0 in reconfig (new_port_id=0, socket_id=0)     at ../app/test-pmd/testpmd.c:1828 #4  0x00000000005dafa4 in setup_attached_port (pi=0) at ../app/test-pmd/testpmd.c:3639 #5  0x00000000005dbd14 in eth_event_callback (port_id=0, type=RTE_ETH_EVENT_NEW,     param=0x0, ret_param=0x0) at ../app/test-pmd/testpmd.c:3959 #6  0x00000000008677dc in rte_eth_dev_callback_process (dev=0x1826fd80 ,     event=RTE_ETH_EVENT_NEW, ret_param=0x0) at ../lib/ethdev/ethdev_driver.c:188 #7  0x0000000000867898 in rte_eth_dev_probing_finish (dev=0x1826fd80 )     at ../lib/ethdev/ethdev_driver.c:212 #8  0x00000000055621c8 in rte_eth_dev_pci_generic_probe (pci_dev=0x18500470,     private_data_size=8448, dev_init=0x5571454 )     at ../lib/ethdev/ethdev_pci.h:139 #9  0x00000000055717bc in eth_hns3_pci_probe (pci_drv=0x17e46728 ,     pci_dev=0x18500470) at ../drivers/net/hns3/hns3_ethdev.c:6577 #10 0x0000000000970474 in rte_pci_probe_one_driver (dr=0x17e46728 ,     dev=0x18500470) at ../drivers/bus/pci/pci_common.c:312 #11 0x00000000009706e0 in pci_probe_all_drivers (dev=0x18500470)     at ../drivers/bus/pci/pci_common.c:396 #12 0x0000000000970f78 in pci_plug (dev=0x18500480) at ../drivers/bus/pci/pci_common.c:670 #13 0x00000000008e31d4 in local_dev_probe (devargs=0xffffffffb6f0 "0000:7d:00.0",     new_dev=0xffffffff9478) at ../lib/eal/common/eal_common_dev.c:212 #14 0x00000000008e3330 in rte_dev_probe (devargs=0xffffffffb6f0 "0000:7d:00.0")     at ../lib/eal/common/eal_common_dev.c:264 #15 0x00000000005daeb4 in attach_port (identifier=0xffffffffb6f0 "0000:7d:00.0")     at ../app/test-pmd/testpmd.c:3608 #16 0x0000000000579a74 in cmd_operate_attach_port_parsed (parsed_result=0xffffffffb5f0,     cl=0x184f9010, data=0x0) at ../app/test-pmd/cmdline.c:1194 --Type for more, q to quit, c to continue without paging-- #17 0x00000000008624a4 in __cmdline_parse (cl=0x184f9010,     buf=0x184f9058 "port attach 0000:7d:00.0\n", call_fn=true)     at ../lib/cmdline/cmdline_parse.c:294 #18 0x00000000008624dc in cmdline_parse (cl=0x184f9010,     buf=0x184f9058 "port attach 0000:7d:00.0\n") at ../lib/cmdline/cmdline_parse.c:302 #19 0x0000000000860308 in cmdline_valid_buffer (rdl=0x184f9020,     buf=0x184f9058 "port attach 0000:7d:00.0\n", size=26) at ../lib/cmdline/cmdline.c:24 #20 0x000000000086588c in rdline_char_in (rdl=0x184f9020, c=10 '\n')     at ../lib/cmdline/cmdline_rdline.c:444 #21 0x0000000000860750 in cmdline_in (cl=0x184f9010,     buf=0xfffffffff77f "\n\220\367\377\377\377\377", size=1)     at ../lib/cmdline/cmdline.c:146 #22 0x0000000000860a14 in cmdline_interact (cl=0x184f9010) at ../lib/cmdline/cmdline.c:226 #23 0x0000000000587908 in prompt () at ../app/test-pmd/cmdline.c:13065 #24 0x00000000005ddaf4 in main (argc=6, argv=0xfffffffffad8)     at ../app/test-pmd/testpmd.c:4690 >> >>>> This set updates multiple things to extend 'RTE_ETH_EVENT_NEW' event >>>> callback function support, like: >>>> >>>> - Assign device driver *before* probing completed, so that even >>>> callback >>>> can run 'rte_eth_dev_info_get()' >>>> >>>> - Add a new ethdev state so that port can be recognized as valid >>>> port in >>>> the even callback. >>> Yes >>>> - Stop forwarding implicitly in even callback in case event >>>> callback run >>>> while forwarding is on. >>> But this is a modification for tesptmd to make it work well. >>>> All looks to me hack/complexity to make a specific case work, which is >>>> make secondary *testmp* application work with attached/detached >>>> device. >>> It is not just a testpmd application case, but, I think, still >>> exists in >>> actual case. >>> Because eal lib supports hotplugging device on primary and secondary >>> process and the communication each other when attach or detach device. >>> The reason why no one has ever put forward this question, I think it >>> may >>> be attributed to the fact that the scene is not or rarely tested. >>>> And finally patch 4/5 adds port setup to testpmd event callback for >>>> this. >>>> >>>> >>>> I understand the intention, but I disagree with bus and ethdev level >>>> changes for this. >>> I'm just raising this issue, and we can discuss how to deal with it >>> together.😁 >>>> Event callback may not be only way to share port attach/detach >>>> information between primary and secondary, there is a MP socket and >>>> 'rte_mp_handle' thread to handle communication between primary and >>>> secondary process, this should be able to use carrying device >>>> information, as far as I remember this is why it is introduced at >>>> first >>>> place. >>>> >>>> Did you consider using MP socket for your use case? >>> Actually, the probing event callback called in patch 4/5 is the result >>> of the MP socket communication (please see hogplug_mp.c). >>>> Following is a sample usage: >>>> >>>> Primary: >>>> started as: >>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 0-1 >>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=0 >>>> >>>> `` >>>> testpmd> show port summary all >>>> Number of available ports: 0 >>>> Port MAC Address       Name         Driver         Status Link >>>> >>>> testpmd> port attach net_null0 >>>> Attaching a new port... >>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 ) >>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 ) >>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 ) >>>> vdev_probe_all_drivers(): Search driver to probe device net_null0 >>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0 >>>> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet >>>> copy >>>> is disabled >>>> eth_dev_null_create(): Creating null ethdev on numa socket 0 >>>> EAL: request: eal_dev_mp_request >>>> EAL: msg: bus_vdev_mp >>>> vdev_action(): send vdev, net_null0 >>>> EAL: sendmsg: bus_vdev_mp >>>> EAL: reply: bus_vdev_mp >>>> EAL: msg: eal_dev_mp_request >>>> Port 0 is attached. Now total ports is 1 >>>> Done >>>> >>>> testpmd> show port summary all >>>> Number of available ports: 1 >>>> Port MAC Address       Name         Driver         Status Link >>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down 10 Gbps >>>> >>>> testpmd> port detach 0 >>>> Port was not closed >>>> Removing a device... >>>> EAL: request: eal_dev_mp_request >>>> EAL: msg: eal_dev_mp_request >>>> eth_dev_close(): Closing null ethdev on NUMA socket 0 >>>> Port 0 is closed >>>> Device is detached >>>> Now total ports is 0 >>> Please note the log *"Now total ports is 0"*, >>> which indicates the port number is updated if we detached device in >>> this >>> process. >>> >> Yes, as expected. > Yeah, it is just expected for the process did attatch or dettach > operation. > Because the thread for updating the port number info (in > setup_attached_port) and doing probe thread is the same. > please see attach_port(). > This is ok for testpmd that does not support multiple processes. But > testpmd support multiple processes now. >> >> >>>> Done >>>> >>>> testpmd> show port summary all >>>> Number of available ports: 0 >>>> Port MAC Address       Name         Driver         Status Link >>>> testpmd> >>>> >>>> `` >>>> >>>> Secondary: >>>> started as: >>>> sudo ./build/app/dpdk-testpmd --no-pci --proc-type=auto -l 2-3 >>>> --log-level=*:debug -- -i --num-procs=2 --proc-id=1 >>>> >>>> `` >>>> testpmd> show port summary all >>>> Number of available ports: 0 >>>> Port MAC Address       Name         Driver         Status Link >>>> >>>> testpmd> EAL: msg: eal_dev_mp_request >>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 ) >>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 ) >>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 ) >>>> EAL: request: bus_vdev_mp >>>> EAL: msg: bus_vdev_mp >>>> vdev_action(): receive vdev, net_null0 >>>> EAL: msg: bus_vdev_mp >>>> vdev_scan(): Received 1 vdevs >>>> vdev_probe_all_drivers(): Search driver to probe device net_null0 >>>> rte_pmd_null_probe(): Initializing pmd_null for net_null0 >>>> EAL: reply: eal_dev_mp_request >>>> >>>> testpmd> show port summary all >>>> Number of available ports: 1 >>>> Port MAC Address       Name         Driver         Status Link >>>> 0    DE:E5:79:00:A9:68 net_null0    net_null       down 10 Gbps >>>> >>>> testpmd> EAL: msg: eal_dev_mp_request >>>> dpaa: rte_dpaa_bus_parse(): Parse device name (net_null0 ) >>>> fslmc: rte_fslmc_parse(): Parsing dev=(net_null0 ) >>>> fslmc: rte_fslmc_parse(): Unknown or unsupported device (net_null0 ) >>>> eth_dev_close(): Closing null ethdev on NUMA socket 4294967295 >>>> Port 0 is closed >>>> EAL: reply: eal_dev_mp_request >>> But the port number in this process does not be updated after finishing >>> the destroy event. >>> That's the problem. >>> >> Please see following command output, >> it says "Number of available ports: 0", > Here comes from rte_eth_dev_count_avail(). >> so that shows port number is updated after destroy event, isn't it? > no. > There are three global variables,  "fwd_ports_ids[]", "ports[]", > "nb_ports" maintained in testpmd. > The secondary doesn't update these info after receiving a destory or > new message in your test. >> >> >>>> testpmd> show port summary all >>>> Number of available ports: 0 >>>> Port MAC Address       Name         Driver         Status Link >>>> testpmd> >>>> `` >>>> >>>> >>>> . >> . > .