From mboxrd@z Thu Jan 1 00:00:00 1970
Envelope-to: fengchengwen@huawei.com, yuying.zhang@intel.com, aman.deep.singh@intel.com, dev@dpdk.org, xiaoyun.li@intel.com, andrew.rybchenko@oktetlabs.ru, thomas@monjalon.net
Date: Fri, 20 May 2022 16:14:58 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0
Subject: Re: [PATCH] app/testpmd: remove invalid ports when other process detach
Content-Language: en-US
From: Ferruh Yigit
To: fengchengwen
CC: Andrew Rybchenko, Thomas Monjalon
References: <20220302023326.16509-1-fengchengwen@huawei.com> <5569707.V25eIC5XRa@thomas> <31263d7d-b880-ff29-70d9-2a8f75b0bb45@huawei.com> <80bdb6d3-8259-80b1-3733-ef19b6b4f840@amd.com>
In-Reply-To: <80bdb6d3-8259-80b1-3733-ef19b6b4f840@amd.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

On 5/20/2022 4:05 PM, Ferruh Yigit wrote:
> On 3/2/2022 8:36 AM, fengchengwen wrote:
>> On 2022/3/2 16:26, Thomas Monjalon wrote:
>>> 02/03/2022 03:33, Chengwen Feng:
>>>> Start main and secondary process:
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>>     --num-procs=2 --proc-id=0
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>>     --num-procs=2 --proc-id=1
>>>> Execute following command in main process:
>>>>     port stop 0
>>>>     port detach 0
>>>> Execute following command in secondary process:
>>>>     set fwd mac
>>>>     start
>>>> The secondary process will display:
>>>>     Invalid port_id=0
>>>>     telcore 19 called rx_pkt_burst for not ready port 0
>>>>     stpmd> 8: [/lib64/libc.so.6(+0xdf600) [0xffff9e1dc600]]
>>>>     7: [/lib64/libpthread.so.0(+0x7c48) [0xffff9e28ac48]]
>>>>     6: [/usr/app/testpmd(eal_thread_loop+0x2c4) [0xb23574]]
>>>>     5: [/usr/app/testpmd() [0x9c21d8]]
>>>>     4: [/usr/app/testpmd() [0x9c2108]]
>>>>     3: [/usr/app/testpmd() [0x9b6cf0]]
>>>>     2: [/usr/app/testpmd() [0xad8620]]
>>>>     1: [/usr/app/testpmd(rte_dump_stack+0x20) [0xb1a130]]
>>>>
>>>> The root cause is that the secondary process has not removed invalid
>>>> ports when it processes RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why are the ports not removed?
>>
>
> This refers to application (testpmd) level state; the ethdev port is
> removed.
>
> The problem described above has been present since testpmd secondary
> process support was added.
> When the primary hot-removes a device, it updates the relevant
> application state too, but in the secondary process ethdev removes the
> device without the secondary process updating its state.
>
> I agree that using ethdev events is appropriate for this case: since
> the action originates from ethdev for the secondary process, we need a
> way to notify the application about it.
> Another option would be checking for and removing invalid ports in the
> secondary before each forwarding start, but since we already have an
> event for destroy, using it looks good to me.
>

Closing a port in the primary has the same problem: the secondary
process state is not updated, and the secondary crashes.

A high-level design decision could be to reduce the application-level
state and rely more on ethdev state, to reduce or remove the need to
keep ethdev and application state in sync.
@Aman, @Yuying, what do you think about this as a high-level, long-term
goal?
(For example, why is there a testpmd 'RTE_PORT_CLOSED' state; should it
be in ethdev?)
((BTW, why does the testpmd-defined 'RTE_PORT_CLOSED' have an 'RTE_'
prefix? @Aman, can you do a cleanup for it, 'RTE_' -> 'TESTPMD_'?))

>
>> Testpmd registers the function eth_event_callback to handle the DESTROY
>> event; currently it only assigns RTE_PORT_CLOSED to
>> ports[port_id].port_status and doesn't update other global variables
>> like nb_ports.
>>
>>>
>>>> This patch adds a delayed remove_invalid_ports() call when processing
>>>> the RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why do we need this delay?
>>
>> remove_invalid_ports() scans rte_eth_devices[]. While the DESTROY event
>> is being processed, rte_eth_devices[x] is still valid, so we need delay
>> logic here.
>>
>
> There is a dependency problem in the DESTROY event.
>
> We need some ethdev information to be able to deliver the event, so
> ethdev can't be completely destroyed until the DESTROY event is
> processed. This means that when the application receives the DESTROY
> event, ethdev is not fully destroyed yet.
>
> Apart from the above, 'remove_invalid_ports()' is called twice for the
> primary process (as mentioned in the commit log). The timer may help
> keep these two calls from conflicting, but I am not sure we can rely on
> a timer for this.
> Since the problem mainly affects the secondary process, and assuming
> control commands such as detaching a device are issued by the primary
> process, @Feng, what do you think about adding a secondary process
> check in 'remove_invalid_ports_callback()'? That way
> 'remove_invalid_ports()' is called only once for the primary process.
>
>>>
>>> [...]
>>>> +static void
>>>> +remove_invalid_ports_callback(void *arg)
>>>> +{
>>>> +   RTE_SET_USED(arg);
>>>> +   remove_invalid_ports();
>>>> +}
>>> [...]
>>>>     case RTE_ETH_EVENT_DESTROY:
>>>>             ports[port_id].port_status = RTE_PORT_CLOSED;
>>>>             printf("Port %u is closed\n", port_id);
>>>> +           if (rte_eal_alarm_set(100000, remove_invalid_ports_callback,
>>>> +                           (void *)(intptr_t)port_id))
>>>> +                   fprintf(stderr,
>>>> +                           "Could not set up deferred device released\n");
>>>>             break;
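
To make the secondary process check suggested above concrete, here is a
rough standalone sketch. It is only a model and not testpmd code: the
'proc_type' variable, the 'dev_valid[]' table and 'MAX_PORTS' are
hypothetical stand-ins, so it builds without DPDK. In testpmd the check
would presumably use rte_eal_process_type() and the validity test would
be rte_eth_dev_is_valid_port(); the real remove_invalid_ports() also
updates more state than the single port id list modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical table size, for illustration only */

/* Stand-in for the value rte_eal_process_type() would return. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };
static enum proc_type proc_type = PROC_SECONDARY;

/* Stand-in for rte_eth_dev_is_valid_port(): true while the ethdev exists. */
static bool dev_valid[MAX_PORTS] = { true, true, false, false };

/* Simplified model of the application-level port list. */
static uint16_t ports_ids[MAX_PORTS] = { 0, 1 };
static unsigned int nb_ports = 2;

/* Drop every port id whose ethdev no longer exists, keeping the order. */
static void
remove_invalid_ports(void)
{
    unsigned int i, kept = 0;

    for (i = 0; i < nb_ports; i++) {
        assert(ports_ids[i] < MAX_PORTS);
        if (dev_valid[ports_ids[i]])
            ports_ids[kept++] = ports_ids[i];
    }
    nb_ports = kept;
}

/*
 * Deferred callback. The primary is assumed to clean up its own state in
 * the detach path, so only the secondary acts here.
 */
static void
remove_invalid_ports_callback(void *arg)
{
    (void)arg;
    if (proc_type != PROC_SECONDARY)
        return;
    remove_invalid_ports();
}
```

In this model, once port 0 has been marked invalid, the callback leaves
the list untouched in the primary and shrinks it to just port 1 in the
secondary, so the primary does not run the cleanup a second time.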