From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 4396F432B0;
	Mon,  6 Nov 2023 14:01:25 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id C9FFD402BA;
	Mon,  6 Nov 2023 14:01:24 +0100 (CET)
Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189])
 by mails.dpdk.org (Postfix) with ESMTP id 9B1D9402B6
 for <dev@dpdk.org>; Mon,  6 Nov 2023 14:01:23 +0100 (CET)
Received: from dggpeml100024.china.huawei.com (unknown [172.30.72.55])
 by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4SPBFM3tzVzMmV9;
 Mon,  6 Nov 2023 20:56:55 +0800 (CST)
Received: from [10.67.121.161] (10.67.121.161) by
 dggpeml100024.china.huawei.com (7.185.36.115) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.2507.31; Mon, 6 Nov 2023 21:01:21 +0800
Subject: Re: [PATCH v2 5/7] app/testpmd: add error recovery usage demo
To: "lihuisong (C)" <lihuisong@huawei.com>, <thomas@monjalon.net>,
 <ferruh.yigit@amd.com>, <konstantin.ananyev@huawei.com>,
 <ajit.khaparde@broadcom.com>, Aman Singh <aman.deep.singh@intel.com>, Yuying
 Zhang <yuying.zhang@intel.com>
CC: <dev@dpdk.org>, <andrew.rybchenko@oktetlabs.ru>,
 <kalesh-anakkur.purayil@broadcom.com>, <Honnappa.Nagarahalli@arm.com>
References: <20230301030610.49468-1-fengchengwen@huawei.com>
 <20231020100746.31520-1-fengchengwen@huawei.com>
 <20231020100746.31520-6-fengchengwen@huawei.com>
 <c0a86fa9-e278-a686-7778-04aa97b9c8e2@huawei.com>
From: fengchengwen <fengchengwen@huawei.com>
Message-ID: <839b4c9e-6a2e-1264-72f7-82c911cd82ee@huawei.com>
Date: Mon, 6 Nov 2023 21:01:21 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
 Thunderbird/68.11.0
MIME-Version: 1.0
In-Reply-To: <c0a86fa9-e278-a686-7778-04aa97b9c8e2@huawei.com>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Originating-IP: [10.67.121.161]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
 dggpeml100024.china.huawei.com (7.185.36.115)
X-CFilter-Loop: Reflected
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Hi Huisong,

On 2023/11/1 12:08, lihuisong (C) wrote:
> 
> On 2023/10/20 18:07, Chengwen Feng wrote:
>> This patch adds error recovery usage demo which will:
>> 1. stop packet forwarding when the RTE_ETH_EVENT_ERR_RECOVERING event
>>     is received.
>> 2. restart packet forwarding when the RTE_ETH_EVENT_RECOVERY_SUCCESS
>>     event is received.
>> 3. prompt the ports that fail to recover and need to be removed when
>>     the RTE_ETH_EVENT_RECOVERY_FAILED event is received.
> Why not suggest trying to call dev_reset() or some other way to recover?

This has already been discussed many times, and it is the reason the
RTE_ETH_EVENT_RECOVERY_XXX events were introduced. Please refer to the
previous thread.

>>
>> In addition, a message is added to the printed information, requiring
>> no command to be executed during the error recovery.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>> ---
>>   app/test-pmd/testpmd.c | 80 ++++++++++++++++++++++++++++++++++++++++++
>>   app/test-pmd/testpmd.h |  4 ++-
>>   2 files changed, 83 insertions(+), 1 deletion(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 595b77748c..39a25238e5 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -3942,6 +3942,77 @@ rmv_port_callback(void *arg)
>>           start_packet_forwarding(0);
>>   }
>>   +static int need_start_when_recovery_over;
>> +
>> +static bool
>> +has_port_in_err_recovering(void)
>> +{
>> +    struct rte_port *port;
>> +    portid_t pid;
>> +
>> +    RTE_ETH_FOREACH_DEV(pid) {
>> +        port = &ports[pid];
>> +        if (port->err_recovering)
>> +            return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +static void
>> +err_recovering_callback(portid_t port_id)
>> +{
>> +    if (!has_port_in_err_recovering())
>> +        printf("Please stop executing any commands until recovery result events are received!\n");
>> +
>> +    ports[port_id].err_recovering = 1;
>> +    ports[port_id].recover_failed = 0;
>> +
>> +    /* To simplify implementation, stop forwarding regardless of whether the port is used. */
>> +    if (!test_done) {
>> +        printf("Stop packet forwarding because some ports are in error recovering!\n");
>> +        stop_packet_forwarding();
>> +        need_start_when_recovery_over = 1;
>> +    }
>> +}
>> +
>> +static void
>> +recover_success_callback(portid_t port_id)
>> +{
>> +    ports[port_id].err_recovering = 0;
>> +    if (has_port_in_err_recovering())
>> +        return;
>> +
>> +    if (need_start_when_recovery_over) {
>> +        printf("Recovery success! Restart packet forwarding!\n");
>> +        start_packet_forwarding(0);
> s/start_packet_forwarding(0)/start_packet_forwarding() ?

start_packet_forwarding() requires one parameter, and 0 is the proper value here.
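
For reference, here is a minimal standalone sketch of the intended flow. It
is not the testpmd code itself: NB_PORTS, the on_*() helper names, and the
stub bodies are assumptions made for illustration. Only the one-argument
shape of start_packet_forwarding(int with_tx_first) mirrors testpmd, where
passing 0 restarts forwarding without sending an initial TX burst.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_PORTS 4 /* hypothetical port count for this sketch */

static bool err_recovering[NB_PORTS];      /* per-port "in recovery" flag */
static bool forwarding = true;             /* stand-in for !test_done */
static bool need_start_when_recovery_over;
static int last_with_tx_first = -1;        /* records the last argument passed */

/* Stub with the same shape as testpmd's start_packet_forwarding(). */
static void start_packet_forwarding(int with_tx_first)
{
	last_with_tx_first = with_tx_first;
	forwarding = true;
}

/* Stub for testpmd's stop_packet_forwarding(). */
static void stop_packet_forwarding(void)
{
	forwarding = false;
}

static bool has_port_in_err_recovering(void)
{
	for (int i = 0; i < NB_PORTS; i++)
		if (err_recovering[i])
			return true;
	return false;
}

/* Models the RTE_ETH_EVENT_ERR_RECOVERING handling. */
static void on_err_recovering(int port_id)
{
	assert(port_id >= 0 && port_id < NB_PORTS);
	err_recovering[port_id] = true;
	if (forwarding) {
		stop_packet_forwarding();
		need_start_when_recovery_over = true;
	}
}

/* Models the RTE_ETH_EVENT_RECOVERY_SUCCESS handling. */
static void on_recovery_success(int port_id)
{
	assert(port_id >= 0 && port_id < NB_PORTS);
	err_recovering[port_id] = false;
	if (has_port_in_err_recovering())
		return; /* wait until every port has finished recovering */
	if (need_start_when_recovery_over) {
		start_packet_forwarding(0); /* 0: plain restart, no TX-first burst */
		need_start_when_recovery_over = false;
	}
}
```

In this sketch, forwarding resumes only after the last recovering port
reports success, and the restart always passes 0.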

Thanks
Chengwen

>> +        need_start_when_recovery_over = 0;
>> +    } else {
>> +        printf("Recovery success!\n");
>> +    }
>> +}
>> +
>> +static void
>> +recover_failed_callback(portid_t port_id)
>> +{
>> +    struct rte_port *port;
>> +    portid_t pid;
>> +
>> +    ports[port_id].err_recovering = 0;
>> +    ports[port_id].recover_failed = 1;
>> +    if (has_port_in_err_recovering())
>> +        return;
>> +
>> +    need_start_when_recovery_over = 0;
>> +    printf("The ports:");
>> +    RTE_ETH_FOREACH_DEV(pid) {
>> +        port = &ports[pid];
>> +        if (port->recover_failed)
>> +            printf(" %u", pid);
>> +    }
>> +    printf(" recovery failed! Please remove them!\n");
>> +}
>> +
>>   /* This function is used by the interrupt thread */
>>   static int
>>   eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>> @@ -3997,6 +4068,15 @@ eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
>>           }
>>           break;
>>       }
>> +    case RTE_ETH_EVENT_ERR_RECOVERING:
>> +        err_recovering_callback(port_id);
>> +        break;
>> +    case RTE_ETH_EVENT_RECOVERY_SUCCESS:
>> +        recover_success_callback(port_id);
>> +        break;
>> +    case RTE_ETH_EVENT_RECOVERY_FAILED:
>> +        recover_failed_callback(port_id);
>> +        break;
>>       default:
>>           break;
>>       }
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index 09a36b90b8..42782d5a05 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -342,7 +342,9 @@ struct rte_port {
>>       uint8_t                 member_flag : 1, /**< bonding member port */
>>                   bond_flag : 1, /**< port is bond device */
>>                   fwd_mac_swap : 1, /**< swap packet MAC before forward */
>> -                update_conf : 1; /**< need to update bonding device configuration */
>> +                update_conf : 1, /**< need to update bonding device configuration */
>> +                err_recovering : 1, /**< port is in error recovering */
>> +                recover_failed : 1; /**< port recover failed */
>>       struct port_template    *pattern_templ_list; /**< Pattern templates. */
>>       struct port_template    *actions_templ_list; /**< Actions templates. */
>>       struct port_table       *table_list; /**< Flow tables. */