From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9FEB4A0548 for ; Mon, 17 May 2021 09:27:26 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5CE7A4003C; Mon, 17 May 2021 09:27:26 +0200 (CEST) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [45.249.212.190]) by mails.dpdk.org (Postfix) with ESMTP id 600464003C for ; Mon, 17 May 2021 09:27:24 +0200 (CEST) Received: from dggems703-chm.china.huawei.com (unknown [172.30.72.60]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4Fk9ch5m6Mz16R5s; Mon, 17 May 2021 15:24:36 +0800 (CST) Received: from dggeme756-chm.china.huawei.com (10.3.19.102) by dggems703-chm.china.huawei.com (10.3.19.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Mon, 17 May 2021 15:27:20 +0800 Received: from [10.67.103.128] (10.67.103.128) by dggeme756-chm.china.huawei.com (10.3.19.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Mon, 17 May 2021 15:27:20 +0800 To: "Xueming(Steven) Li" , "stable@dpdk.org" CC: "ferruh.yigit@intel.com" References: <1621069664-39500-1-git-send-email-humin29@huawei.com> From: "Min Hu (Connor)" Message-ID: <1bf3e39d-2aeb-64cd-0a36-1da93fa1dbfa@huawei.com> Date: Mon, 17 May 2021 15:27:20 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.3.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="gbk"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.67.103.128] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggeme756-chm.china.huawei.com (10.3.19.102) X-CFilter-Loop: Reflected Subject: Re: [dpdk-stable] [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org Sender: "stable" ÔÚ 2021/5/17 14:58, Xueming(Steven) Li дµÀ: > Hi Min, > >> -----Original Message----- >> From: Min Hu (Connor) >> Sent: Saturday, May 15, 2021 5:08 PM >> To: stable@dpdk.org >> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li >> Subject: [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF >> >> From: Chengwen Feng >> >> [ upstream commit 76a3836b98c4af6b9aaeaaa50907fe6143d31c55 ] >> >> When start testpmd with two hns3 VFs(0000:bd:01.0, 0000:bd:01.7), and then execute the following commands: >> testpmd> create bonded device 1 0 >> testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a >> testpmd> add bonding slave 0 2 >> testpmd> add bonding slave 1 2 >> testpmd> set portmask 0x4 >> testpmd> port start 2 >> >> It will occurs the following error in a low probability: >> 0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0) >> head(16) tail(15) lost(1) from PF in_irq:0 >> 0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac >> addr(3C:**:**:**:78:9A) for vf: -62 >> mac_address_slaves_update(1541) - Failed to update port Id 0 >> MAC address >> >> The problem replay: >> 1. The 'port start 2' command will start slave ports and then set slave >> mac address, the function call flow: bond_ethdev_start -> >> mac_address_slaves_update. >> 2. There are also a monitor task which running in intr thread will check >> slave ports link status and update slave ports mac address, the >> function call flow: bond_ethdev_slave_link_status_change_monitor -> >> bond_ethdev_lsc_event_callback -> mac_address_slaves_update. >> 3. Because the above step1&2 running on different threads, they may both >> call drivers ops mac_addr_set which is hns3vf_set_default_mac_addr. >> 4. hns3vf_set_default_mac_addr will first acquire hw.lock and then send >> mailbox to PF and wait PF's response message. Note: the PF's >> response is an independent message which will received in hw.cmq.crq, >> the receiving operation can only performed in intr thread. >> 5. So if the step1 operation hold the hw.lock and try get response >> message, and step2 operation try acquire the hw.lock and so it can't >> process the response message, this will lead to step1 fail. >> >> The solution: >> 1. make all threads could process the mailbox response message, which >> protected by the hw.cmq.crq.lock. >> 2. use the following rules to avoid deadlock: >> 2.1. ensure use the correct locking sequence: hw.lock > >> hw.mbx_resp.lock > hw.cmq.crq.lock. >> 2.2. make sure don't acquire such as hw.lock & hw.mbx_resp.lock again >> when process mailbox response message. >> >> Fixes: 463e748964f5 ("net/hns3: support mailbox") >> Cc: stable@dpdk.org >> >> Signed-off-by: Chengwen Feng >> Signed-off-by: Min Hu (Connor) >> --- >> drivers/net/hns3/hns3_ethdev.h | 1 - >> drivers/net/hns3/hns3_ethdev_vf.c | 3 --- >> drivers/net/hns3/hns3_mbx.c | 47 +++++++++------------------------------ >> 3 files changed, 11 insertions(+), 40 deletions(-) >> >> diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h index 4c40df1..2065592 100644 >> --- a/drivers/net/hns3/hns3_ethdev.h >> +++ b/drivers/net/hns3/hns3_ethdev.h >> @@ -424,7 +424,6 @@ struct hns3_hw { >> struct hns3_cmq cmq; >> struct hns3_mbx_resp_status mbx_resp; /* mailbox response */ >> struct hns3_mbx_arq_ring arq; /* mailbox async rx queue */ >> - pthread_t irq_thread_id; >> struct hns3_mac mac; >> unsigned int secondary_cnt; /* Number of secondary processes init'd. */ >> struct hns3_tqp_stats tqp_stats; >> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c >> index 52ad825..99275f1 100644 >> --- a/drivers/net/hns3/hns3_ethdev_vf.c >> +++ b/drivers/net/hns3/hns3_ethdev_vf.c >> @@ -1112,9 +1112,6 @@ hns3vf_interrupt_handler(void *param) >> enum hns3vf_evt_cause event_cause; >> uint32_t clearval; >> >> - if (hw->irq_thread_id == 0) >> - hw->irq_thread_id = pthread_self(); >> - >> /* Disable interrupt */ >> hns3vf_disable_irq0(hw); >> >> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c index d2a5db8..975a60b 100644 >> --- a/drivers/net/hns3/hns3_mbx.c >> +++ b/drivers/net/hns3/hns3_mbx.c >> @@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code) >> return -EIO; >> } >> >> -static void >> -hns3_poll_all_sync_msg(void) >> -{ >> - struct rte_eth_dev *eth_dev; >> - struct hns3_adapter *adapter; >> - const char *name; >> - uint16_t port_id; >> - >> - RTE_ETH_FOREACH_DEV(port_id) { >> - eth_dev = &rte_eth_devices[port_id]; >> - name = eth_dev->device->driver->name; >> - if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf")) >> - continue; >> - adapter = eth_dev->data->dev_private; >> - if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED) >> - continue; >> - /* Synchronous msg, the mbx_resp.req_msg_data is non-zero */ >> - if (adapter->hw.mbx_resp.req_msg_data) >> - hns3_dev_handle_mbx_msg(&adapter->hw); >> - } >> -} >> - >> static int >> hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1, >> uint8_t *resp_data, uint16_t resp_len) { >> #define HNS3_MAX_RETRY_MS 500 >> +#define HNS3_WAIT_RESP_US 100 >> struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw); >> struct hns3_mbx_resp_status *mbx_resp; >> - bool in_irq = false; >> uint64_t now; >> uint64_t end; >> >> @@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1, >> return -EIO; >> } >> >> - /* >> - * The mbox response is running on the interrupt thread. >> - * Sending mbox in the interrupt thread cannot wait for the >> - * response, so polling the mbox response on the irq thread. >> - */ >> - if (pthread_equal(hw->irq_thread_id, pthread_self())) { >> - in_irq = true; >> - hns3_poll_all_sync_msg(); >> - } else { >> - rte_delay_ms(HNS3_POLL_RESPONE_MS); >> - } >> + hns3_dev_handle_mbx_msg(hw); >> + rte_delay_us(HNS3_WAIT_RESP_US); >> + >> now = get_timeofday_ms(); >> } >> hw->mbx_resp.req_msg_data = 0; >> if (now >= end) { >> hw->mbx_resp.lost++; >> hns3_err(hw, >> - "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d", >> + "VF could not get mbx(%u,%u) head(%u) tail(%u) " >> + "lost(%u) from PF", >> code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail, >> - hw->mbx_resp.lost, in_irq); >> + hw->mbx_resp.lost); >> return -ETIME; >> } >> rte_io_rmb(); >> @@ -365,9 +336,11 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw) >> uint16_t flag; >> uint8_t *temp; >> int i; >> + rte_spinlock_lock(&hw->cmq.crq.lock); >> >> while (!hns3_cmd_crq_empty(hw)) { >> if (rte_atomic16_read(&hw->reset.disable_cmd)) >> + rte_spinlock_unlock(&hw->cmq.crq.lock); >> return; > > Seems "{ }" needed around if block, added during merge, thanks. > Well, you are right, thanks Xueming. >> >> desc = &crq->desc[crq->next_to_use]; >> @@ -439,4 +412,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw) >> >> /* Write back CMDQ_RQ header pointer, IMP need this pointer */ >> hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use); >> + >> + rte_spinlock_unlock(&hw->cmq.crq.lock); >> } >> -- >> 2.7.4 > > . >