From: "Min Hu (Connor)" <humin29@huawei.com>
To: "Xueming(Steven) Li" <xuemingl@nvidia.com>,
"stable@dpdk.org" <stable@dpdk.org>
Cc: "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-stable] [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
Date: Mon, 17 May 2021 15:27:20 +0800 [thread overview]
Message-ID: <1bf3e39d-2aeb-64cd-0a36-1da93fa1dbfa@huawei.com> (raw)
In-Reply-To: <BY5PR12MB43244C2FA678E3BAD47BC024A12D9@BY5PR12MB4324.namprd12.prod.outlook.com>
在 2021/5/17 14:58, Xueming(Steven) Li 写道:
> Hi Min,
>
>> -----Original Message-----
>> From: Min Hu (Connor) <humin29@huawei.com>
>> Sent: Saturday, May 15, 2021 5:08 PM
>> To: stable@dpdk.org
>> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
>> Subject: [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
>>
>> From: Chengwen Feng <fengchengwen@huawei.com>
>>
>> [ upstream commit 76a3836b98c4af6b9aaeaaa50907fe6143d31c55 ]
>>
>> When start testpmd with two hns3 VFs(0000:bd:01.0, 0000:bd:01.7), and then execute the following commands:
>> testpmd> create bonded device 1 0
>> testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a
>> testpmd> add bonding slave 0 2
>> testpmd> add bonding slave 1 2
>> testpmd> set portmask 0x4
>> testpmd> port start 2
>>
>> It will occurs the following error in a low probability:
>> 0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0)
>> head(16) tail(15) lost(1) from PF in_irq:0
>> 0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac
>> addr(3C:**:**:**:78:9A) for vf: -62
>> mac_address_slaves_update(1541) - Failed to update port Id 0
>> MAC address
>>
>> The problem replay:
>> 1. The 'port start 2' command will start slave ports and then set slave
>> mac address, the function call flow: bond_ethdev_start ->
>> mac_address_slaves_update.
>> 2. There are also a monitor task which running in intr thread will check
>> slave ports link status and update slave ports mac address, the
>> function call flow: bond_ethdev_slave_link_status_change_monitor ->
>> bond_ethdev_lsc_event_callback -> mac_address_slaves_update.
>> 3. Because the above step1&2 running on different threads, they may both
>> call drivers ops mac_addr_set which is hns3vf_set_default_mac_addr.
>> 4. hns3vf_set_default_mac_addr will first acquire hw.lock and then send
>> mailbox to PF and wait PF's response message. Note: the PF's
>> response is an independent message which will received in hw.cmq.crq,
>> the receiving operation can only performed in intr thread.
>> 5. So if the step1 operation hold the hw.lock and try get response
>> message, and step2 operation try acquire the hw.lock and so it can't
>> process the response message, this will lead to step1 fail.
>>
>> The solution:
>> 1. make all threads could process the mailbox response message, which
>> protected by the hw.cmq.crq.lock.
>> 2. use the following rules to avoid deadlock:
>> 2.1. ensure use the correct locking sequence: hw.lock >
>> hw.mbx_resp.lock > hw.cmq.crq.lock.
>> 2.2. make sure don't acquire such as hw.lock & hw.mbx_resp.lock again
>> when process mailbox response message.
>>
>> Fixes: 463e748964f5 ("net/hns3: support mailbox")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>> drivers/net/hns3/hns3_ethdev.h | 1 -
>> drivers/net/hns3/hns3_ethdev_vf.c | 3 ---
>> drivers/net/hns3/hns3_mbx.c | 47 +++++++++------------------------------
>> 3 files changed, 11 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h index 4c40df1..2065592 100644
>> --- a/drivers/net/hns3/hns3_ethdev.h
>> +++ b/drivers/net/hns3/hns3_ethdev.h
>> @@ -424,7 +424,6 @@ struct hns3_hw {
>> struct hns3_cmq cmq;
>> struct hns3_mbx_resp_status mbx_resp; /* mailbox response */
>> struct hns3_mbx_arq_ring arq; /* mailbox async rx queue */
>> - pthread_t irq_thread_id;
>> struct hns3_mac mac;
>> unsigned int secondary_cnt; /* Number of secondary processes init'd. */
>> struct hns3_tqp_stats tqp_stats;
>> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
>> index 52ad825..99275f1 100644
>> --- a/drivers/net/hns3/hns3_ethdev_vf.c
>> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
>> @@ -1112,9 +1112,6 @@ hns3vf_interrupt_handler(void *param)
>> enum hns3vf_evt_cause event_cause;
>> uint32_t clearval;
>>
>> - if (hw->irq_thread_id == 0)
>> - hw->irq_thread_id = pthread_self();
>> -
>> /* Disable interrupt */
>> hns3vf_disable_irq0(hw);
>>
>> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c index d2a5db8..975a60b 100644
>> --- a/drivers/net/hns3/hns3_mbx.c
>> +++ b/drivers/net/hns3/hns3_mbx.c
>> @@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code)
>> return -EIO;
>> }
>>
>> -static void
>> -hns3_poll_all_sync_msg(void)
>> -{
>> - struct rte_eth_dev *eth_dev;
>> - struct hns3_adapter *adapter;
>> - const char *name;
>> - uint16_t port_id;
>> -
>> - RTE_ETH_FOREACH_DEV(port_id) {
>> - eth_dev = &rte_eth_devices[port_id];
>> - name = eth_dev->device->driver->name;
>> - if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf"))
>> - continue;
>> - adapter = eth_dev->data->dev_private;
>> - if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED)
>> - continue;
>> - /* Synchronous msg, the mbx_resp.req_msg_data is non-zero */
>> - if (adapter->hw.mbx_resp.req_msg_data)
>> - hns3_dev_handle_mbx_msg(&adapter->hw);
>> - }
>> -}
>> -
>> static int
>> hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>> uint8_t *resp_data, uint16_t resp_len) {
>> #define HNS3_MAX_RETRY_MS 500
>> +#define HNS3_WAIT_RESP_US 100
>> struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
>> struct hns3_mbx_resp_status *mbx_resp;
>> - bool in_irq = false;
>> uint64_t now;
>> uint64_t end;
>>
>> @@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>> return -EIO;
>> }
>>
>> - /*
>> - * The mbox response is running on the interrupt thread.
>> - * Sending mbox in the interrupt thread cannot wait for the
>> - * response, so polling the mbox response on the irq thread.
>> - */
>> - if (pthread_equal(hw->irq_thread_id, pthread_self())) {
>> - in_irq = true;
>> - hns3_poll_all_sync_msg();
>> - } else {
>> - rte_delay_ms(HNS3_POLL_RESPONE_MS);
>> - }
>> + hns3_dev_handle_mbx_msg(hw);
>> + rte_delay_us(HNS3_WAIT_RESP_US);
>> +
>> now = get_timeofday_ms();
>> }
>> hw->mbx_resp.req_msg_data = 0;
>> if (now >= end) {
>> hw->mbx_resp.lost++;
>> hns3_err(hw,
>> - "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d",
>> + "VF could not get mbx(%u,%u) head(%u) tail(%u) "
>> + "lost(%u) from PF",
>> code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
>> - hw->mbx_resp.lost, in_irq);
>> + hw->mbx_resp.lost);
>> return -ETIME;
>> }
>> rte_io_rmb();
>> @@ -365,9 +336,11 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>> uint16_t flag;
>> uint8_t *temp;
>> int i;
>> + rte_spinlock_lock(&hw->cmq.crq.lock);
>>
>> while (!hns3_cmd_crq_empty(hw)) {
>> if (rte_atomic16_read(&hw->reset.disable_cmd))
>> + rte_spinlock_unlock(&hw->cmq.crq.lock);
>> return;
>
> Seems "{ }" needed around if block, added during merge, thanks.
>
Well, you are right, thanks Xueming.
>>
>> desc = &crq->desc[crq->next_to_use];
>> @@ -439,4 +412,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>>
>> /* Write back CMDQ_RQ header pointer, IMP need this pointer */
>> hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use);
>> +
>> + rte_spinlock_unlock(&hw->cmq.crq.lock);
>> }
>> --
>> 2.7.4
>
> .
>
prev parent reply other threads:[~2021-05-17 7:27 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-15 9:07 Min Hu (Connor)
2021-05-17 6:58 ` Xueming(Steven) Li
2021-05-17 7:27 ` Min Hu (Connor) [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1bf3e39d-2aeb-64cd-0a36-1da93fa1dbfa@huawei.com \
--to=humin29@huawei.com \
--cc=ferruh.yigit@intel.com \
--cc=stable@dpdk.org \
--cc=xuemingl@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).