patches for DPDK stable branches
From: "Min Hu (Connor)" <humin29@huawei.com>
To: "Xueming(Steven) Li" <xuemingl@nvidia.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Cc: "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-stable] [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
Date: Mon, 17 May 2021 15:27:20 +0800
Message-ID: <1bf3e39d-2aeb-64cd-0a36-1da93fa1dbfa@huawei.com>
In-Reply-To: <BY5PR12MB43244C2FA678E3BAD47BC024A12D9@BY5PR12MB4324.namprd12.prod.outlook.com>



On 2021/5/17 14:58, Xueming(Steven) Li wrote:
> Hi Min,
> 
>> -----Original Message-----
>> From: Min Hu (Connor) <humin29@huawei.com>
>> Sent: Saturday, May 15, 2021 5:08 PM
>> To: stable@dpdk.org
>> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
>> Subject: [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
>>
>> From: Chengwen Feng <fengchengwen@huawei.com>
>>
>> [ upstream commit 76a3836b98c4af6b9aaeaaa50907fe6143d31c55 ]
>>
>> When testpmd is started with two hns3 VFs (0000:bd:01.0, 0000:bd:01.7) and the following commands are executed:
>> 	testpmd> create bonded device 1 0
>> 	testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a
>> 	testpmd> add bonding slave 0 2
>> 	testpmd> add bonding slave 1 2
>> 	testpmd> set portmask 0x4
>> 	testpmd> port start 2
>>
>> The following error occurs with low probability:
>> 	0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0)
>> 		head(16) tail(15) lost(1) from PF in_irq:0
>> 	0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac
>> 		addr(3C:**:**:**:78:9A) for vf: -62
>> 	mac_address_slaves_update(1541) - Failed to update port Id 0
>> 		MAC address
>>
>> How the problem arises:
>> 1. The 'port start 2' command starts the slave ports and then sets the
>>     slave MAC address. The call flow is: bond_ethdev_start ->
>>     mac_address_slaves_update.
>> 2. There is also a monitor task, running in the intr thread, which
>>     checks the slave ports' link status and updates their MAC address.
>>     The call flow is: bond_ethdev_slave_link_status_change_monitor ->
>>     bond_ethdev_lsc_event_callback -> mac_address_slaves_update.
>> 3. Because steps 1 and 2 run on different threads, both may call the
>>     driver's mac_addr_set op, which is hns3vf_set_default_mac_addr.
>> 4. hns3vf_set_default_mac_addr first acquires hw.lock, then sends a
>>     mailbox message to the PF and waits for the PF's response. Note: the
>>     PF's response is an independent message that is received in
>>     hw.cmq.crq, and the receive operation can only be performed in the
>>     intr thread.
>> 5. So if the step 1 operation holds hw.lock while waiting for the
>>     response, and the step 2 operation (in the intr thread) is blocked
>>     trying to acquire hw.lock, the intr thread cannot process the
>>     response message, and step 1 fails.
>>
>> The solution:
>> 1. Allow all threads to process the mailbox response message, with the
>>     processing protected by hw.cmq.crq.lock.
>> 2. Use the following rules to avoid deadlock:
>> 2.1. Always use the correct locking order: hw.lock >
>>       hw.mbx_resp.lock > hw.cmq.crq.lock.
>> 2.2. Do not acquire locks such as hw.lock and hw.mbx_resp.lock again
>>       while processing a mailbox response message.
>>
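The idea in the solution above can be sketched roughly as follows. This is only an illustration: `demo_hw`, `demo_handle_mbx_msg` and `demo_send_and_wait` are hypothetical stand-ins, not the real hns3 structures or functions, and pthread mutexes are used here in place of the driver's spinlocks. The sender keeps the outer lock, polls the receive queue itself, and the queue handler takes only the innermost lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hns3_hw; not the real driver type. */
struct demo_hw {
	pthread_mutex_t hw_lock;  /* plays the role of hw.lock (outermost) */
	pthread_mutex_t crq_lock; /* plays the role of hw.cmq.crq.lock (innermost) */
	int crq_pending;          /* responses waiting in the receive queue */
	int resp_received;        /* set once a response has been processed */
};

/*
 * Any thread may drain the receive queue. Only crq_lock is taken here,
 * so the handler never re-acquires an outer lock (rule 2.2).
 */
static void demo_handle_mbx_msg(struct demo_hw *hw)
{
	pthread_mutex_lock(&hw->crq_lock);
	while (hw->crq_pending > 0) {
		hw->crq_pending--;
		hw->resp_received = 1;
	}
	pthread_mutex_unlock(&hw->crq_lock);
}

/*
 * The sender holds hw_lock and polls the queue itself instead of relying
 * on another thread. Lock order is hw_lock, then crq_lock (rule 2.1).
 * Returns 0 when a response arrived, -1 when the poll budget ran out.
 */
static int demo_send_and_wait(struct demo_hw *hw, int max_polls)
{
	int ret = -1;

	assert(hw != NULL);
	pthread_mutex_lock(&hw->hw_lock);
	hw->resp_received = 0;
	for (int i = 0; i < max_polls; i++) {
		demo_handle_mbx_msg(hw);
		if (hw->resp_received) {
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&hw->hw_lock);
	return ret;
}
```

With this shape, a second thread blocked on the outer lock no longer prevents the first thread from seeing its response, because the waiting thread drains the queue on its own.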
>> Fixes: 463e748964f5 ("net/hns3: support mailbox")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>>   drivers/net/hns3/hns3_ethdev.h    |  1 -
>>   drivers/net/hns3/hns3_ethdev_vf.c |  3 ---
>>   drivers/net/hns3/hns3_mbx.c       | 47 +++++++++------------------------------
>>   3 files changed, 11 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
>> index 4c40df1..2065592 100644
>> --- a/drivers/net/hns3/hns3_ethdev.h
>> +++ b/drivers/net/hns3/hns3_ethdev.h
>> @@ -424,7 +424,6 @@ struct hns3_hw {
>>   	struct hns3_cmq cmq;
>>   	struct hns3_mbx_resp_status mbx_resp; /* mailbox response */
>>   	struct hns3_mbx_arq_ring arq;         /* mailbox async rx queue */
>> -	pthread_t irq_thread_id;
>>   	struct hns3_mac mac;
>>   	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
>>   	struct hns3_tqp_stats tqp_stats;
>> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
>> index 52ad825..99275f1 100644
>> --- a/drivers/net/hns3/hns3_ethdev_vf.c
>> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
>> @@ -1112,9 +1112,6 @@ hns3vf_interrupt_handler(void *param)
>>   	enum hns3vf_evt_cause event_cause;
>>   	uint32_t clearval;
>>
>> -	if (hw->irq_thread_id == 0)
>> -		hw->irq_thread_id = pthread_self();
>> -
>>   	/* Disable interrupt */
>>   	hns3vf_disable_irq0(hw);
>>
>> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c
>> index d2a5db8..975a60b 100644
>> --- a/drivers/net/hns3/hns3_mbx.c
>> +++ b/drivers/net/hns3/hns3_mbx.c
>> @@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code)
>>   	return -EIO;
>>   }
>>
>> -static void
>> -hns3_poll_all_sync_msg(void)
>> -{
>> -	struct rte_eth_dev *eth_dev;
>> -	struct hns3_adapter *adapter;
>> -	const char *name;
>> -	uint16_t port_id;
>> -
>> -	RTE_ETH_FOREACH_DEV(port_id) {
>> -		eth_dev = &rte_eth_devices[port_id];
>> -		name = eth_dev->device->driver->name;
>> -		if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf"))
>> -			continue;
>> -		adapter = eth_dev->data->dev_private;
>> -		if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED)
>> -			continue;
>> -		/* Synchronous msg, the mbx_resp.req_msg_data is non-zero */
>> -		if (adapter->hw.mbx_resp.req_msg_data)
>> -			hns3_dev_handle_mbx_msg(&adapter->hw);
>> -	}
>> -}
>> -
>>   static int
>>   hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>>   		  uint8_t *resp_data, uint16_t resp_len)
>>   {
>>   #define HNS3_MAX_RETRY_MS	500
>> +#define HNS3_WAIT_RESP_US	100
>>   	struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
>>   	struct hns3_mbx_resp_status *mbx_resp;
>> -	bool in_irq = false;
>>   	uint64_t now;
>>   	uint64_t end;
>>
>> @@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>>   			return -EIO;
>>   		}
>>
>> -		/*
>> -		 * The mbox response is running on the interrupt thread.
>> -		 * Sending mbox in the interrupt thread cannot wait for the
>> -		 * response, so polling the mbox response on the irq thread.
>> -		 */
>> -		if (pthread_equal(hw->irq_thread_id, pthread_self())) {
>> -			in_irq = true;
>> -			hns3_poll_all_sync_msg();
>> -		} else {
>> -			rte_delay_ms(HNS3_POLL_RESPONE_MS);
>> -		}
>> +		hns3_dev_handle_mbx_msg(hw);
>> +		rte_delay_us(HNS3_WAIT_RESP_US);
>> +
>>   		now = get_timeofday_ms();
>>   	}
>>   	hw->mbx_resp.req_msg_data = 0;
>>   	if (now >= end) {
>>   		hw->mbx_resp.lost++;
>>   		hns3_err(hw,
>> -			 "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d",
>> +			 "VF could not get mbx(%u,%u) head(%u) tail(%u) "
>> +			 "lost(%u) from PF",
>>   			 code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
>> -			 hw->mbx_resp.lost, in_irq);
>> +			 hw->mbx_resp.lost);
>>   		return -ETIME;
>>   	}
>>   	rte_io_rmb();
>> @@ -365,9 +336,11 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>>   	uint16_t flag;
>>   	uint8_t *temp;
>>   	int i;
>> +	rte_spinlock_lock(&hw->cmq.crq.lock);
>>
>>   	while (!hns3_cmd_crq_empty(hw)) {
>>   		if (rte_atomic16_read(&hw->reset.disable_cmd))
>> +			rte_spinlock_unlock(&hw->cmq.crq.lock);
>>   			return;
> 
> It seems braces "{ }" are needed around the if block; I added them during the merge. Thanks.
> 
Well, you are right, thanks Xueming.
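For reference, a minimal sketch of the braced early-return path. It uses a hypothetical `demo_queue` type and a boolean flag standing in for the spinlock, not the real driver structures. Without the braces, only the unlock would be conditional and the `return` would run on the first loop iteration, leaving the lock held whenever commands are not disabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CRQ state; not the real hns3 types. */
struct demo_queue {
	bool locked;      /* stand-in for the hw->cmq.crq.lock state */
	bool disable_cmd; /* stand-in for hw->reset.disable_cmd */
	int pending;      /* messages still in the queue */
	int handled;      /* messages processed so far */
};

static void demo_drain(struct demo_queue *q)
{
	q->locked = true;               /* take the lock */
	while (q->pending > 0) {
		if (q->disable_cmd) {
			/* Braces keep unlock and return together. */
			q->locked = false;
			return;
		}
		q->pending--;
		q->handled++;
	}
	q->locked = false;              /* normal exit path releases the lock */
}
```

Both exit paths release the lock, and the queue is drained fully unless commands are disabled.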
>>
>>   		desc = &crq->desc[crq->next_to_use];
>> @@ -439,4 +412,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>>
>>   	/* Write back CMDQ_RQ header pointer, IMP need this pointer */
>>   	hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use);
>> +
>> +	rte_spinlock_unlock(&hw->cmq.crq.lock);
>>   }
>> --
>> 2.7.4
> 
> .
> 


Thread overview: 3+ messages
2021-05-15  9:07 Min Hu (Connor)
2021-05-17  6:58 ` Xueming(Steven) Li
2021-05-17  7:27   ` Min Hu (Connor) [this message]
