patches for DPDK stable branches
From: "Xueming(Steven) Li" <xuemingl@nvidia.com>
To: "Min Hu (Connor)" <humin29@huawei.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Cc: "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-stable] [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
Date: Mon, 17 May 2021 06:58:49 +0000
Message-ID: <BY5PR12MB43244C2FA678E3BAD47BC024A12D9@BY5PR12MB4324.namprd12.prod.outlook.com>
In-Reply-To: <1621069664-39500-1-git-send-email-humin29@huawei.com>

Hi Min,

> -----Original Message-----
> From: Min Hu (Connor) <humin29@huawei.com>
> Sent: Saturday, May 15, 2021 5:08 PM
> To: stable@dpdk.org
> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
> Subject: [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
> 
> From: Chengwen Feng <fengchengwen@huawei.com>
> 
> [ upstream commit 76a3836b98c4af6b9aaeaaa50907fe6143d31c55 ]
> 
> When testpmd is started with two hns3 VFs (0000:bd:01.0, 0000:bd:01.7) and the following commands are executed:
> 	testpmd> create bonded device 1 0
> 	testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a
> 	testpmd> add bonding slave 0 2
> 	testpmd> add bonding slave 1 2
> 	testpmd> set portmask 0x4
> 	testpmd> port start 2
> 
> The following error occurs with low probability:
> 	0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0)
> 		head(16) tail(15) lost(1) from PF in_irq:0
> 	0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac
> 		addr(3C:**:**:**:78:9A) for vf: -62
> 	mac_address_slaves_update(1541) - Failed to update port Id 0
> 		MAC address
> 
> How the problem arises:
> 1. The 'port start 2' command starts the slave ports and then sets their
>    MAC address. Call flow: bond_ethdev_start ->
>    mac_address_slaves_update.
> 2. A monitor task running in the intr thread also checks the slave
>    ports' link status and updates their MAC address. Call flow:
>    bond_ethdev_slave_link_status_change_monitor ->
>    bond_ethdev_lsc_event_callback -> mac_address_slaves_update.
> 3. Because steps 1 and 2 run on different threads, both may call the
>    driver's mac_addr_set op, which is hns3vf_set_default_mac_addr.
> 4. hns3vf_set_default_mac_addr first acquires hw.lock, then sends a
>    mailbox message to the PF and waits for the PF's response. Note: the
>    PF's response is an independent message that is received in
>    hw.cmq.crq, and receiving can only be performed in the intr thread.
> 5. So if step 1 holds hw.lock while waiting for the response, and
>    step 2 (in the intr thread) blocks trying to acquire hw.lock, the
>    intr thread cannot process the response message and step 1 fails.
> 
> The solution:
> 1. Allow any thread to process the mailbox response message, with the
>    processing protected by hw.cmq.crq.lock.
> 2. Follow these rules to avoid deadlock:
> 2.1. Always use the locking order hw.lock > hw.mbx_resp.lock >
>      hw.cmq.crq.lock.
> 2.2. Do not acquire locks such as hw.lock or hw.mbx_resp.lock again
>      while processing a mailbox response message.
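
For readers following the solution described above, here is a minimal standalone sketch of the idea: the sender drains the receive queue itself, under a queue lock, instead of depending on another thread to do so. It is not the driver code. `struct mbx_model`, `mbx_model_init`, `handle_mbx_msg` and `wait_mbx_resp` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for `hw.cmq.crq.lock`.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the mailbox state; not the hns3 structures. */
struct mbx_model {
	atomic_flag crq_lock; /* stands in for hw.cmq.crq.lock */
	int crq_pending;      /* responses waiting in the receive queue */
	bool resp_received;   /* set once a response has been consumed */
};

static void
mbx_model_init(struct mbx_model *m, int pending)
{
	atomic_flag_clear(&m->crq_lock);
	m->crq_pending = pending;
	m->resp_received = false;
}

static void
crq_lock(struct mbx_model *m)
{
	while (atomic_flag_test_and_set_explicit(&m->crq_lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void
crq_unlock(struct mbx_model *m)
{
	atomic_flag_clear_explicit(&m->crq_lock, memory_order_release);
}

/* Any thread may drain the queue; the lock serialises concurrent callers. */
static void
handle_mbx_msg(struct mbx_model *m)
{
	crq_lock(m);
	while (m->crq_pending > 0) {
		m->crq_pending--;
		m->resp_received = true;
	}
	crq_unlock(m);
}

/* The sender polls the queue itself rather than waiting on another thread. */
static int
wait_mbx_resp(struct mbx_model *m, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		handle_mbx_msg(m);
		if (m->resp_received) {
			m->resp_received = false;
			return 0;
		}
		/* The patch delays briefly here (rte_delay_us) before retrying. */
	}
	return -ETIME;
}
```

In this model the caller may hold an outer lock while polling, because the response path only takes the innermost queue lock. That is the property rule 2.2 asks for.
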
> 
> Fixes: 463e748964f5 ("net/hns3: support mailbox")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
>  drivers/net/hns3/hns3_ethdev.h    |  1 -
>  drivers/net/hns3/hns3_ethdev_vf.c |  3 ---
>  drivers/net/hns3/hns3_mbx.c       | 47 +++++++++------------------------------
>  3 files changed, 11 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
> index 4c40df1..2065592 100644
> --- a/drivers/net/hns3/hns3_ethdev.h
> +++ b/drivers/net/hns3/hns3_ethdev.h
> @@ -424,7 +424,6 @@ struct hns3_hw {
>  	struct hns3_cmq cmq;
>  	struct hns3_mbx_resp_status mbx_resp; /* mailbox response */
>  	struct hns3_mbx_arq_ring arq;         /* mailbox async rx queue */
> -	pthread_t irq_thread_id;
>  	struct hns3_mac mac;
>  	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
>  	struct hns3_tqp_stats tqp_stats;
> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
> index 52ad825..99275f1 100644
> --- a/drivers/net/hns3/hns3_ethdev_vf.c
> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
> @@ -1112,9 +1112,6 @@ hns3vf_interrupt_handler(void *param)
>  	enum hns3vf_evt_cause event_cause;
>  	uint32_t clearval;
> 
> -	if (hw->irq_thread_id == 0)
> -		hw->irq_thread_id = pthread_self();
> -
>  	/* Disable interrupt */
>  	hns3vf_disable_irq0(hw);
> 
> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c
> index d2a5db8..975a60b 100644
> --- a/drivers/net/hns3/hns3_mbx.c
> +++ b/drivers/net/hns3/hns3_mbx.c
> @@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code)
>  	return -EIO;
>  }
> 
> -static void
> -hns3_poll_all_sync_msg(void)
> -{
> -	struct rte_eth_dev *eth_dev;
> -	struct hns3_adapter *adapter;
> -	const char *name;
> -	uint16_t port_id;
> -
> -	RTE_ETH_FOREACH_DEV(port_id) {
> -		eth_dev = &rte_eth_devices[port_id];
> -		name = eth_dev->device->driver->name;
> -		if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf"))
> -			continue;
> -		adapter = eth_dev->data->dev_private;
> -		if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED)
> -			continue;
> -		/* Synchronous msg, the mbx_resp.req_msg_data is non-zero */
> -		if (adapter->hw.mbx_resp.req_msg_data)
> -			hns3_dev_handle_mbx_msg(&adapter->hw);
> -	}
> -}
> -
>  static int
>  hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>  		  uint8_t *resp_data, uint16_t resp_len)
>  {
>  #define HNS3_MAX_RETRY_MS	500
> +#define HNS3_WAIT_RESP_US	100
>  	struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
>  	struct hns3_mbx_resp_status *mbx_resp;
> -	bool in_irq = false;
>  	uint64_t now;
>  	uint64_t end;
> 
> @@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>  			return -EIO;
>  		}
> 
> -		/*
> -		 * The mbox response is running on the interrupt thread.
> -		 * Sending mbox in the interrupt thread cannot wait for the
> -		 * response, so polling the mbox response on the irq thread.
> -		 */
> -		if (pthread_equal(hw->irq_thread_id, pthread_self())) {
> -			in_irq = true;
> -			hns3_poll_all_sync_msg();
> -		} else {
> -			rte_delay_ms(HNS3_POLL_RESPONE_MS);
> -		}
> +		hns3_dev_handle_mbx_msg(hw);
> +		rte_delay_us(HNS3_WAIT_RESP_US);
> +
>  		now = get_timeofday_ms();
>  	}
>  	hw->mbx_resp.req_msg_data = 0;
>  	if (now >= end) {
>  		hw->mbx_resp.lost++;
>  		hns3_err(hw,
> -			 "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d",
> +			 "VF could not get mbx(%u,%u) head(%u) tail(%u) "
> +			 "lost(%u) from PF",
>  			 code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
> -			 hw->mbx_resp.lost, in_irq);
> +			 hw->mbx_resp.lost);
>  		return -ETIME;
>  	}
>  	rte_io_rmb();
> @@ -365,9 +336,11 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>  	uint16_t flag;
>  	uint8_t *temp;
>  	int i;
> +	rte_spinlock_lock(&hw->cmq.crq.lock);
> 
>  	while (!hns3_cmd_crq_empty(hw)) {
>  		if (rte_atomic16_read(&hw->reset.disable_cmd))
> +			rte_spinlock_unlock(&hw->cmq.crq.lock);
>  			return;

It seems braces "{ }" are needed around this if block; I added them during the merge, thanks.
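
To illustrate why the braces matter, here is a minimal standalone sketch, not the driver code. `struct fake_crq`, `drain_unbraced` and `drain_braced` are hypothetical names, and a plain counter stands in for the spinlock. Without braces the `if` guards only the unlock, so the `return` runs on the first pass of the loop regardless of the condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CRQ state; not the hns3 structures. */
struct fake_crq {
	int lock_depth;   /* 1 while "locked", 0 once released */
	int pending;      /* messages waiting in the queue */
	int handled;      /* messages processed so far */
	bool disable_cmd; /* mirrors the reset.disable_cmd check */
};

/* How the compiler parses the hunk as posted: `return` is unconditional. */
static void
drain_unbraced(struct fake_crq *q)
{
	assert(q->lock_depth == 0); /* lock must not already be held */
	q->lock_depth++;
	while (q->pending > 0) {
		if (q->disable_cmd)
			q->lock_depth--;
		return; /* always taken; leaks the lock when disable_cmd is false */

		q->pending--; /* never reached */
		q->handled++;
	}
	q->lock_depth--;
}

/* With braces: unlock and return happen together, only when disabled. */
static void
drain_braced(struct fake_crq *q)
{
	assert(q->lock_depth == 0); /* lock must not already be held */
	q->lock_depth++;
	while (q->pending > 0) {
		if (q->disable_cmd) {
			q->lock_depth--;
			return;
		}
		q->pending--;
		q->handled++;
	}
	q->lock_depth--;
}
```

With two pending messages and `disable_cmd` false, the unbraced variant returns with the lock still held and nothing handled, while the braced variant handles both messages and releases the lock.
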

> 
>  		desc = &crq->desc[crq->next_to_use];
> @@ -439,4 +412,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
> 
>  	/* Write back CMDQ_RQ header pointer, IMP need this pointer */
>  	hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use);
> +
> +	rte_spinlock_unlock(&hw->cmq.crq.lock);
>  }
> --
> 2.7.4


Thread overview: 3+ messages
2021-05-15  9:07 Min Hu (Connor)
2021-05-17  6:58 ` Xueming(Steven) Li [this message]
2021-05-17  7:27   ` Min Hu (Connor)
