DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Min Hu (Connor)" <humin29@huawei.com>
To: <dev@dpdk.org>
Cc: <ferruh.yigit@intel.com>
Subject: [dpdk-dev] [PATCH 02/10] net/hns3: fix set default MAC addr fail in bonding of VF
Date: Wed, 31 Mar 2021 18:01:36 +0800
Message-ID: <1617184904-55349-3-git-send-email-humin29@huawei.com> (raw)
In-Reply-To: <1617184904-55349-1-git-send-email-humin29@huawei.com>

From: Chengwen Feng <fengchengwen@huawei.com>

When start testpmd with two hns3 VFs(0000:bd:01.0, 0000:bd:01.7), and
then execute the following commands:
	testpmd> create bonded device 1 0
	testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a
	testpmd> add bonding slave 0 2
	testpmd> add bonding slave 1 2
	testpmd> set portmask 0x4
	testpmd> port start 2

It will occurs the following error in a low probability:
	0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0)
		head(16) tail(15) lost(1) from PF in_irq:0
	0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac
		addr(3C:**:**:**:78:9A) for vf: -62
	mac_address_slaves_update(1541) - Failed to update port Id 0
		MAC address

The problem replay:
1. the 'port start 2' command will start slave ports and then set
slave mac address, the function call flow: bond_ethdev_start ->
mac_address_slaves_update.
2. there are also a monitor task which running in intr thread will
check slave ports link status and update slave ports mac address, the
function call flow: bond_ethdev_slave_link_status_change_monitor ->
bond_ethdev_lsc_event_callback -> mac_address_slaves_update.
3. because the above step1&2 running on different threads, they may
both call drivers ops mac_addr_set which is
hns3vf_set_default_mac_addr.
4. hns3vf_set_default_mac_addr will first acquire hw.lock and then
send mailbox to PF and wait PF's response message.
Note: the PF's response is an independent message which will received
in hw.cmq.crq, the receiving operation can only performed in intr
thread.
5. so if the step1 operation hold the hw.lock and try get response
message, and step2 operation try acquire the hw.lock and so it can't
process the response message, this will lead to step1 fail.

The solution:
1. make all threads could process the mailbox response message, which
protected by the hw.cmq.crq.lock.
2. use the following rules to avoid deadlock:
2.1. ensure use the correct locking sequence: hw.lock >
hw.mbx_resp.lock > hw.cmq.crq.lock.
2.2. make sure don't acquire such as hw.lock & hw.mbx_resp.lock again
when process mailbox response message.

Fixes: 463e748964f5 ("net/hns3: support mailbox")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
---
 drivers/net/hns3/hns3_ethdev.h    |  1 -
 drivers/net/hns3/hns3_ethdev_vf.c |  3 ---
 drivers/net/hns3/hns3_mbx.c       | 51 +++++++++++----------------------------
 3 files changed, 14 insertions(+), 41 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index ac255a3..b11457b 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -439,7 +439,6 @@ struct hns3_hw {
 	struct hns3_cmq cmq;
 	struct hns3_mbx_resp_status mbx_resp; /* mailbox response */
 	struct hns3_mbx_arq_ring arq;         /* mailbox async rx queue */
-	pthread_t irq_thread_id;
 	struct hns3_mac mac;
 	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
 	struct hns3_tqp_stats tqp_stats;
diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
index 6c3ddcc..f70106d 100644
--- a/drivers/net/hns3/hns3_ethdev_vf.c
+++ b/drivers/net/hns3/hns3_ethdev_vf.c
@@ -1109,9 +1109,6 @@ hns3vf_interrupt_handler(void *param)
 	enum hns3vf_evt_cause event_cause;
 	uint32_t clearval;
 
-	if (hw->irq_thread_id == 0)
-		hw->irq_thread_id = pthread_self();
-
 	/* Disable interrupt */
 	hns3vf_disable_irq0(hw);
 
diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c
index 61d1584..ad0b4e6 100644
--- a/drivers/net/hns3/hns3_mbx.c
+++ b/drivers/net/hns3/hns3_mbx.c
@@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code)
 	return -EIO;
 }
 
-static void
-hns3_poll_all_sync_msg(void)
-{
-	struct rte_eth_dev *eth_dev;
-	struct hns3_adapter *adapter;
-	const char *name;
-	uint16_t port_id;
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		eth_dev = &rte_eth_devices[port_id];
-		name = eth_dev->device->driver->name;
-		if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf"))
-			continue;
-		adapter = eth_dev->data->dev_private;
-		if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED)
-			continue;
-		/* Synchronous msg, the mbx_resp.req_msg_data is non-zero */
-		if (adapter->hw.mbx_resp.req_msg_data)
-			hns3_dev_handle_mbx_msg(&adapter->hw);
-	}
-}
-
 static int
 hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
 		  uint8_t *resp_data, uint16_t resp_len)
 {
 #define HNS3_MAX_RETRY_MS	500
+#define HNS3_WAIT_RESP_US	100
 	struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
 	struct hns3_mbx_resp_status *mbx_resp;
-	bool in_irq = false;
 	uint64_t now;
 	uint64_t end;
 
@@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
 			return -EIO;
 		}
 
-		/*
-		 * The mbox response is running on the interrupt thread.
-		 * Sending mbox in the interrupt thread cannot wait for the
-		 * response, so polling the mbox response on the irq thread.
-		 */
-		if (pthread_equal(hw->irq_thread_id, pthread_self())) {
-			in_irq = true;
-			hns3_poll_all_sync_msg();
-		} else {
-			rte_delay_ms(HNS3_POLL_RESPONE_MS);
-		}
+		hns3_dev_handle_mbx_msg(hw);
+		rte_delay_us(HNS3_WAIT_RESP_US);
+
 		now = get_timeofday_ms();
 	}
 	hw->mbx_resp.req_msg_data = 0;
 	if (now >= end) {
 		hw->mbx_resp.lost++;
 		hns3_err(hw,
-			 "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d",
+			 "VF could not get mbx(%u,%u) head(%u) tail(%u) "
+			 "lost(%u) from PF",
 			 code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
-			 hw->mbx_resp.lost, in_irq);
+			 hw->mbx_resp.lost);
 		return -ETIME;
 	}
 	rte_io_rmb();
@@ -368,9 +339,13 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
 	uint8_t *temp;
 	int i;
 
+	rte_spinlock_lock(&hw->cmq.crq.lock);
+
 	while (!hns3_cmd_crq_empty(hw)) {
-		if (__atomic_load_n(&hw->reset.disable_cmd, __ATOMIC_RELAXED))
+		if (__atomic_load_n(&hw->reset.disable_cmd, __ATOMIC_RELAXED)) {
+			rte_spinlock_unlock(&hw->cmq.crq.lock);
 			return;
+		}
 
 		desc = &crq->desc[crq->next_to_use];
 		req = (struct hns3_mbx_pf_to_vf_cmd *)desc->data;
@@ -441,4 +416,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
 
 	/* Write back CMDQ_RQ header pointer, IMP need this pointer */
 	hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use);
+
+	rte_spinlock_unlock(&hw->cmq.crq.lock);
 }
-- 
2.7.4


  parent reply	other threads:[~2021-03-31 10:02 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-31 10:01 [dpdk-dev] [PATCH 00/10] some bugfixes for hns3 PMD Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 01/10] net/hns3: fix some function names for copper media type Min Hu (Connor)
2021-03-31 10:01 ` Min Hu (Connor) [this message]
2021-03-31 10:01 ` [dpdk-dev] [PATCH 03/10] net/hns3: fix the FLR miss detection Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 04/10] net/hns3: fix lack of rollback after setting PVID failed Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 05/10] net/hns3: fix flow control exception Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 06/10] net/hns3: fix flow counter not cleared when create Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 07/10] net/hns3: fix VF mailbox head field wrong update Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 08/10] net/hns3: support get device version when dump register Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 09/10] net/hns3: delete redundant blank line Min Hu (Connor)
2021-03-31 10:01 ` [dpdk-dev] [PATCH 10/10] net/hns3: fix code style static warning Min Hu (Connor)
2021-04-08 17:02 ` [dpdk-dev] [PATCH 00/10] some bugfixes for hns3 PMD Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1617184904-55349-3-git-send-email-humin29@huawei.com \
    --to=humin29@huawei.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git