From: "Xueming(Steven) Li" <xuemingl@nvidia.com>
To: "Min Hu (Connor)" <humin29@huawei.com>,
"stable@dpdk.org" <stable@dpdk.org>
Cc: "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-stable] [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
Date: Mon, 17 May 2021 06:58:49 +0000 [thread overview]
Message-ID: <BY5PR12MB43244C2FA678E3BAD47BC024A12D9@BY5PR12MB4324.namprd12.prod.outlook.com> (raw)
In-Reply-To: <1621069664-39500-1-git-send-email-humin29@huawei.com>
Hi Min,
> -----Original Message-----
> From: Min Hu (Connor) <humin29@huawei.com>
> Sent: Saturday, May 15, 2021 5:08 PM
> To: stable@dpdk.org
> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
> Subject: [PATCH 20.11] net/hns3: fix setting default MAC address in bonding of VF
>
> From: Chengwen Feng <fengchengwen@huawei.com>
>
> [ upstream commit 76a3836b98c4af6b9aaeaaa50907fe6143d31c55 ]
>
> When start testpmd with two hns3 VFs(0000:bd:01.0, 0000:bd:01.7), and then execute the following commands:
> testpmd> create bonded device 1 0
> testpmd> set bonding mac_addr 2 3c:12:34:56:78:9a
> testpmd> add bonding slave 0 2
> testpmd> add bonding slave 1 2
> testpmd> set portmask 0x4
> testpmd> port start 2
>
> It will occurs the following error in a low probability:
> 0000:bd:01.0 hns3_get_mbx_resp(): VF could not get mbx(3,0)
> head(16) tail(15) lost(1) from PF in_irq:0
> 0000:bd:01.0 hns3vf_set_default_mac_addr(): Failed to set mac
> addr(3C:**:**:**:78:9A) for vf: -62
> mac_address_slaves_update(1541) - Failed to update port Id 0
> MAC address
>
> The problem replay:
> 1. The 'port start 2' command will start slave ports and then set slave
> mac address, the function call flow: bond_ethdev_start ->
> mac_address_slaves_update.
> 2. There are also a monitor task which running in intr thread will check
> slave ports link status and update slave ports mac address, the
> function call flow: bond_ethdev_slave_link_status_change_monitor ->
> bond_ethdev_lsc_event_callback -> mac_address_slaves_update.
> 3. Because the above step1&2 running on different threads, they may both
> call drivers ops mac_addr_set which is hns3vf_set_default_mac_addr.
> 4. hns3vf_set_default_mac_addr will first acquire hw.lock and then send
> mailbox to PF and wait PF's response message. Note: the PF's
> response is an independent message which will received in hw.cmq.crq,
> the receiving operation can only performed in intr thread.
> 5. So if the step1 operation hold the hw.lock and try get response
> message, and step2 operation try acquire the hw.lock and so it can't
> process the response message, this will lead to step1 fail.
>
> The solution:
> 1. make all threads could process the mailbox response message, which
> protected by the hw.cmq.crq.lock.
> 2. use the following rules to avoid deadlock:
> 2.1. ensure use the correct locking sequence: hw.lock >
> hw.mbx_resp.lock > hw.cmq.crq.lock.
> 2.2. make sure don't acquire such as hw.lock & hw.mbx_resp.lock again
> when process mailbox response message.
>
> Fixes: 463e748964f5 ("net/hns3: support mailbox")
> Cc: stable@dpdk.org
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
> drivers/net/hns3/hns3_ethdev.h | 1 -
> drivers/net/hns3/hns3_ethdev_vf.c | 3 ---
> drivers/net/hns3/hns3_mbx.c | 47 +++++++++------------------------------
> 3 files changed, 11 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h index 4c40df1..2065592 100644
> --- a/drivers/net/hns3/hns3_ethdev.h
> +++ b/drivers/net/hns3/hns3_ethdev.h
> @@ -424,7 +424,6 @@ struct hns3_hw {
> struct hns3_cmq cmq;
> struct hns3_mbx_resp_status mbx_resp; /* mailbox response */
> struct hns3_mbx_arq_ring arq; /* mailbox async rx queue */
> - pthread_t irq_thread_id;
> struct hns3_mac mac;
> unsigned int secondary_cnt; /* Number of secondary processes init'd. */
> struct hns3_tqp_stats tqp_stats;
> diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
> index 52ad825..99275f1 100644
> --- a/drivers/net/hns3/hns3_ethdev_vf.c
> +++ b/drivers/net/hns3/hns3_ethdev_vf.c
> @@ -1112,9 +1112,6 @@ hns3vf_interrupt_handler(void *param)
> enum hns3vf_evt_cause event_cause;
> uint32_t clearval;
>
> - if (hw->irq_thread_id == 0)
> - hw->irq_thread_id = pthread_self();
> -
> /* Disable interrupt */
> hns3vf_disable_irq0(hw);
>
> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c index d2a5db8..975a60b 100644
> --- a/drivers/net/hns3/hns3_mbx.c
> +++ b/drivers/net/hns3/hns3_mbx.c
> @@ -40,36 +40,14 @@ hns3_resp_to_errno(uint16_t resp_code)
> return -EIO;
> }
>
> -static void
> -hns3_poll_all_sync_msg(void)
> -{
> - struct rte_eth_dev *eth_dev;
> - struct hns3_adapter *adapter;
> - const char *name;
> - uint16_t port_id;
> -
> - RTE_ETH_FOREACH_DEV(port_id) {
> - eth_dev = &rte_eth_devices[port_id];
> - name = eth_dev->device->driver->name;
> - if (strcmp(name, "net_hns3") && strcmp(name, "net_hns3_vf"))
> - continue;
> - adapter = eth_dev->data->dev_private;
> - if (!adapter || adapter->hw.adapter_state == HNS3_NIC_CLOSED)
> - continue;
> - /* Synchronous msg, the mbx_resp.req_msg_data is non-zero */
> - if (adapter->hw.mbx_resp.req_msg_data)
> - hns3_dev_handle_mbx_msg(&adapter->hw);
> - }
> -}
> -
> static int
> hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
> uint8_t *resp_data, uint16_t resp_len) {
> #define HNS3_MAX_RETRY_MS 500
> +#define HNS3_WAIT_RESP_US 100
> struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
> struct hns3_mbx_resp_status *mbx_resp;
> - bool in_irq = false;
> uint64_t now;
> uint64_t end;
>
> @@ -96,26 +74,19 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
> return -EIO;
> }
>
> - /*
> - * The mbox response is running on the interrupt thread.
> - * Sending mbox in the interrupt thread cannot wait for the
> - * response, so polling the mbox response on the irq thread.
> - */
> - if (pthread_equal(hw->irq_thread_id, pthread_self())) {
> - in_irq = true;
> - hns3_poll_all_sync_msg();
> - } else {
> - rte_delay_ms(HNS3_POLL_RESPONE_MS);
> - }
> + hns3_dev_handle_mbx_msg(hw);
> + rte_delay_us(HNS3_WAIT_RESP_US);
> +
> now = get_timeofday_ms();
> }
> hw->mbx_resp.req_msg_data = 0;
> if (now >= end) {
> hw->mbx_resp.lost++;
> hns3_err(hw,
> - "VF could not get mbx(%u,%u) head(%u) tail(%u) lost(%u) from PF in_irq:%d",
> + "VF could not get mbx(%u,%u) head(%u) tail(%u) "
> + "lost(%u) from PF",
> code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
> - hw->mbx_resp.lost, in_irq);
> + hw->mbx_resp.lost);
> return -ETIME;
> }
> rte_io_rmb();
> @@ -365,9 +336,11 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
> uint16_t flag;
> uint8_t *temp;
> int i;
> + rte_spinlock_lock(&hw->cmq.crq.lock);
>
> while (!hns3_cmd_crq_empty(hw)) {
> if (rte_atomic16_read(&hw->reset.disable_cmd))
> + rte_spinlock_unlock(&hw->cmq.crq.lock);
> return;
Seems "{ }" needed around if block, added during merge, thanks.
>
> desc = &crq->desc[crq->next_to_use];
> @@ -439,4 +412,6 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
>
> /* Write back CMDQ_RQ header pointer, IMP need this pointer */
> hns3_write_dev(hw, HNS3_CMDQ_RX_HEAD_REG, crq->next_to_use);
> +
> + rte_spinlock_unlock(&hw->cmq.crq.lock);
> }
> --
> 2.7.4
next prev parent reply other threads:[~2021-05-17 6:58 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-15 9:07 Min Hu (Connor)
2021-05-17 6:58 ` Xueming(Steven) Li [this message]
2021-05-17 7:27 ` Min Hu (Connor)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=BY5PR12MB43244C2FA678E3BAD47BC024A12D9@BY5PR12MB4324.namprd12.prod.outlook.com \
--to=xuemingl@nvidia.com \
--cc=ferruh.yigit@intel.com \
--cc=humin29@huawei.com \
--cc=stable@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).