From: "lihuisong (C)" <lihuisong@huawei.com>
To: Jie Hai <haijie1@huawei.com>, <dev@dpdk.org>,
<ferruh.yigit@amd.com>, Yisen Zhuang <yisen.zhuang@huawei.com>,
Chunsong Feng <fengchunsong@huawei.com>,
Ferruh Yigit <ferruh.yigit@intel.com>
Cc: <fengchengwen@huawei.com>
Subject: Re: [PATCH 1/2] net/hns3: fix sync mailbox failure forever
Date: Sat, 11 Nov 2023 11:21:15 +0800 [thread overview]
Message-ID: <e54190ed-853c-bb1c-001c-ce4b20f4a2d3@huawei.com> (raw)
In-Reply-To: <20231111015915.2776769-2-haijie1@huawei.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
在 2023/11/11 9:59, Jie Hai 写道:
> From: Dengdui Huang <huangdengdui@huawei.com>
>
> Currently, hns3 VF driver uses the following points to match
> the response and request message for the mailbox synchronous
> message between VF and PF.
> 1. req_msg_data which is consist of message code and subcode,
> is used to match request and response.
> 2. head means the number of send success for VF.
> 3. tail means the number of receive success for VF.
> 4. lost means the number of send timeout for VF.
> And 'head', 'tail' and 'lost' are dynamically updated during
> the communication.
>
> Now there is a issue that all sync mailbox message will
> send failure forever at the flollowing case:
> 1. VF sends the message A
> then head=UINT32_MAX-1, tail=UINT32_MAX-3, lost=2.
> 2. VF sends the message B
> then head=UINT32_MAX, tail=UINT32_MAX-2, lost=2.
> 3. VF sends the message C, the message will be timeout because
> it can't get the response within 500ms.
> then head=0, tail=0, lost=2
> note: tail is assigned to head if tail > head according to
> current code logic. From now on, all subsequent sync milbox
> messages fail to be sent.
>
> It's very complicated to use the fields 'lost','tail','head'.
> The code and subcode of the request sync mailbox are used as the
> matching code of the message, which is used to match the response
> message for receiving the synchronization response.
>
> This patch drops these fields and uses the following solution
> to solve this issue:
> In the handling response message process, using the req_msg_data
> of the request and response message to judge whether the sync
> mailbox message has been received.
>
> Fixes: 463e748964f5 ("net/hns3: support mailbox")
> Cc: stable@dpdk.org
>
> Signed-off-by: Dengdui Huang <huangdengdui@huawei.com>
> Signed-off-by: Jie Hai <haijie1@huawei.com>
> ---
> drivers/net/hns3/hns3_cmd.c | 3 --
> drivers/net/hns3/hns3_mbx.c | 81 ++++++-------------------------------
> drivers/net/hns3/hns3_mbx.h | 10 -----
> 3 files changed, 13 insertions(+), 81 deletions(-)
>
> diff --git a/drivers/net/hns3/hns3_cmd.c b/drivers/net/hns3/hns3_cmd.c
> index a5c4c11dc8c4..2c1664485bef 100644
> --- a/drivers/net/hns3/hns3_cmd.c
> +++ b/drivers/net/hns3/hns3_cmd.c
> @@ -731,9 +731,6 @@ hns3_cmd_init(struct hns3_hw *hw)
> hw->cmq.csq.next_to_use = 0;
> hw->cmq.crq.next_to_clean = 0;
> hw->cmq.crq.next_to_use = 0;
> - hw->mbx_resp.head = 0;
> - hw->mbx_resp.tail = 0;
> - hw->mbx_resp.lost = 0;
> hns3_cmd_init_regs(hw);
>
> rte_spinlock_unlock(&hw->cmq.crq.lock);
> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c
> index 8e0a58aa02d8..f1743c195efa 100644
> --- a/drivers/net/hns3/hns3_mbx.c
> +++ b/drivers/net/hns3/hns3_mbx.c
> @@ -40,23 +40,6 @@ hns3_resp_to_errno(uint16_t resp_code)
> return -EIO;
> }
>
> -static void
> -hns3_mbx_proc_timeout(struct hns3_hw *hw, uint16_t code, uint16_t subcode)
> -{
> - if (hw->mbx_resp.matching_scheme ==
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL) {
> - hw->mbx_resp.lost++;
> - hns3_err(hw,
> - "VF could not get mbx(%u,%u) head(%u) tail(%u) "
> - "lost(%u) from PF",
> - code, subcode, hw->mbx_resp.head, hw->mbx_resp.tail,
> - hw->mbx_resp.lost);
> - return;
> - }
> -
> - hns3_err(hw, "VF could not get mbx(%u,%u) from PF", code, subcode);
> -}
> -
> static int
> hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
> uint8_t *resp_data, uint16_t resp_len)
> @@ -67,7 +50,6 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
> struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
> struct hns3_mbx_resp_status *mbx_resp;
> uint32_t wait_time = 0;
> - bool received;
>
> if (resp_len > HNS3_MBX_MAX_RESP_DATA_SIZE) {
> hns3_err(hw, "VF mbx response len(=%u) exceeds maximum(=%d)",
> @@ -93,20 +75,14 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
> hns3_dev_handle_mbx_msg(hw);
> rte_delay_us(HNS3_WAIT_RESP_US);
>
> - if (hw->mbx_resp.matching_scheme ==
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL)
> - received = (hw->mbx_resp.head ==
> - hw->mbx_resp.tail + hw->mbx_resp.lost);
> - else
> - received = hw->mbx_resp.received_match_resp;
> - if (received)
> + if (hw->mbx_resp.received_match_resp)
> break;
>
> wait_time += HNS3_WAIT_RESP_US;
> }
> hw->mbx_resp.req_msg_data = 0;
> if (wait_time >= mbx_time_limit) {
> - hns3_mbx_proc_timeout(hw, code, subcode);
> + hns3_err(hw, "VF could not get mbx(%u,%u) from PF", code, subcode);
> return -ETIME;
> }
> rte_io_rmb();
> @@ -132,7 +108,6 @@ hns3_mbx_prepare_resp(struct hns3_hw *hw, uint16_t code, uint16_t subcode)
> * we get the exact scheme which is used.
> */
> hw->mbx_resp.req_msg_data = (uint32_t)code << 16 | subcode;
> - hw->mbx_resp.head++;
>
> /* Update match_id and ensure the value of match_id is not zero */
> hw->mbx_resp.match_id++;
> @@ -185,7 +160,6 @@ hns3_send_mbx_msg(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
> req->match_id = hw->mbx_resp.match_id;
> ret = hns3_cmd_send(hw, &desc, 1);
> if (ret) {
> - hw->mbx_resp.head--;
> rte_spinlock_unlock(&hw->mbx_resp.lock);
> hns3_err(hw, "VF failed(=%d) to send mbx message to PF",
> ret);
> @@ -254,41 +228,10 @@ hns3_handle_asserting_reset(struct hns3_hw *hw,
> hns3_schedule_reset(HNS3_DEV_HW_TO_ADAPTER(hw));
> }
>
> -/*
> - * Case1: receive response after timeout, req_msg_data
> - * is 0, not equal resp_msg, do lost--
> - * Case2: receive last response during new send_mbx_msg,
> - * req_msg_data is different with resp_msg, let
> - * lost--, continue to wait for response.
> - */
> -static void
> -hns3_update_resp_position(struct hns3_hw *hw, uint32_t resp_msg)
> -{
> - struct hns3_mbx_resp_status *resp = &hw->mbx_resp;
> - uint32_t tail = resp->tail + 1;
> -
> - if (tail > resp->head)
> - tail = resp->head;
> - if (resp->req_msg_data != resp_msg) {
> - if (resp->lost)
> - resp->lost--;
> - hns3_warn(hw, "Received a mismatched response req_msg(%x) "
> - "resp_msg(%x) head(%u) tail(%u) lost(%u)",
> - resp->req_msg_data, resp_msg, resp->head, tail,
> - resp->lost);
> - } else if (tail + resp->lost > resp->head) {
> - resp->lost--;
> - hns3_warn(hw, "Received a new response again resp_msg(%x) "
> - "head(%u) tail(%u) lost(%u)", resp_msg,
> - resp->head, tail, resp->lost);
> - }
> - rte_io_wmb();
> - resp->tail = tail;
> -}
> -
> static void
> hns3_handle_mbx_response(struct hns3_hw *hw, struct hns3_mbx_pf_to_vf_cmd *req)
> {
> +#define HNS3_MBX_RESP_CODE_OFFSET 16
> struct hns3_mbx_resp_status *resp = &hw->mbx_resp;
> uint32_t msg_data;
>
> @@ -298,12 +241,6 @@ hns3_handle_mbx_response(struct hns3_hw *hw, struct hns3_mbx_pf_to_vf_cmd *req)
> * match_id to its response. So VF could use the match_id
> * to match the request.
> */
> - if (resp->matching_scheme !=
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID) {
> - resp->matching_scheme =
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID;
> - hns3_info(hw, "detect mailbox support match id!");
> - }
> if (req->match_id == resp->match_id) {
> resp->resp_status = hns3_resp_to_errno(req->msg[3]);
> memcpy(resp->additional_info, &req->msg[4],
> @@ -319,11 +256,19 @@ hns3_handle_mbx_response(struct hns3_hw *hw, struct hns3_mbx_pf_to_vf_cmd *req)
> * support copy request's match_id to its response. So VF follows the
> * original scheme to process.
> */
> + msg_data = (uint32_t)req->msg[1] << HNS3_MBX_RESP_CODE_OFFSET | req->msg[2];
> + if (resp->req_msg_data != msg_data) {
> + hns3_warn(hw,
> + "received response tag (%u) is mismatched with requested tag (%u)",
> + msg_data, resp->req_msg_data);
> + return;
> + }
> +
> resp->resp_status = hns3_resp_to_errno(req->msg[3]);
> memcpy(resp->additional_info, &req->msg[4],
> HNS3_MBX_MAX_RESP_DATA_SIZE);
> - msg_data = (uint32_t)req->msg[1] << 16 | req->msg[2];
> - hns3_update_resp_position(hw, msg_data);
> + rte_io_wmb();
> + resp->received_match_resp = true;
> }
>
> static void
> diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h
> index c378783c6ca4..4a328802b920 100644
> --- a/drivers/net/hns3/hns3_mbx.h
> +++ b/drivers/net/hns3/hns3_mbx.h
> @@ -93,21 +93,11 @@ enum hns3_mbx_link_fail_subcode {
> #define HNS3_MBX_MAX_RESP_DATA_SIZE 8
> #define HNS3_MBX_DEF_TIME_LIMIT_MS 500
>
> -enum {
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL = 0,
> - HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID
> -};
> -
> struct hns3_mbx_resp_status {
> rte_spinlock_t lock; /* protects against contending sync cmd resp */
>
> - uint8_t matching_scheme;
> -
> /* The following fields used in the matching scheme for original */
> uint32_t req_msg_data;
> - uint32_t head;
> - uint32_t tail;
> - uint32_t lost;
>
> /* The following fields used in the matching scheme for match_id */
> uint16_t match_id;
next prev parent reply other threads:[~2023-11-11 3:21 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-11 1:59 [PATCH 0/2] net/hns3: fix mailbox bug and replace __atomic_xxx API Jie Hai
2023-11-11 1:59 ` [PATCH 1/2] net/hns3: fix sync mailbox failure forever Jie Hai
2023-11-11 3:21 ` lihuisong (C) [this message]
2023-11-11 5:20 ` Ferruh Yigit
2023-11-11 1:59 ` [PATCH 2/2] net/hns3: use stdatomic API Jie Hai
2023-11-11 3:20 ` lihuisong (C)
2023-11-11 5:22 ` [PATCH 0/2] net/hns3: fix mailbox bug and replace __atomic_xxx API Ferruh Yigit
2023-12-07 1:42 ` [PATCH v2 0/4] net/hns3: bugfix on reset and stdatomic API Jie Hai
2023-12-07 1:42 ` [PATCH v2 1/4] net/hns3: fix VF multiple count on one reset Jie Hai
2023-12-07 1:42 ` [PATCH v2 2/4] net/hns3: fix disable command with firmware Jie Hai
2023-12-07 1:42 ` [PATCH v2 3/4] net/hns3: fix incorrect reset level comparison Jie Hai
2023-12-07 1:42 ` [PATCH v2 4/4] net/hns3: use stdatomic API Jie Hai
2023-12-07 13:57 ` Ferruh Yigit
2023-12-07 17:42 ` Ferruh Yigit
2023-12-08 7:24 ` Jie Hai
2023-12-08 9:54 ` Ferruh Yigit
2023-12-08 7:08 ` Jie Hai
2023-12-07 13:57 ` Ferruh Yigit
2023-12-08 7:44 ` [PATCH v3 0/3] net/hns3: bugfix on reset Jie Hai
2023-12-08 7:44 ` [PATCH v3 1/3] net/hns3: fix VF multiple count on one reset Jie Hai
2023-12-08 7:44 ` [PATCH v3 2/3] net/hns3: fix disable command with firmware Jie Hai
2023-12-08 7:44 ` [PATCH v3 3/3] net/hns3: fix incorrect reset level comparison Jie Hai
2023-12-08 12:31 ` [PATCH v3 0/3] net/hns3: bugfix on reset Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e54190ed-853c-bb1c-001c-ce4b20f4a2d3@huawei.com \
--to=lihuisong@huawei.com \
--cc=dev@dpdk.org \
--cc=fengchengwen@huawei.com \
--cc=fengchunsong@huawei.com \
--cc=ferruh.yigit@amd.com \
--cc=ferruh.yigit@intel.com \
--cc=haijie1@huawei.com \
--cc=yisen.zhuang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).