patches for DPDK stable branches
 help / color / mirror / Atom feed
From: "Xueming(Steven) Li" <xuemingl@nvidia.com>
To: "Min Hu (Connor)" <humin29@huawei.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Cc: "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-stable] [PATCH 20.11 1/4] net/hns3: fix possible mismatched response of mailbox
Date: Tue, 18 May 2021 04:14:51 +0000	[thread overview]
Message-ID: <BY5PR12MB4324F541F08D33BC420BEC30A12C9@BY5PR12MB4324.namprd12.prod.outlook.com> (raw)
In-Reply-To: <1621259445-5182-1-git-send-email-humin29@huawei.com>

Thanks, merged.

> -----Original Message-----
> From: Min Hu (Connor) <humin29@huawei.com>
> Sent: Monday, May 17, 2021 9:51 PM
> To: stable@dpdk.org
> Cc: ferruh.yigit@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
> Subject: [PATCH 20.11 1/4] net/hns3: fix possible mismatched response of mailbox
> 
> From: Chengwen Feng <fengchengwen@huawei.com>
> 
> [ upstream commit c8dec72de00039f367c1a8cd1f40378d04cf3e8f ]
> 
> Currently, the mailbox synchronous communication between VF and PF use the following fields to maintain communication:
> 1. Req_msg_data which was combined by message code and subcode, used to
>    match request and response.
> 2. Head which means the number of requests successfully sent by VF.
> 3. Tail which means the number of responses successfully received by VF.
> 4. Lost which means the number of requests which are timeout.
> 
> There may possible mismatches of the following situation:
> 1. VF sends message A with code=1 subcode=1.
> 	Then head=1, tail=0, lost=0.
> 2. PF was blocked about 500ms when processing the message A.
> 3. VF will detect message A timeout because it can't get the response within 500ms.
> 	Then head=1, tail=0, lost=1.
> 4. VF sends message B with code=1 subcode=1 which equal message A.
> 	Then head=2, tail=0, lost=1.
> 5. PF processes the first message A and send the response message to VF.
> 6. VF will update tail field to 1, but the lost field will remain
>    unchanged because the code/subcode equal message B's, so driver will
>    return success because now the head(2) equals tail(1) plus lost(1).
>    This will lead to mismatch of request and response.
> 
> To fix the above bug, we use the following scheme:
> 1. The message sent from VF was labelled with match_id which was a
>    unique 16-bit non-zero value.
> 2. The response sent from PF will label with match_id which got from the
>    request.
> 3. The VF uses the match_id to match request and response message.
> 
> This scheme depends on the PF driver, if the PF driver don't support then VF will uses the original scheme.
> 
> Fixes: 463e748964f5 ("net/hns3: support mailbox")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
> ---
>  drivers/net/hns3/hns3_mbx.c | 120 +++++++++++++++++++++++++++++++++++---------
>  drivers/net/hns3/hns3_mbx.h |  20 +++++++-
>  2 files changed, 114 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/hns3/hns3_mbx.c b/drivers/net/hns3/hns3_mbx.c index 975a60b..79ac16a 100644
> --- a/drivers/net/hns3/hns3_mbx.c
> +++ b/drivers/net/hns3/hns3_mbx.c
> @@ -40,14 +40,32 @@ hns3_resp_to_errno(uint16_t resp_code)
>  	return -EIO;
>  }
> 
> +static void
> +hns3_mbx_proc_timeout(struct hns3_hw *hw, uint16_t code, uint16_t
> +subcode) {
> +	if (hw->mbx_resp.matching_scheme ==
> +	    HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL) {
> +		hw->mbx_resp.lost++;
> +		hns3_err(hw,
> +			 "VF could not get mbx(%u,%u) head(%u) tail(%u) "
> +			 "lost(%u) from PF",
> +			 code, subcode, hw->mbx_resp.head, hw->mbx_resp.tail,
> +			 hw->mbx_resp.lost);
> +		return;
> +	}
> +
> +	hns3_err(hw, "VF could not get mbx(%u,%u) from PF", code, subcode); }
> +
>  static int
> -hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
> +hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
>  		  uint8_t *resp_data, uint16_t resp_len)  {
>  #define HNS3_MAX_RETRY_MS	500
>  #define HNS3_WAIT_RESP_US	100
>  	struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
>  	struct hns3_mbx_resp_status *mbx_resp;
> +	bool received;
>  	uint64_t now;
>  	uint64_t end;
> 
> @@ -59,8 +77,7 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
> 
>  	now = get_timeofday_ms();
>  	end = now + HNS3_MAX_RETRY_MS;
> -	while ((hw->mbx_resp.head != hw->mbx_resp.tail + hw->mbx_resp.lost) &&
> -	       (now < end)) {
> +	while (now < end) {
>  		if (rte_atomic16_read(&hw->reset.disable_cmd)) {
>  			hns3_err(hw, "Don't wait for mbx respone because of "
>  				 "disable_cmd");
> @@ -77,16 +94,20 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>  		hns3_dev_handle_mbx_msg(hw);
>  		rte_delay_us(HNS3_WAIT_RESP_US);
> 
> +		if (hw->mbx_resp.matching_scheme ==
> +		    HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL)
> +			received = (hw->mbx_resp.head ==
> +				    hw->mbx_resp.tail + hw->mbx_resp.lost);
> +		else
> +			received = hw->mbx_resp.received_match_resp;
> +		if (received)
> +			break;
> +
>  		now = get_timeofday_ms();
>  	}
>  	hw->mbx_resp.req_msg_data = 0;
>  	if (now >= end) {
> -		hw->mbx_resp.lost++;
> -		hns3_err(hw,
> -			 "VF could not get mbx(%u,%u) head(%u) tail(%u) "
> -			 "lost(%u) from PF",
> -			 code0, code1, hw->mbx_resp.head, hw->mbx_resp.tail,
> -			 hw->mbx_resp.lost);
> +		hns3_mbx_proc_timeout(hw, code, subcode);
>  		return -ETIME;
>  	}
>  	rte_io_rmb();
> @@ -101,6 +122,29 @@ hns3_get_mbx_resp(struct hns3_hw *hw, uint16_t code0, uint16_t code1,
>  	return 0;
>  }
> 
> +static void
> +hns3_mbx_prepare_resp(struct hns3_hw *hw, uint16_t code, uint16_t
> +subcode) {
> +	/*
> +	 * Init both matching scheme fields because we may not know the exact
> +	 * scheme will be used when in the initial phase.
> +	 *
> +	 * Also, there are OK to init both matching scheme fields even though
> +	 * we get the exact scheme which is used.
> +	 */
> +	hw->mbx_resp.req_msg_data = (uint32_t)code << 16 | subcode;
> +	hw->mbx_resp.head++;
> +
> +	/* Update match_id and ensure the value of match_id is not zero */
> +	hw->mbx_resp.match_id++;
> +	if (hw->mbx_resp.match_id == 0)
> +		hw->mbx_resp.match_id = 1;
> +	hw->mbx_resp.received_match_resp = false;
> +
> +	hw->mbx_resp.resp_status = 0;
> +	memset(hw->mbx_resp.additional_info, 0, HNS3_MBX_MAX_RESP_DATA_SIZE);
> +}
> +
>  int
>  hns3_send_mbx_msg(struct hns3_hw *hw, uint16_t code, uint16_t subcode,
>  		  const uint8_t *msg_data, uint8_t msg_len, bool need_resp, @@ -138,8 +182,8 @@ hns3_send_mbx_msg(struct
> hns3_hw *hw, uint16_t code, uint16_t subcode,
>  	if (need_resp) {
>  		req->mbx_need_resp |= HNS3_MBX_NEED_RESP_BIT;
>  		rte_spinlock_lock(&hw->mbx_resp.lock);
> -		hw->mbx_resp.req_msg_data = (uint32_t)code << 16 | subcode;
> -		hw->mbx_resp.head++;
> +		hns3_mbx_prepare_resp(hw, code, subcode);
> +		req->match_id = hw->mbx_resp.match_id;
>  		ret = hns3_cmd_send(hw, &desc, 1);
>  		if (ret) {
>  			rte_spinlock_unlock(&hw->mbx_resp.lock);
> @@ -251,6 +295,46 @@ hns3_update_resp_position(struct hns3_hw *hw, uint32_t resp_msg)  }
> 
>  static void
> +hns3_handle_mbx_response(struct hns3_hw *hw, struct
> +hns3_mbx_pf_to_vf_cmd *req) {
> +	struct hns3_mbx_resp_status *resp = &hw->mbx_resp;
> +	uint32_t msg_data;
> +
> +	if (req->match_id != 0) {
> +		/*
> +		 * If match_id is not zero, it means PF support copy request's
> +		 * match_id to its response. So VF could use the match_id
> +		 * to match the request.
> +		 */
> +		if (resp->matching_scheme !=
> +		    HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID) {
> +			resp->matching_scheme =
> +				HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID;
> +			hns3_info(hw, "detect mailbox support match id!");
> +		}
> +		if (req->match_id == resp->match_id) {
> +			resp->resp_status = hns3_resp_to_errno(req->msg[3]);
> +			memcpy(resp->additional_info, &req->msg[4],
> +			       HNS3_MBX_MAX_RESP_DATA_SIZE);
> +			rte_io_wmb();
> +			resp->received_match_resp = true;
> +		}
> +		return;
> +	}
> +
> +	/*
> +	 * If the below instructions can be executed, it means PF does not
> +	 * support copy request's match_id to its response. So VF follows the
> +	 * original scheme to process.
> +	 */
> +	resp->resp_status = hns3_resp_to_errno(req->msg[3]);
> +	memcpy(resp->additional_info, &req->msg[4],
> +	       HNS3_MBX_MAX_RESP_DATA_SIZE);
> +	msg_data = (uint32_t)req->msg[1] << 16 | req->msg[2];
> +	hns3_update_resp_position(hw, msg_data); }
> +
> +static void
>  hns3_link_fail_parse(struct hns3_hw *hw, uint8_t link_fail_code)  {
>  	switch (link_fail_code) {
> @@ -326,16 +410,12 @@ hns3_handle_promisc_info(struct hns3_hw *hw, uint16_t promisc_en)  void
> hns3_dev_handle_mbx_msg(struct hns3_hw *hw)  {
> -	struct hns3_mbx_resp_status *resp = &hw->mbx_resp;
>  	struct hns3_cmq_ring *crq = &hw->cmq.crq;
>  	struct hns3_mbx_pf_to_vf_cmd *req;
>  	struct hns3_cmd_desc *desc;
> -	uint32_t msg_data;
>  	uint16_t *msg_q;
>  	uint8_t opcode;
>  	uint16_t flag;
> -	uint8_t *temp;
> -	int i;
>  	rte_spinlock_lock(&hw->cmq.crq.lock);
> 
>  	while (!hns3_cmd_crq_empty(hw)) {
> @@ -361,15 +441,7 @@ hns3_dev_handle_mbx_msg(struct hns3_hw *hw)
> 
>  		switch (opcode) {
>  		case HNS3_MBX_PF_VF_RESP:
> -			resp->resp_status = hns3_resp_to_errno(req->msg[3]);
> -
> -			temp = (uint8_t *)&req->msg[4];
> -			for (i = 0; i < HNS3_MBX_MAX_RESP_DATA_SIZE; i++) {
> -				resp->additional_info[i] = *temp;
> -				temp++;
> -			}
> -			msg_data = (uint32_t)req->msg[1] << 16 | req->msg[2];
> -			hns3_update_resp_position(hw, msg_data);
> +			hns3_handle_mbx_response(hw, req);
>  			break;
>  		case HNS3_MBX_LINK_STAT_CHANGE:
>  		case HNS3_MBX_ASSERTING_RESET:
> diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h index 7f7ade1..8a9d7ba 100644
> --- a/drivers/net/hns3/hns3_mbx.h
> +++ b/drivers/net/hns3/hns3_mbx.h
> @@ -83,12 +83,26 @@ enum hns3_mbx_link_fail_subcode {
>  #define HNS3_MBX_RING_MAP_BASIC_MSG_NUM	3
>  #define HNS3_MBX_RING_NODE_VARIABLE_NUM	3
> 
> +enum {
> +	HNS3_MBX_RESP_MATCHING_SCHEME_OF_ORIGINAL = 0,
> +	HNS3_MBX_RESP_MATCHING_SCHEME_OF_MATCH_ID
> +};
> +
>  struct hns3_mbx_resp_status {
>  	rte_spinlock_t lock; /* protects against contending sync cmd resp */
> +
> +	uint8_t matching_scheme;
> +
> +	/* The following fields used in the matching scheme for original */
>  	uint32_t req_msg_data;
>  	uint32_t head;
>  	uint32_t tail;
>  	uint32_t lost;
> +
> +	/* The following fields used in the matching scheme for match_id */
> +	uint16_t match_id;
> +	bool received_match_resp;
> +
>  	int resp_status;
>  	uint8_t additional_info[HNS3_MBX_MAX_RESP_DATA_SIZE];
>  };
> @@ -106,7 +120,8 @@ struct hns3_mbx_vf_to_pf_cmd {
>  	uint8_t mbx_need_resp;
>  	uint8_t rsv1;
>  	uint8_t msg_len;
> -	uint8_t rsv2[3];
> +	uint8_t rsv2;
> +	uint16_t match_id;
>  	uint8_t msg[HNS3_MBX_MAX_MSG_SIZE];
>  };
> 
> @@ -114,7 +129,8 @@ struct hns3_mbx_pf_to_vf_cmd {
>  	uint8_t dest_vfid;
>  	uint8_t rsv[3];
>  	uint8_t msg_len;
> -	uint8_t rsv1[3];
> +	uint8_t rsv1;
> +	uint16_t match_id;
>  	uint16_t msg[8];
>  };
> 
> --
> 2.7.4


      parent reply	other threads:[~2021-05-18  4:14 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17 13:50 Min Hu (Connor)
2021-05-17 13:50 ` [dpdk-stable] [PATCH 20.11 2/4] net/hns3: fix VF handling LSC event in secondary process Min Hu (Connor)
2021-05-18  4:15   ` Xueming(Steven) Li
2021-05-17 13:50 ` [dpdk-stable] [PATCH 20.11 3/4] net/hns3: fix timing in mailbox Min Hu (Connor)
2021-05-18  4:27   ` Xueming(Steven) Li
2021-05-17 13:50 ` [dpdk-stable] [PATCH 20.11 4/4] net/hns3: fix verification of NEON support Min Hu (Connor)
2021-05-18  4:27   ` Xueming(Steven) Li
2021-05-18  4:14 ` Xueming(Steven) Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BY5PR12MB4324F541F08D33BC420BEC30A12C9@BY5PR12MB4324.namprd12.prod.outlook.com \
    --to=xuemingl@nvidia.com \
    --cc=ferruh.yigit@intel.com \
    --cc=humin29@huawei.com \
    --cc=stable@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).