From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f169.google.com (mail-wi0-f169.google.com [209.85.212.169]) by dpdk.org (Postfix) with ESMTP id 7DB91594B for ; Tue, 25 Aug 2015 20:52:39 +0200 (CEST) Received: by wicja10 with SMTP id ja10so23659442wic.1 for ; Tue, 25 Aug 2015 11:52:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=6Mhbd85HSDu2EXbJi9NGdMIvgFoZ2OMTWo32HP9F/1w=; b=M9tI/HrRcxPGv3VjgFkd3vqQnd5ae5tXAJ8hyTyHQtCmuont6JNn2M4e2kX79S5g3x ss9p9lGLj+slq1bGrC5WSrFK6XmYqQW+UA4ikeoUFrjBg06YM9L1qkCGjbCOyW+DpfTP nhcI/0ylAOX+61rMzYPFqXj3jN45zIDg5aGtNc6HsSyqokY4Ft4B8tBbOLBVO2wc5AJq lXEVqsUEUs5S0wQXWnR3vIsAcJKx97V5lUkfBMReQ7ryM3w03Id+i83xnbGDgEHxSlEn 5mShc2WSzJz1MYTSl/T5ubGMfhu3utYfSBFYb68MIHmZZN3tGxbj9TjKBerH4ujTxaBM 7Nrw== X-Gm-Message-State: ALoCoQkxRUqEOhgbBN/j2YZF/yKAWtk1hZMQHaCmkGgA1P8dQHGFIvCWtR9g5ugRvMrpiozJ/EOz X-Received: by 10.180.104.68 with SMTP id gc4mr7118539wib.78.1440528759190; Tue, 25 Aug 2015 11:52:39 -0700 (PDT) Received: from [10.0.0.4] ([109.64.112.195]) by smtp.googlemail.com with ESMTPSA id ck18sm150938wjb.47.2015.08.25.11.52.37 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 25 Aug 2015 11:52:38 -0700 (PDT) To: "Zhang, Helin" References: <1439489195-31553-1-git-send-email-vladz@cloudius-systems.com> <55CD7EA5.6060100@cloudius-systems.com> <6A0DE07E22DDAD4C9103DF62FEBC0909D3E116@shsmsx102.ccr.corp.intel.com> From: Vlad Zolotarov Message-ID: <55DCB975.2030000@cloudius-systems.com> Date: Tue, 25 Aug 2015 21:52:37 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Aug 2015 18:52:39 -0000 On 08/25/15 21:43, Zhang, Helin wrote: > > Hi Vlad > > I think this could possibly be the root cause of your TX hang issue. > Please try to limit the number to 8 or less, and then see if the issue > will still be there or not? > Helin, the issue has been seen on x540 devices. Pls., see a chapter 7.2.1.1 of x540 devices spec: A packet (or multiple packets in transmit segmentation) can span any number of buffers (and their descriptors) up to a limit of 40 minus WTHRESH minus 2 (see Section 7.2.3.3 for Tx Ring details and section Section 7.2.3.5.1 for WTHRESH details). For best performance it is recommended to minimize the number of buffers as possible. Could u, pls., clarify why do u think that the maximum number of data buffers is limited by 8? thanks, vlad > It does not have any check for the number of descriptors to be used > for a single packet, and it relies on the users to give correct mbuf > chains. > > We may need a check of this somewhere. Of cause the point you > indicated we also need to carefully investigate or fix. > > Regards, > > Helin > > *From:*Vladislav Zolotarov [mailto:vladz@cloudius-systems.com] > *Sent:* Tuesday, August 25, 2015 11:34 AM > *To:* Zhang, Helin > *Cc:* Lu, Wenzhuo; dev@dpdk.org > *Subject:* RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh > above 1 for all NICs but 82598 > > > On Aug 25, 2015 21:14, "Zhang, Helin" > wrote: > > > > Hi Vlad > > > > > > > > In addition, I’d double check with you what’s the maximum number of > descriptors would be used for a single packet transmitting? > > > > Datasheet said that it supports up to 8. I am wondering if more than > 8 were used in your case? > > If memory serves me well the maximum number of data descriptors per > single xmit packet is 40 minus 2 minus WTHRESH. Since WTHRESH in DPDK > is always zero it gives us 38 segments. We limit them by 33. > > > > > Thank you very much! > > > > > > > > Regards, > > > > Helin > > > > > > > > From: Zhang, Helin > > Sent: Wednesday, August 19, 2015 10:29 AM > > To: Vladislav Zolotarov > > Cc: Lu, Wenzhuo; dev@dpdk.org > > > > Subject: RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh > above 1 for all NICs but 82598 > > > > > > > > Hi Vlad > > > > > > > > Thank you very much for the patches! Give me a few more time to > double check with more guys, and possibly hardware experts. > > > > > > > > Regards, > > > > Helin > > > > > > > > From: Vladislav Zolotarov [mailto:vladz@cloudius-systems.com > ] > > Sent: Tuesday, August 18, 2015 9:56 PM > > To: Lu, Wenzhuo > > Cc: dev@dpdk.org ; Zhang, Helin > > Subject: RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh > above 1 for all NICs but 82598 > > > > > > > > > > On Aug 19, 2015 03:42, "Lu, Wenzhuo" > wrote: > > > > > > Hi Helin, > > > > > > > -----Original Message----- > > > > From: dev [mailto:dev-bounces@dpdk.org > ] On Behalf Of Vlad Zolotarov > > > > Sent: Friday, August 14, 2015 1:38 PM > > > > To: Zhang, Helin; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid > tx_rs_thresh above 1 for > > > > all NICs but 82598 > > > > > > > > > > > > > > > > On 08/13/15 23:28, Zhang, Helin wrote: > > > > > Hi Vlad > > > > > > > > > > I don't think the changes are needed. It says in datasheet > that the RS > > > > > bit should be set on the last descriptor of every packet, ONLY > WHEN > > > > TXDCTL.WTHRESH equals to ZERO. > > > > > > > > Of course it's needed! See below. > > > > Exactly the same spec a few lines above the place u've just > quoted states: > > > > > > > > "Software should not set the RS bit when TXDCTL.WTHRESH is > greater than > > > > zero." > > > > > > > > And since all three (3) ixgbe xmit callbacks are utilizing RS > bit notation ixgbe PMD > > > > is actually not supporting any value of WTHRESH different from zero. > > > I think Vlad is right. We need to fix this issue. Any suggestion? > If not, I'd like to ack this patch. > > > > Pls., note that there is a v2 of this patch on the list. I forgot to > patch ixgbevf_dev_info_get() in v1. > > > > > > > > > > > > > > > > > > > Regards, > > > > > Helin > > > > > > > > > >> -----Original Message----- > > > > >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com > ] > > > > >> Sent: Thursday, August 13, 2015 11:07 AM > > > > >> To: dev@dpdk.org > > > > >> Cc: Zhang, Helin; Ananyev, Konstantin; > avi@cloudius-systems.com ; Vlad > > > > >> Zolotarov > > > > >> Subject: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh > above 1 for > > > > all > > > > >> NICs but 82598 > > > > >> > > > > >> According to 82599 and x540 HW specifications RS bit *must* > be set in the > > > > last > > > > >> descriptor of *every* packet. > > > > > There is a condition that if TXDCTL.WTHRESH equal to zero. > > > > > > > > Right and ixgbe PMD requires this condition to be fulfilled in > order to > > > > function. See above. > > > > > > > > > > > > > >> This patch fixes the Tx hang we were constantly hitting with > a seastar-based > > > > >> application on x540 NIC. > > > > > Could you help to share with us how to reproduce the tx hang > issue, with using > > > > > typical DPDK examples? > > > > > > > > Sorry. I'm not very familiar with the typical DPDK examples to > help u > > > > here. However this is quite irrelevant since without this this patch > > > > ixgbe PMD obviously abuses the HW spec as has been explained above. > > > > > > > > We saw the issue when u stressed the xmit path with a lot of highly > > > > fragmented TCP frames (packets with up to 33 fragments with > non-headers > > > > fragments as small as 4 bytes) with all offload features enabled. > > > > > > > > Thanks, > > > > vlad > > > > > > > > > >> Signed-off-by: Vlad Zolotarov > > > > > >> --- > > > > >> drivers/net/ixgbe/ixgbe_ethdev.c | 9 +++++++++ > > > > >> drivers/net/ixgbe/ixgbe_rxtx.c | 23 ++++++++++++++++++++++- > > > > >> 2 files changed, 31 insertions(+), 1 deletion(-) > > > > >> > > > > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> b/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> index b8ee1e9..6714fd9 100644 > > > > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev > *dev, > > > > struct > > > > >> rte_eth_dev_info *dev_info) > > > > >> .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS | > > > > >> ETH_TXQ_FLAGS_NOOFFLOADS, > > > > >> }; > > > > >> + > > > > >> + /* > > > > >> + * According to 82599 and x540 specifications RS bit > *must* be set on > > > > the > > > > >> + * last descriptor of *every* packet. Therefore we will > not allow the > > > > >> + * tx_rs_thresh above 1 for all NICs newer than 82598. > > > > >> + */ > > > > >> + if (hw->mac.type > ixgbe_mac_82598EB) > > > > >> + dev_info->default_txconf.tx_rs_thresh = 1; > > > > >> + > > > > >> dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * > sizeof(uint32_t); > > > > >> dev_info->reta_size = ETH_RSS_RETA_SIZE_128; > > > > >> dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff -- > > > > git > > > > >> a/drivers/net/ixgbe/ixgbe_rxtx.c > b/drivers/net/ixgbe/ixgbe_rxtx.c index > > > > >> 91023b9..8dbdffc 100644 > > > > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > > > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > > > >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct > rte_eth_dev > > > > >> *dev, > > > > >> struct ixgbe_tx_queue *txq; > > > > >> struct ixgbe_hw *hw; > > > > >> uint16_t tx_rs_thresh, tx_free_thresh; > > > > >> + bool rs_deferring_allowed; > > > > >> > > > > >> PMD_INIT_FUNC_TRACE(); > > > > >> hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > > > >> > > > > >> /* > > > > >> + * According to 82599 and x540 specifications RS bit > *must* be set on > > > > the > > > > >> + * last descriptor of *every* packet. Therefore we will > not allow the > > > > >> + * tx_rs_thresh above 1 for all NICs newer than 82598. > > > > >> + */ > > > > >> + rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB); > > > > >> + > > > > >> + /* > > > > >> * Validate number of transmit descriptors. > > > > >> * It must not exceed hardware maximum, and must be multiple > > > > >> * of IXGBE_ALIGN. > > > > >> @@ -2110,6 +2118,8 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev > > > > *dev, > > > > >> * to transmit a packet is greater than the number of free TX > > > > >> * descriptors. > > > > >> * The following constraints must be satisfied: > > > > >> + * tx_rs_thresh must be less than 2 for NICs for which RS > deferring is > > > > >> + * forbidden (all but 82598). > > > > >> * tx_rs_thresh must be greater than 0. > > > > >> * tx_rs_thresh must be less than the size of the ring > minus 2. > > > > >> * tx_rs_thresh must be less than or equal to tx_free_thresh. > > > > >> @@ -2121,9 +2131,20 @@ ixgbe_dev_tx_queue_setup(struct > rte_eth_dev > > > > *dev, > > > > >> * When set to zero use default values. > > > > >> */ > > > > >> tx_rs_thresh = (uint16_t)((tx_conf->tx_rs_thresh) ? > > > > >> - tx_conf->tx_rs_thresh : DEFAULT_TX_RS_THRESH); > > > > >> + tx_conf->tx_rs_thresh : > > > > >> + (rs_deferring_allowed ? DEFAULT_TX_RS_THRESH : 1)); > > > > >> tx_free_thresh = (uint16_t)((tx_conf->tx_free_thresh) ? > > > > >> tx_conf->tx_free_thresh : DEFAULT_TX_FREE_THRESH); > > > > >> + > > > > >> + if (!rs_deferring_allowed && tx_rs_thresh > 1) { > > > > >> + PMD_INIT_LOG(ERR, "tx_rs_thresh must be less than > 2 since RS > > > > " > > > > >> + "must be set for every packet > for this HW. " > > > > >> + "(tx_rs_thresh=%u port=%d queue=%d)", > > > > >> + (unsigned int)tx_rs_thresh, > > > > >> + (int)dev->data->port_id, (int)queue_idx); > > > > >> + return -(EINVAL); > > > > >> + } > > > > >> + > > > > >> if (tx_rs_thresh >= (nb_desc - 2)) { > > > > >> PMD_INIT_LOG(ERR, "tx_rs_thresh must be less than the > > > > number " > > > > >> "of TX descriptors minus 2. > (tx_rs_thresh=%u " > > > > >> -- > > > > >> 2.1.0 > > > >