From: "Roger B. Melton"
To: Bruce Richardson
Cc: jingjing.wu@intel.com, dev@dpdk.org
Date: Fri, 6 Oct 2017 08:38:01 -0400
Subject: Re: [dpdk-dev] [PATCH] net/i40e: Improve i40evf buffer cleanup in tx vector mode
References: <20171005191111.27557-1-rmelton@cisco.com> <20171006085403.GA24124@bricha3-MOBL3.ger.corp.intel.com>
In-Reply-To: <20171006085403.GA24124@bricha3-MOBL3.ger.corp.intel.com>
List-Id: DPDK patches and discussions

On 10/6/17 4:54 AM, Bruce Richardson wrote:
> On Thu, Oct 05, 2017 at 03:11:11PM -0400, Roger B Melton wrote:
>> ---
>>
>> i40evf tx vector logic frees mbufs, but it does not remove the
>> mbufs from software rings which leads to double frees. This change
>> corrects that oversight. We've validated this fix within our application.
>>
> Hi Roger,
>
> I'm a little concerned here by this driver fix, since if we are getting
> double frees of mbufs there must be another bug somewhere in tracking
> the free ring elements. Clearing the entries to NULL introduces extra
> writes to the vector code path which will likely have a performance
> impact, but also should be unnecessary, given proper tracking of the
> ring status.
>
> Can you provide us with some details as to how to reproduce the issue,
> especially with a public sample app? I'd really like to look more into
> why this is happening, and if other fixes may work.
>
> Thanks,
> /Bruce
>

Hey Bruce,

I've not attempted to reproduce the issue using sample apps.  It was
initially difficult for us to reproduce with our application until we
stumbled on a recipe.
The symptoms of the double free are twofold:

  * Application crashes: corrupted packet headers, walking a chain of
    rx segments and losing one in the middle, ...
  * i40e adminq lockup: the VF sends an OP and the PF does not
    process it.

The former has been directly correlated to tx vector mode.  We still
don't understand how this could lead to adminq lockup, but we have not
observed the issue since applying the patch I submitted.  We have
questions out to the Intel i40e team on this.

Here's a high-level view of the scenario under which the issues are
observed and how we concluded that there were issues with tx vector
free.  We provided this information to the Intel i40e DPDK team; they
reviewed the tx vector logic and suggested changes.  With the changes
suggested by Intel (the patch I submitted) we have re-enabled tx vector
when MTU is < MBUF size and observed no crashes.

Below that you will find additional detail on the procedure within our
application for changing MTU, including DPDK API calls.

Let me know if you have additional questions.

Regards,
Roger

  o Scenario:
      + Intel i40e-da4 (4x10G) or Cisco branded equivalent
      + KVM or ESXi 6.5 host
      + In a single VM we have at least 1 VF from all 4 PFs.
      + We are running traffic on all interfaces.
      + We stop traffic
      + We stop a device (VF)
      + We start the device
      + We start traffic
      + At any point after we start the device we can crash
  o Experiment (jumping over some of the work to get to the point
    where we believed that the i40e driver was doing double frees):
      + Our application attaches userdata to mbufs and we put that
        userdata on a linked list.
      + We observed that when we processed the userdata it had been
        corrupted, which led to crashes.
      + This gave us a hint that the mbuf was on multiple lists.
      + We reviewed our application code and could not find a cause.
      + We began to suspect a double free in the i40evf PMD.
          # We disabled rx free logic and observed crashes
            (intentionally leaking mbufs in search of the double free).
          # We disabled tx free logic and observed no crashes.
          # This gave us a hint that the double frees were coming from
            the i40evf PMD tx logic.
      + We had also observed that if we always forced a large MTU
        there were no crashes.
          # A side effect of forcing large MTU is that multi-segment
            is enabled.
          # This gave us a hint that enabling multi-segment was somehow
            avoiding the double free.
      + We forced multi-segments regardless of MTU and permitted MTU
        changes and observed no crashes.
      + We reviewed the i40evf mbuf free logic to see the effect of
        enabling multi-segment and observed that when multi-segment is
        enabled, rx vector was enabled but tx vector was not.
      + This led us to examine RX vector mode free logic vs TX vector
        mode free logic.
          # RX free logic has special handling for vector mode free.
          # TX free logic does not have any special handling for
            vector free.
  o By enabling multiple segments always (even if MTU does not require
    multiple segments), we effectively disabled tx vector mode and
    this avoided the double free.
  o Our application no longer crashed, but it could not take advantage
    of tx vector optimizations.

CP == Control Plane
DP == Data Plane

  * CP sends admin down to DP
  * DP disables RX/TX
      o Block all future tx burst calls
      o rte_eth_dev_set_link_down() invoked
      o Block all future rx burst calls
  * DP notifies CP admin down action is complete
  * CP sends MTU change
  * DP processes MTU change
      o For each rxq:
          + rte_eth_rx_queue_info_get()
          + if not rxq_deferred_start rte_eth_dev_rx_queue_stop()
      o For each txq:
          + rte_eth_tx_queue_info_get()
          + if not txq_deferred_start rte_eth_dev_tx_queue_stop()
      o rte_eth_dev_stop()
      o Re-configure the port (note this is the original code, not the
        new code which forces multisegs always):
          + Set rx_mode.jumbo_frame if MTU > 1518
          + Set rx_mode.enable_scatter if MTU > 2048
          + txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS
          + if MTU > 2048, txq_flags |= ETH_TXQ_FLAGS_NOMULTSEGS
          + rte_eth_promiscuous_get()
          + rte_eth_dev_info_get()
          + rte_eth_dev_configure()
              # Init tx_vec_allowed and rx_vec_allowed to TRUE.
          + rte_eth_dev_info_get()
          + For each txq: rte_eth_tx_queue_setup()
              # If new MTU > 2048, ETH_TXQ_FLAGS_NOMULTSEGS was set in
                txq_flags & tx_vec_allowed will be cleared.
          + For each rxq: rte_eth_rx_queue_setup()
      o rte_eth_dev_set_mtu()
      o rte_eth_dev_start()
      o rte_eth_dev_info_get()
      o For each rxq:
          + if not rxq_deferred_start rte_eth_dev_rx_queue_start()
      o For each txq:
          + if not txq_deferred_start rte_eth_dev_tx_queue_start()
  * DP notifies CP MTU change applied.
  * CP sends admin up to DP
  * DP enables RX/TX
      o Enable all future tx burst calls
      o rte_eth_dev_set_link_up() invoked
      o Enable all future rx burst calls
  * DP notifies CP admin up action is complete

-- 
Roger B. Melton
Cisco Systems, CPP Software
7100 Kit Creek Rd, RTP, NC 27709-4987
+1.919.476.2332 phone
+1.919.392.1094 fax
rmelton@cisco.com