From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 332F9A04BC for ; Fri, 9 Oct 2020 14:13:59 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1D3A31D5C4; Fri, 9 Oct 2020 14:13:58 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id C88181D59B; Fri, 9 Oct 2020 14:13:52 +0200 (CEST) IronPort-SDR: ssmMPqtEuHeFv9pi78pooFLXCuVhifP7C/YAs6WVtPZXlgO4NSgfr3u6Vm5Sq8Gq7aGtN6DF+j 6SIfhI+7D3Lg== X-IronPort-AV: E=McAfee;i="6000,8403,9768"; a="145341262" X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="145341262" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 05:13:51 -0700 IronPort-SDR: jlT4ahogcI1RrX9U+FQsLXNYGyKDKZ6dCgMQC7VB0kDo8IDcMgYdl15cRZHKaG0RfUvZG/bh3+ z2Bb7JAWDq+g== X-IronPort-AV: E=Sophos;i="5.77,355,1596524400"; d="scan'208";a="528907766" Received: from dhunt5-mobl5.ger.corp.intel.com (HELO [10.249.34.11]) ([10.249.34.11]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Oct 2020 05:13:49 -0700 To: Lukasz Wojciechowski , Bruce Richardson Cc: dev@dpdk.org, stable@dpdk.org References: <20200925224209.12173-1-l.wojciechow@partner.samsung.com> <20201008052323.11547-1-l.wojciechow@partner.samsung.com> <20201008052323.11547-5-l.wojciechow@partner.samsung.com> <6acff92e-213d-4565-d2ea-4cb71ba97172@intel.com> <43b64806-aa26-68a2-5757-d5a6ec7a8174@partner.samsung.com> From: David Hunt Message-ID: <3fd65112-5230-68d9-ecf3-e6c91df4fcf9@intel.com> Date: Fri, 9 Oct 2020 13:13:47 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1 MIME-Version: 1.0 In-Reply-To: <43b64806-aa26-68a2-5757-d5a6ec7a8174@partner.samsung.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB Subject: Re: [dpdk-stable] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org Sender: "stable" On 8/10/2020 10:07 PM, Lukasz Wojciechowski wrote: > W dniu 08.10.2020 o 16:26, David Hunt pisze: >> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote: >>> The burst version of distributor implementation was missing proper >>> handling of worker shutdown. A worker processing packets received >>> from distributor can call rte_distributor_return_pkt() function >>> informing distributor that it want no more packets. Further calls to >>> rte_distributor_request_pkt() or rte_distributor_get_pkt() however >>> should inform distributor that new packets are requested again. >>> >>> Lack of the proper implementation has caused that even after worker >>> informed about returning last packets, new packets were still sent >>> from distributor causing deadlocks as no one could get them on worker >>> side. >>> >>> This patch adds handling shutdown of the worker in following way: >>> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag >>> was formerly unused in burst implementation and now it is used >>> for marking valid packets in retptr64 replacing invalid use >>> of RTE_DISTRIB_RETURN_BUF flag. >>> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake >>> in retptr64 to indicate that worker has shutdown. >>> 3) Worker that shuts down blocks also bufptr for itself with >>> RTE_DISTRIB_RETURN_BUF flag allowing distributor to retrieve any >>> in flight packets. >>> 4) When distributor receives information about shutdown of a worker, >>> it: marks worker as not active; retrieves any in flight and backlog >>> packets and process them to different workers; unlocks bufptr64 >>> by clearing RTE_DISTRIB_RETURN_BUF flag and allowing use in >>> the future if worker requests any new packages. >>> 5) Do not allow to: send or add to backlog any packets for not >>> active workers. Such workers are also ignored if matched. >>> 6) Adjust calls to handle_returns() and tags matching procedure >>> to react for possible activation deactivation of workers. >>> >>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library") >>> Cc: david.hunt@intel.com >>> Cc: stable@dpdk.org >>> >>> Signed-off-by: Lukasz Wojciechowski >>> --- >> >> Hi Lukasz, > Hi David, > Many thanks for your review. >>    I spent the most amount of time going through this particular >> patch, and it looks good to me (even the bit where >> rte_distributor_process is called recursively) :) > That's the same trick that was used in the legacy single version. :) >> I'll try and get some time to run through some more testing, but for now: >> >> Acked-by: David Hunt > Thanks and if you'll run the test, please take a look at the > performance. I think it has dropped because of these additional > synchronizations and actions on activation/deactivation. > > However the quality has increased much. With v5 version , I ran tests > over 100000 times and didn't get a single failure! > > Let me know about your results. > Going back through the patch set and running performance on each one, I see a 10% drop in performance at patch 2 in the series, which adds an extra handle_returns() call in the busy loop. Which avoids possible deadlock. I played around with that patch for a while, only calling handle_returns() every x times aroudn the loop, but the performance was worse again, probably because of the extra branch I added. However, it's more important to have stable performance than so it's still a good idea to have that fix applied, IMO. Maybe we can get back some lost performance in future optimisation patches. Thanks, Dave.