From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9B074A0C42; Wed, 28 Apr 2021 14:51:04 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3FC2040697; Wed, 28 Apr 2021 14:51:04 +0200 (CEST) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by mails.dpdk.org (Postfix) with ESMTP id CB36D40147 for ; Wed, 28 Apr 2021 14:51:01 +0200 (CEST) IronPort-SDR: pHT/ocrAfqd4Ndbvasa3Pj1f0l19RwsFzi80pbQCyXFbxHzzfBgsfOvza5ojIz+PfHDgEq8Dzu 7s+c1MF5tbYQ== X-IronPort-AV: E=McAfee;i="6200,9189,9967"; a="194618798" X-IronPort-AV: E=Sophos;i="5.82,258,1613462400"; d="scan'208";a="194618798" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Apr 2021 05:51:00 -0700 IronPort-SDR: MOm8U9qtv9OQuKU7t8nGEL7a1lF3UfqeLkABHElLmA2KmlTrnxIpm/gL4XhLiDYodSpOi3hVik SzyjX7NbSnjg== X-IronPort-AV: E=Sophos;i="5.82,258,1613462400"; d="scan'208";a="386505923" Received: from dhunt5-mobl5.ger.corp.intel.com (HELO [10.252.12.238]) ([10.252.12.238]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Apr 2021 05:50:59 -0700 To: Lukasz Wojciechowski , Stanislaw Kardach , dev@dpdk.org Cc: upstream@semihalf.com, David Marchand References: <20210426163310.1043438-1-kda@semihalf.com> <28f0ef16-4473-2151-928f-e66dd7b7c4aa@partner.samsung.com> From: David Hunt Message-ID: Date: Wed, 28 Apr 2021 13:50:56 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0 MIME-Version: 1.0 In-Reply-To: <28f0ef16-4473-2151-928f-e66dd7b7c4aa@partner.samsung.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB Subject: Re: [dpdk-dev] [RFC PATCH] test/distributor: fix burst flush on worker quit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 28/4/2021 8:46 AM, Lukasz Wojciechowski wrote: > Hi Stanislaw, > > W dniu 26.04.2021 o 18:33, Stanislaw Kardach pisze: >> While working on RISC-V port I have encountered a situation where worker >> threads get stuck in the rte_distributor_return_pkt() function in the >> burst test. >> After investigation some of the threads enter this function with >> flag RTE_DISTRIB_GET_BUF set in the d->retptr64[0]. At the same time >> main thread has already passed rte_distributor_process() so nobody will >> clear this flag and hence workers can't return. >> >> What I've noticed is that adding a flush just after the last _process(), >> similarly to how quit_workers() function is written in the >> test_distributor.c fixes the issue. >> Additionally the issue disappears when I remove the rdtsc delay code >> inside the rte_distributor_request_pkt(). >> However I can't get this to reproduce on x86 (even with SIMD forced >> off) and with artificial delays, which is why I wonder whether I'm not >> actually hiding some other issue. > I was able to reproduce the issue on x86 arch using VM with 32 emulated > CPU cores. > I guess it would be enough to just have more than 8 worker threads, so > not all of them would be awaken. >> Looking at the implementation of the distributor, it is based on >> __atomic_* builtins and the only platform related bit in the fast-path >> is the rte_rdtsc() and rte_pause(). There may be some issues in the >> toolchain (I've tried so far with the Ubuntu one - 10.2.0-8ubuntu1). >> I should add that all unit tests for distributor are passing so either >> there's some coverage corner case or the implementation works on RISC-V. >> As for RDTSC I'm using a sleep-stable time counter with 1MHz frequency >> and switching to high resolution cycle counter also removes the issue >> but that's the same as removing the rdtsc delay as mentioned above. >> >> I'd love to hear from You if this fix makes any sense. > Yes your patch fixes the issue and is perfectly fine. >> While modifying this test, I've also pulled in a fix from >> test_distributor.c which ensures that each thread gets his own wakeup >> packet as it's possible that when sending a burst of packets, they won't >> be spread over all the workers. >> >> Signed-off-by: Stanislaw Kardach >> Fixes: 7c3287a10535 ("test/distributor: add performance test for burst mode") >> Cc: david.hunt@intel.com >> Cc: l.wojciechow@partner.samsung.com >> Cc: David Marchand >> --- >> app/test/test_distributor_perf.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/app/test/test_distributor_perf.c b/app/test/test_distributor_perf.c >> index b25f79a34..fdbeae6d2 100644 >> --- a/app/test/test_distributor_perf.c >> +++ b/app/test/test_distributor_perf.c >> @@ -188,13 +188,15 @@ quit_workers(struct rte_distributor *d, struct rte_mempool *p) >> rte_mempool_get_bulk(p, (void *)bufs, num_workers); >> >> quit = 1; >> - for (i = 0; i < num_workers; i++) >> + for (i = 0; i < num_workers; i++) { >> bufs[i]->hash.usr = i << 1; >> - rte_distributor_process(d, bufs, num_workers); >> + rte_distributor_process(d, &bufs[i], 1); >> + } >> >> rte_mempool_put_bulk(p, (void *)bufs, num_workers); >> >> rte_distributor_process(d, NULL, 0); >> + rte_distributor_flush(d); >> rte_eal_mp_wait_lcore(); >> quit = 0; >> worker_idx = 0; >     Tested-by: Lukasz Wojciechowski >     Reviewed-by: Lukasz Wojciechowski Thanks, Stanislaw, Lukasz. Acked-by: David Hunt