From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id C5F3BA09E4
	for ; Fri, 29 Jan 2021 06:59:18 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id B55EE2400F7;
	Fri, 29 Jan 2021 06:59:18 +0100 (CET)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by mails.dpdk.org (Postfix) with ESMTP id C6BF92400F7;
	Fri, 29 Jan 2021 06:59:17 +0100 (CET)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E4D1E13A1;
	Thu, 28 Jan 2021 21:59:16 -0800 (PST)
Received: from net-x86-dell-8268.shanghai.arm.com
	(net-x86-dell-8268.shanghai.arm.com [10.169.210.127])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B153E3F66E;
	Thu, 28 Jan 2021 21:59:13 -0800 (PST)
From: Feifei Wang
To: Honnappa Nagarahalli , Konstantin Ananyev , Olivier Matz , Gavin Hu
Cc: dev@dpdk.org, nd@arm.com, Feifei Wang , stable@dpdk.org,
	Honnappa Nagarahalli , Ruifeng Wang
Date: Fri, 29 Jan 2021 13:59:03 +0800
Message-Id: <20210129055905.1768645-2-feifei.wang2@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210129055905.1768645-1-feifei.wang2@arm.com>
References: <20201221111359.22013-1-feifei.wang2@arm.com>
	<20210129055905.1768645-1-feifei.wang2@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-stable] [PATCH v3 1/3] test/ring: reduce iteration numbers to
	make test duration shorter
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: patches for DPDK stable branches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: stable-bounces@dpdk.org
Sender: "stable"

When testing ring performance in the case that multiple lcores are mapped
to the same physical core, e.g. --lcores '(0-3)@10', it takes a very long
time to wait for the "enqueue_dequeue_bulk_helper" to finish.
This is because the iteration count is too high and enqueue and dequeue
are extremely inefficient with this kind of core mapping.

The following test results show this behaviour:

x86-Intel(R) Xeon(R) Gold 6240:
$ sudo ./app/test/dpdk-test --lcores '(0-1)@25'
Testing using two hyperthreads(bulk (size: 8):)
iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
run time:               7s     7s     7s     8s     9s    16s    47s   170s   660s  >0.5h    >1h
legacy APIs: SP/SC:     37     11      6  40525  40525  40209  40367  40407  40541 NoData NoData
legacy APIs: MP/MC:     56     14     11  50657  40526  40526  40526  40625  40585 NoData NoData

aarch64-n1sdp:
$ sudo ./app/test/dpdk-test --lcore '(0-1)@1'
Testing using two hyperthreads(bulk (size: 8):)
iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
run time:               8s     8s     8s     9s     9s    14s    34s   111s   418s  25min    >1h
legacy APIs: SP/SC:    0.4    0.2    0.1    488    488    488    488    488    489    489 NoData
legacy APIs: MP/MC:    0.4    0.3    0.2    488    488    488    488    490    489    489 NoData

As the number of iterations increases, so does the time required to run
the test. With the current value (iter_shift = 23), the test takes more
than 1 hour to finish.

To fix this, "iter_shift" should be decreased while still keeping enough
iterations for the test data to remain stable.
To find such a value, we also tested with the "-l" EAL argument:

x86-Intel(R) Xeon(R) Gold 6240:
$ sudo ./app/test/dpdk-test -l 25-26
Testing using two NUMA nodes(bulk (size: 8):)
iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
run time:               6s     6s     6s     6s     6s     6s     6s     7s     8s    11s    27s
legacy APIs: SP/SC:     47     20     13     22     54     83     91     73     81     75     95
legacy APIs: MP/MC:     44     18     18    240    245    270    250    249    252    250    253

aarch64-n1sdp:
$ sudo ./app/test/dpdk-test -l 1-2
Testing using two physical cores(bulk (size: 8):)
iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
run time:               8s     8s     8s     8s     8s     8s     8s     9s     9s    11s    23s
legacy APIs: SP/SC:    0.7    0.4    1.2    1.8    2.0    2.0    2.0    2.0    2.0    2.0    2.0
legacy APIs: MP/MC:    0.3    0.4    1.3    1.9    2.9    2.9    2.9    2.9    2.9    2.9    2.9

According to the above test data, when "iter_shift" is set to 15, the test
run time is reduced to less than 1 minute and the test results remain
stable on both x86 and aarch64 servers.

Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
Cc: honnappa.nagarahalli@arm.com
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang
Reviewed-by: Honnappa Nagarahalli
Reviewed-by: Ruifeng Wang
Acked-by: Konstantin Ananyev
---
 app/test/test_ring_perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index e63e25a86..fd82e2041 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
 	struct thread_params *p)
 {
 	int ret;
-	const unsigned int iter_shift = 23;
+	const unsigned int iter_shift = 15;
 	const unsigned int iterations = 1 << iter_shift;
 	struct rte_ring *r = p->r;
 	unsigned int bsize = p->size;
-- 
2.25.1
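
For reference, the effect of the change on the loop count can be worked
out from the "iterations = 1 << iter_shift" line in the diff. The
following is only a standalone sketch of that arithmetic, not code from
test_ring_perf.c; the helper names iterations_for_shift() and
reduction_factor() are hypothetical and exist only for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: mirrors "iterations = 1 << iter_shift" from the
 * patched function. The shift must be smaller than the width of
 * unsigned int for the result to be well defined.
 */
static unsigned int iterations_for_shift(unsigned int iter_shift)
{
	assert(iter_shift < sizeof(unsigned int) * 8);
	return 1U << iter_shift;
}

/*
 * Hypothetical helper: how many times fewer iterations are run when
 * the shift is lowered from old_shift to new_shift.
 */
static unsigned int reduction_factor(unsigned int old_shift,
				     unsigned int new_shift)
{
	assert(old_shift >= new_shift);
	return iterations_for_shift(old_shift) /
	       iterations_for_shift(new_shift);
}
```

With the values from this patch, iterations_for_shift(23) is 8388608,
iterations_for_shift(15) is 32768, and reduction_factor(23, 15) is 256.
The measured run times above do not shrink by the same factor, since
each run also includes time that does not depend on the iteration count.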